ADR 0002: Order-independent tests with opt-in parallelism¶
Status¶
Accepted. 2026-06-28.
Context¶
The test suite is subprocess-bound: it spawns real git, hg, and svn
processes and writes working copies to disk. Wall-time lives in process
spawning and filesystem I/O, not Python, so the single largest lever for a
faster suite is running independent tests across CPU cores.
Parallel execution (and, equivalently, shuffled execution) is only safe when the suite is order-independent: every test must pass regardless of what ran before it. The suite was not. Two pieces of shared mutable state passed in the fixed collection order but failed once tests were reordered:
libvcs.url.registry.VCSRegistrystored itsparser_mapin a class variable, so constructing any second registry mutated the module-levelregistrysingleton — whichever test built a custom registry first changed URL detection for every later test.The
git_repo/hg_repopytest fixtures handed the first consumer a live handle to a session-cached master checkout. The first test to use the fixture could mutate that cache (add a remote, switch a branch), leaking state into every later test that copied it.
These were invisible under the default order and surfaced only when a
shuffled run (pytest-randomly) or a parallel run (pytest-xdist) changed
which tests ran together and in what sequence. They are also genuine bugs for
downstream consumers (e.g. vcspull) that build on VCSRegistry and the
fixtures, independent of how the tests run.
Decision¶
Two coupled commitments — the invariant is the prerequisite for the mechanism.
Tests are order-independent¶
No shared mutable state may leak across tests. Coupling is fixed at the source, never hidden behind a fixed order or a co-locating scheduler:
VCSRegistry.parser_mapis a per-instance attribute; each registry is independent and the globalregistryis never mutated by constructing another.The
git_repo/hg_repo/svn_repofixtures treat the master checkout as a pristine, read-only cache and hand every consumer — including the first — its own copy. A test may mutate its checkout freely without affecting any other.
The expectation is documented in Workflow so contributors keep new tests
order-independent, with a shuffled-run check (uv run --with pytest-randomly py.test -p randomly).
Parallelism is opt-in, not the default¶
pytest-xdist is a development dependency exposed via just test-parallel
(uv run py.test -n auto). The default uv run py.test stays serial.
The worker count is not hardcoded.
-n autoadapts to the machine; the operator caps it when needed. A subprocess-bound suite oversubscribes on high-core machines, so a fixed count committed toaddoptswould be wrong somewhere.The default scheduler (
load) is used. A co-locating scheduler (loadfile) was trialled as a workaround for the order-coupling, but once the coupling is fixed it is unnecessary, so it is not committed.
Alternatives considered¶
Approach |
order-safe |
default unchanged |
portable |
chosen |
|---|---|---|---|---|
Fix coupling at source + opt-in |
yes |
yes |
yes |
yes |
Keep |
masks it |
yes |
yes |
no |
|
needs fix |
no |
no |
no |
Hardcode a worker count (e.g. |
needs fix |
no |
no |
no |
|
detects only |
yes |
yes |
deferred |
The decisive point: a co-locating scheduler only hides order-coupling, and it
was measured to be no faster than the default scheduler once the coupling was
fixed — so the coupling is fixed and the workaround dropped. pytest-randomly
detects order-coupling but does not parallelize; it is used transiently for
checks (uv run --with pytest-randomly) rather than committed.
Consequences¶
Positive¶
The suite passes under serial, shuffled, and parallel (
-n auto) execution.Two real isolation bugs in shipped code (
VCSRegistry, the repo fixtures) are fixed, benefiting downstream consumers regardless of parallelism.A faster opt-in run is available without changing the stable default.
Tradeoffs¶
Parallel wall-time varies with machine load, and
-n autocan oversubscribe a subprocess-bound suite on high-core machines — the operator picks the worker count.Contributors must keep new tests order-independent (reset global state in teardown; isolate per-test resources).
Risks¶
A future order-coupling could reintroduce flakiness that appears only under parallel or shuffled runs. Mitigation: the order-independence expectation is documented with a shuffled-run check, so the regression is reproducible.
Prior art¶
pytest-xdistprovides theload,loadscope, andloadfileschedulers; its guidance is that tests must be independent forloadto be safe.pytest-randomlyrandomizes order specifically to surface inter-test coupling.The broader convention across test suites: parallel execution requires order-independent tests, and shared mutable state (global registries, cached fixtures handed out by reference) is the usual cause of order-dependent failures.