Tracking the user-reported testing regressions and performance complaints that emerged after the project-structure / [test-by-project] rework, plus the older follow-ups that ride alongside them.
The work is grouped into four problem areas. Each item below points at the existing user-filed issue (or a new engineering issue where there is no good match) so progress is visible in one place.
Problem areas (what users are reporting)
Test discovery is much slower than CLI pytest.
Multiple users on large parametrized suites — 30k tests take 40s in the Test Explorer vs ~2s on the CLI; 328k tests take 66s vs ~10s. Profiled to O(n²) list scans plus an oversized JSON payload.
Test tree is rebuilt from scratch on every change.
Saving any .py file re-discovers the whole workspace and wipes the existing tree. While re-discovery is in flight, users can't re-run or debug a test because the items have been cleared. "Debug: Restart" breaks for the same reason.
Run / debug pipeline regressions.
Tests appear as "skipped" even though they ran; debug runs lose results because the result pipe is cancelled the moment the subprocess exits; pytest-subtests failures get reported as success; the env selected via the Python Environments API is not always honored.
Hard discovery failures still open.
Smaller correctness bugs in the pytest plugin (HIDDEN_PARAM, pipe writer broken by mock.patch("builtins.open")).
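To illustrate the O(n²) list-scan pattern named in the first problem area, here is a minimal sketch of the difference between a membership check against a plain list and one against a keyed index. The function names are hypothetical and this is not the plugin's actual code; it only shows why the cost grows quadratically as parametrized cases pile up under one parent.

```python
from typing import Dict, List


def dedupe_with_list_scan(names: List[str]) -> List[str]:
    """Quadratic: every `not in` check scans the whole list built so far."""
    children: List[str] = []
    for name in names:
        if name not in children:  # O(len(children)) per check
            children.append(name)
    return children


def dedupe_with_index(names: List[str]) -> List[str]:
    """Linear: a dict gives O(1) average membership and keeps insertion order."""
    index: Dict[str, None] = {}
    for name in names:
        if name not in index:  # hash lookup instead of a scan
            index[name] = None
    return list(index)
```

Both functions return the same children in the same order; only the cost of the membership check differs. Whether a dict, a set kept alongside the list, or another structure fits the real tree-building code best is an open design choice.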
Order of execution (impact × effort)
Change the python.testing.autoTestDiscoverOnSavePattern default so saving a non-test file does not trigger a full re-discovery. Default today is **/*.py (verified in package.json); should match test files only. Issue: python.testing.autoTestDiscoverOnSaveEnabled is running test discovery on change to any file (not just test files) #25866
vscode_pytest.process_parameterized_test and build_test_tree use if x not in children against plain lists, which is O(n²) as parametrize cases pile up under one function. The reporter on #25973 has a profile and a candidate PR. Issue: Extremely slow test discovery when using pytest for large test suite #25973
Drain the result pipe before disposing it on cancellation. startRunResultNamedPipe in common/utils.ts calls disposable.dispose() from onCancellationRequested, which closes the reader while data is still buffered. The debug path triggers cancellation as soon as the debug session terminates, so any results not yet drained are lost. Issue: PyTest exits when subprocess finishes execution when ran through Testing UI #25872
Surface an error payload when the env-extension subprocess exits non-zero. The legacy path already does this in pytestExecutionAdapter.ts; the env-extension path resolves the deferred silently. With no results and no error, every test defaults to skipped. Issue: Python skip all Tests #25892
Fix pytest-subtests dedup in pytest_report_teststatus. The collected_tests_so_far set in vscode_pytest/__init__.py keys on nodeid and drops every report after the first. pytest-subtests emits multiple call-phase reports for the same nodeid, so the first one wins and any later failure (or correction) is silently lost. Community workarounds exist on the issue. Issue: failing pytest tests are marked as “success” when some of the subtests succeeded #25824
Make populateTestTree incremental. Today processDiscovery does testItemIndex.clear() and populateTestTree always rebuilds. Diff old vs new test trees and only insert / remove / update changed items. Largest item; biggest user-visible fix; should land last. Issue: After latest update: Testing tree of python tests reloads completely rather than just update #25822
pytest.HIDDEN_PARAM discovery crash. process_parameterized_test does parent_part, parameterized_section = test_node["name"].split("[", 1), which raises ValueError when pytest emits a node id without [...] (i.e. when HIDDEN_PARAM is used). One-line guard. Issue: pytest.HIDDEN_PARAM Causes Discovery Failure #25795
Telemetry step 2 (Python-side). Add a meta block to the vscode_pytest discovery payload with subprocessDurationMs, pluginDurationMs, payloadBytes, parametrizedTestCount, and fold it into the new UNITTEST.DISCOVERY.DONE event so we can split slow discoveries into "subprocess vs plugin vs JS overhead". Issue: new issue
Make the vscode_pytest pipe writer immune to mocked open. When user test code does mock.patch("builtins.open"), the pipe-writer open() call gets intercepted and serialization breaks (surfaces as unsupported operand type(s) for +=: 'int' and 'NoneType'). Capture open at import time before any test code can monkeypatch it. Issue: new issue, or reopen [vscode-pytest]: unsupported operand type(s) for +=: 'int' and 'NoneType' #25793 (closed without the fix landing)
Verify useEnvExtension() actually fires in VS Code 1.106+. The reporter pins a regression to that VS Code release. Telemetry-first: add an envSource field to the discovery/run events, look at the live data, then fix. Issue: pytest discovery does not use specified environment #25718
Closing as already resolved
print has already been removed from python_files/unittestadapter/execution.py.
How we measure progress
Baseline telemetry has already been wired up (TS-side). Each fix above has a corresponding metric so dashboards can verify the change actually moves the needle (and catch any unintended regressions):

| Area | Primary metric | What "fixed" looks like |
| --- | --- | --- |
| Discovery perf | UNITTEST.DISCOVERY.DONE.totalDurationMs p50/p90 sliced by testCount bucket × mode | Large-suite p90 drops by an order of magnitude; mode='project' converges to mode='legacy'. |
| Test tree rebuild | UNITTEST.DISCOVERY.TRIGGER.fileKind='non-test' share; UNITTEST.TREE.UPDATE.rebuiltFromScratch share; msSinceLastTrigger p50 | non-test share drops to ~0%; rebuiltFromScratch=false share grows as incremental updates land. |
| Run / debug pipeline | UNITTEST.RUN.DONE.missingCount > 0 share; pipeClosedEarly share; failureCategory distribution | missingCount>0 and pipeClosedEarly shares drop to near-0 on mode='project' and debugging=true. |
| Discovery hard failures | UNITTEST.DISCOVERY.DONE.failureCategory distribution | Each individual fix shrinks its corresponding bucket. |

Per-area success criteria are checked off as each fix ships and the telemetry confirms the change.
Out of scope (deliberately, for now)
Per-test or per-file names in telemetry: privacy-sensitive, not needed for the questions above.
True added / removed counts in UNITTEST.TREE.UPDATE: needs an O(n) set diff per discovery; revisit only if beforeCount/afterCount + rebuiltFromScratch aren't enough signal.
Migrating off named pipes entirely: out of scope for this workstream.
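For the pytest.HIDDEN_PARAM item, the crash comes from unpacking the result of split("[", 1) into two names when the test name contains no "[". One possible shape for the "one-line guard" is sketched below using str.partition, which always returns three parts. The helper name split_parametrized_name is hypothetical, and how the plugin should treat the missing parameter section downstream is an assumption, not something the issue specifies.

```python
from typing import Optional, Tuple


def split_parametrized_name(name: str) -> Tuple[str, Optional[str]]:
    """Split 'test_x[case]' into ('test_x', 'case]').

    Returns (name, None) when there is no '[' in the name, instead of
    raising ValueError the way a two-target unpack of split('[', 1) does.
    """
    parent_part, bracket, parameterized_section = name.partition("[")
    if not bracket:
        # No parameter id in the name (assumed to be the HIDDEN_PARAM case).
        return name, None
    return parent_part, parameterized_section
```

Callers would then need to decide what to do with a None section, for example treating the node as a plain test rather than a parametrized case.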
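The "capture open at import time" idea from the mocked-open item can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's pipe writer: write_payload and demo_under_mock are hypothetical names, and a temporary file stands in for the named pipe. The point is only that a module-level reference bound before test code runs is unaffected by a later mock.patch("builtins.open").

```python
import builtins
import os
import tempfile
from unittest import mock

# Bind the real builtin once, at import time, before any test code can patch it.
_real_open = builtins.open


def write_payload(path: str, payload: str) -> int:
    """Write payload via the captured open; returns the number of characters written."""
    with _real_open(path, "w", encoding="utf-8") as handle:
        return handle.write(payload)


def demo_under_mock() -> str:
    """Write while builtins.open is patched, then read the file back unpatched."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "payload.json")
        with mock.patch("builtins.open"):
            # A plain open() here would hit the mock; _real_open does not.
            write_payload(path, '{"ok": true}')
        with open(path, encoding="utf-8") as handle:
            return handle.read()
```

This only protects against patches applied after the module is imported; a patch already active at import time would still be captured, so import order matters.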