
Fix pathologically slow assertion diffs for large inputs (#8998) #14543

Open
kirilklein wants to merge 2 commits into pytest-dev:main from kirilklein:fix-8998-large-diff-perf


@kirilklein

Closes #8998.

Problem

Comparing very large strings, lists, or dataclasses inside an assert can hang for a long time (sometimes minutes) while pytest builds the failure diff.

Profiling the reproductions from the issue confirms the root cause is difflib.ndiff:

  • its character-level "fancy replace" step is quadratic in the size of the differing region (so two large, mostly-different strings are catastrophic), and
  • the underlying SequenceMatcher is quadratic in the number of lines — a large nested structure pretty-prints to a huge number of lines (the dataclass example in the issue pformats to ~418,000 lines).
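A rough way to observe the growth described above is to time difflib.ndiff on inputs of increasing size. This is only a sketch: the helper names (random_lines, time_ndiff), the line width, and the sizes are illustrative assumptions, not the reproductions from the issue, and absolute timings will vary by machine and Python version.

```python
import difflib
import random
import string
import time


def random_lines(n_lines, width=40, seed=0):
    """Build n_lines pseudo-random lowercase lines (hypothetical test data)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choices(string.ascii_lowercase, k=width))
        for _ in range(n_lines)
    ]


def time_ndiff(n_lines):
    """Return the seconds taken to fully consume ndiff on two unrelated inputs."""
    left = random_lines(n_lines, seed=1)
    right = random_lines(n_lines, seed=2)
    start = time.perf_counter()
    # ndiff returns a lazy generator, so iterate to force the actual work.
    for _ in difflib.ndiff(left, right):
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    # Small sizes only; larger inputs are where the slowdown becomes painful.
    for n in (50, 100, 200):
        print(f"{n:>4} lines: {time_ndiff(n):.3f}s")
```

If the cost is superlinear, doubling the number of lines should more than double the measured time.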

Approach

Following the maintainer discussion in the issue, this uses a deterministic size heuristic rather than wall-clock timeouts (which are non-deterministic and can't reliably interrupt difflib).

A new helper module _pytest/assertion/_diff.py provides:

  • ndiff_too_slow(left_lines, right_lines): returns True when the combined input exceeds a character budget or a line-count budget, the two dimensions that make ndiff slow.
  • fast_unified_diff(...) — a coarse but fast line-level difflib.unified_diff, capped to a bounded number of lines so it always completes in milliseconds. It notes in the output that a faster diff is being shown (and how many lines were hidden).

Both pathological call sites fall back to it when needed:

  • compare_text._diff_text (string comparisons)
  • _compare_sequence._compare_eq_iterable (list / dataclass / iterable comparisons)

Comparisons below the cutoffs keep the existing detailed ndiff output unchanged.
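The heuristic and fallback described above could look roughly like the following. This is a minimal sketch, not the code in the PR: only the ndiff_too_slow(left_lines, right_lines) signature comes from the description. The budget values, the capped_unified_diff and explain_diff names, and the wording of the notice are all assumptions made for illustration.

```python
import difflib

# Hypothetical budgets. The PR says its thresholds came from benchmarking
# but this description does not state them.
MAX_TOTAL_CHARS = 100_000
MAX_TOTAL_LINES = 2_000
MAX_OUTPUT_LINES = 50


def ndiff_too_slow(left_lines, right_lines):
    """Deterministic size check: True if either budget is exceeded."""
    if len(left_lines) + len(right_lines) > MAX_TOTAL_LINES:
        return True
    total_chars = sum(len(line) for line in left_lines)
    total_chars += sum(len(line) for line in right_lines)
    return total_chars > MAX_TOTAL_CHARS


def capped_unified_diff(left_lines, right_lines, max_lines=MAX_OUTPUT_LINES):
    """Line-level unified diff, truncated to max_lines plus a notice.

    This sketch consumes the whole diff in order to count hidden lines;
    the PR may bound the work differently.
    """
    diff = list(difflib.unified_diff(left_lines, right_lines, lineterm="", n=1))
    shown = diff[:max_lines]
    hidden = len(diff) - len(shown)
    if hidden:
        # Hypothetical wording for the notice.
        shown.append(f"... faster line-level diff shown, {hidden} lines hidden")
    return shown


def explain_diff(left_lines, right_lines):
    """Use detailed ndiff for small inputs and the capped fallback otherwise."""
    if ndiff_too_slow(left_lines, right_lines):
        return capped_unified_diff(left_lines, right_lines)
    return [line.rstrip("\n") for line in difflib.ndiff(left_lines, right_lines)]
```

Because the decision depends only on input size, the same comparison always produces the same kind of diff, which a wall-clock timeout cannot guarantee.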

Results

On the reproductions from the issue (dataclass with large lists + two large random strings), with -v:

  • before: hangs (one repro profiled at ~384s of find_longest_match)
  • after: ~0.7s, with a useful fallback diff

Tests

Added regression tests in testing/test_assertion.py: unit tests for the ndiff_too_slow heuristic, and integration tests that large string / many-line / large-iterable comparisons fall back to the fast diff (no ndiff ? guide lines), still show which lines differ, and emit the line-cap notice. Thresholds were chosen from benchmarking.
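The "no ndiff ? guide lines" check mentioned above relies on a visible difference between the two difflib outputs. The snippet below is an illustrative sketch of that difference using plain difflib, not one of the tests added in testing/test_assertion.py; has_ndiff_guide_lines is a hypothetical helper.

```python
import difflib


def has_ndiff_guide_lines(diff_lines):
    """Return True if any line is an ndiff '? ' intraline guide line."""
    return any(line.startswith("? ") for line in diff_lines)


left = ["alpha beta gamma", "unchanged"]
right = ["alpha beta gamna", "unchanged"]

# ndiff adds '? ' lines pointing at the characters that changed.
detailed = list(difflib.ndiff(left, right))

# unified_diff works on whole lines only, so it never emits guide lines.
coarse = list(difflib.unified_diff(left, right, lineterm=""))

if __name__ == "__main__":
    print("\n".join(line.rstrip("\n") for line in detailed))
    print("\n".join(coarse))
```

The coarse output still shows which lines differ through its - and + prefixes, which is the property the fallback is meant to preserve.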

🤖 Generated with Claude Code

…8998)

Comparing very large strings, lists, or dataclasses in an ``assert`` could
hang for a long time (sometimes minutes) while pytest built the failure diff.
The cost comes from ``difflib.ndiff``: its character-level "fancy replace"
step is quadratic in the size of the differing region, and the underlying
``SequenceMatcher`` is quadratic in the number of lines (a large nested
structure can pretty-print to hundreds of thousands of lines).

Add a deterministic size heuristic (no wall-clock timeouts, per the
maintainer discussion in the issue): when the input is too large for
``ndiff`` to be fast, fall back to a coarser line-level ``unified_diff``,
capped to a bounded number of lines so it always completes in milliseconds,
and note this in the output. Smaller comparisons keep the existing detailed
``ndiff`` output unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@psf-chronographer psf-chronographer Bot added the bot:chronographer:provided (automation) changelog entry is part of PR label Jun 1, 2026
@Pierre-Sassoulas
Member

We have an in-flight MR (#14523) that uses generators in the assert repr, which could help with this when we don't have to show the actual output.


Labels

bot:chronographer:provided (automation) changelog entry is part of PR


Development

Successfully merging this pull request may close these issues.

assert str1 == str2 takes forever with long strings that differ by a short prefix

2 participants