Perf: cache docblock FQCN lookups #65

Merged: dereuromark merged 1 commit into master from perf-fqcn-docblock-cache on May 13, 2026

@dereuromark
Contributor

Summary

Cache file-level work in FullyQualifiedClassNameInDocBlockSniff and avoid reprocessing the same doc block multiple times in a single phpcs pass.

This does two things:

  • deduplicates doc block processing inside the sniff's current broad registration set (T_CLASS, T_INTERFACE, T_TRAIT, T_FUNCTION, T_VARIABLE, T_COMMENT), so the same doc block is only parsed once per pass;
  • caches the file namespace and parsed use statements for the lifetime of that pass instead of rebuilding them for every class-name lookup.

The cache is pass-scoped, not content-scoped: it resets when token processing restarts from an earlier stack pointer, which is what phpcbf does after re-tokenizing a file for a new fix loop.
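
The pass-scoped behaviour can be illustrated with a minimal sketch. It is not the sniff's actual code: the class `PassScopedDocBlockCache` and its method `shouldProcess()` are hypothetical names, and the only assumption taken from the description above is that a stack pointer lower than the last one seen signals the start of a new pass.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the sniff): remembers which doc blocks
 * were already handled during the current pass over a file.
 */
final class PassScopedDocBlockCache
{
    /** @var array<string, array{lastPtr: int, seen: array<int, true>}> */
    private array $state = [];

    /**
     * Returns true only the first time a doc block is seen in the current pass.
     *
     * @param string $filename File being sniffed.
     * @param int $stackPtr Token pointer the sniff was invoked with.
     * @param int $docBlockPtr Pointer of the related doc block.
     */
    public function shouldProcess(string $filename, int $stackPtr, int $docBlockPtr): bool
    {
        $entry = $this->state[$filename] ?? ['lastPtr' => -1, 'seen' => []];

        if ($stackPtr < $entry['lastPtr']) {
            // The pointer moved backwards: treat this as a new pass and forget everything.
            $entry['seen'] = [];
        }
        $entry['lastPtr'] = $stackPtr;

        $isNew = !isset($entry['seen'][$docBlockPtr]);
        $entry['seen'][$docBlockPtr] = true;
        $this->state[$filename] = $entry;

        return $isNew;
    }
}

$cache = new PassScopedDocBlockCache();
$first = $cache->shouldProcess('Foo.php', 10, 5);       // true: doc block 5 is new
$second = $cache->shouldProcess('Foo.php', 14, 5);      // false: same doc block, later token
$afterRewind = $cache->shouldProcess('Foo.php', 3, 5);  // true: pointer rewound, cache reset
```

Keying the reset on pointer direction rather than on file content means the cache needs no hashing, at the cost of relying on tokens being visited in ascending order within a pass.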

Why

After #63 and #64 landed, PhpCollective.Commenting.FullyQualifiedClassNameInDocBlock became the dominant repo-local hotspot on large docblock-heavy files.

The old code had two expensive patterns:

  • parseUseStatements() walked the whole file every time findUseStatementForClassName() was called;
  • the same related doc block could be processed repeatedly because multiple registered token types on the same declaration path all resolved to the same doc block.

That meant a large method-heavy file with repeated @param / @return / @throws tags paid for the same file-level scans many times.
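
To make the cost difference concrete, here is a small sketch of the "parse once, look up many times" pattern. It is a simplified stand-in, not the sniff's implementation: the class `UseStatementIndex` and its `$scans` counter are hypothetical, and plain source lines with a regular expression replace the phpcs token stream to keep the example self-contained. Only the method names `parseUseStatements()` and `findUseStatementForClassName()` are borrowed from the description above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the sniff's file-level lookups.
 */
final class UseStatementIndex
{
    /** @var array<string, string>|null Alias => FQCN, null until first parsed. */
    private ?array $useStatements = null;

    /** Number of full-file scans performed, for illustration only. */
    public int $scans = 0;

    /** @param list<string> $lines Source lines of one file. */
    public function __construct(private array $lines)
    {
    }

    public function findUseStatementForClassName(string $className): ?string
    {
        // Walk the file once; every later lookup is a plain array access.
        $this->useStatements ??= $this->parseUseStatements();

        return $this->useStatements[$className] ?? null;
    }

    /** @return array<string, string> */
    private function parseUseStatements(): array
    {
        $this->scans++;
        $result = [];
        foreach ($this->lines as $line) {
            if (preg_match('/^use\s+([\w\\\\]+)(?:\s+as\s+(\w+))?\s*;/', $line, $m) !== 1) {
                continue;
            }
            $parts = explode('\\', $m[1]);
            // Without an explicit alias, the last namespace segment is the short name.
            $alias = $m[2] ?? end($parts);
            $result[$alias] = $m[1];
        }

        return $result;
    }
}

$index = new UseStatementIndex([
    'namespace App\Service;',
    'use Cake\ORM\Table;',
    'use Cake\Http\Response as HttpResponse;',
]);
$table = $index->findUseStatementForClassName('Table');
$response = $index->findUseStatementForClassName('HttpResponse');
$missing = $index->findUseStatementForClassName('Unknown');
// $index->scans is 1 here: three lookups, one file walk.
```

Without the memoized property, each of the three lookups would rescan every line, which is the per-tag cost the paragraph above describes.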

Behavior

This also removes duplicate error reporting for the same doc block. The existing fixture still fixes to the same output, but the number of reported violations drops from 11 to 7 now that duplicates are no longer counted, which is what the updated test asserts.

Benchmark

Measured on a synthetic large file with 430 repeated methods and docblocks:

  • before: PhpCollective.Commenting.FullyQualifiedClassNameInDocBlock about 49.99s
  • after: PhpCollective.Commenting.FullyQualifiedClassNameInDocBlock about 1.35s

Verification

  • composer cs-fix
  • composer test
  • composer stan

dereuromark merged commit a0ec23c into master on May 13, 2026
4 checks passed
dereuromark deleted the perf-fqcn-docblock-cache branch on May 13, 2026 21:53
dereuromark added a commit that referenced this pull request May 13, 2026
UseStatementSniff has its own getUseStatements() implementation that
duplicates UseStatementsTrait::getUseStatements() (cached in #64). It
already has an instance-level cache via existingStatements, which
covers repeated calls within a single phpcs pass; the new static
cache adds coverage across phpcbf fix iterations, where
populateTokenListeners() creates fresh sniff instances per pass and
resets the instance cache.

Cache invalidation follows the same fingerprint-based scheme as #64:
token count alone is not strong enough, since an alias rename keeps
it constant. Cached entries record a content fingerprint of each
use statement range and re-verify them against the live tokens
before being trusted. The cache also refuses to serve an empty
result so it cannot return stale state for a file where a fix added
a first use statement while another simultaneous fix happened to
keep the file's overall token count unchanged.

Measured on the same CakePHP 5 app from #62 / #63 / #64 / #65
(parallel=1, --report=performance):

  PhpCollective.Namespaces.UseStatement   3.36s -> 2.08s

Existing test suite (100 tests / 122 assertions) passes unchanged.

The FQCN cache changes from the earlier draft of this PR were
dropped in favour of the cache that landed via #65.
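
The fingerprint-based invalidation described in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from UseStatementSniff: the class `FingerprintedUseStatementCache`, its `store()` / `fetch()` methods, and the use of `md5()` over concatenated token contents are all hypothetical, and tokens are modelled as plain strings. The token-count check mentioned in the message is left out for brevity.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of a static cache whose entries are re-verified
 * against the live tokens before being trusted.
 */
final class FingerprintedUseStatementCache
{
    /** @var array<string, list<array{start: int, end: int, statement: string, hash: string}>> */
    private static array $entries = [];

    /**
     * @param list<string> $tokens Live token contents of the file.
     * @param list<array{start: int, end: int, statement: string}> $parsed Parsed use statements.
     */
    public static function store(string $filename, array $tokens, array $parsed): void
    {
        $entries = [];
        foreach ($parsed as $item) {
            $item['hash'] = self::fingerprint($tokens, $item['start'], $item['end']);
            $entries[] = $item;
        }
        self::$entries[$filename] = $entries;
    }

    /**
     * @param list<string> $tokens Live token contents of the file.
     * @return list<string>|null Cached statements, or null when the caller must re-parse.
     */
    public static function fetch(string $filename, array $tokens): ?array
    {
        $entries = self::$entries[$filename] ?? [];
        if ($entries === []) {
            // Never serve an empty result: a fix may have added the first use statement.
            return null;
        }

        $statements = [];
        foreach ($entries as $entry) {
            if (self::fingerprint($tokens, $entry['start'], $entry['end']) !== $entry['hash']) {
                return null; // Token range changed since it was cached.
            }
            $statements[] = $entry['statement'];
        }

        return $statements;
    }

    /** @param list<string> $tokens */
    private static function fingerprint(array $tokens, int $start, int $end): string
    {
        return md5(implode('', array_slice($tokens, $start, $end - $start + 1)));
    }
}

$tokens = ['use', ' ', 'Foo\Bar', ' ', 'as', ' ', 'Baz', ';'];
FingerprintedUseStatementCache::store('a.php', $tokens, [
    ['start' => 0, 'end' => 7, 'statement' => 'Foo\Bar as Baz'],
]);
$hit = FingerprintedUseStatementCache::fetch('a.php', $tokens);

// An alias rename keeps the token count at 8 but changes the fingerprint.
$renamed = $tokens;
$renamed[6] = 'Qux';
$miss = FingerprintedUseStatementCache::fetch('a.php', $renamed);
```

Here `$hit` holds the cached statement while `$miss` is null, which forces a re-parse even though both token arrays have the same length.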
dereuromark added a commit that referenced this pull request May 13, 2026
…#66)