perf(datasource-toolkit): cache collection name & lookups in serialization#329

Merged
matthv merged 2 commits into main from perf/serializer-collection-lookup on Jul 3, 2026

@matthv matthv commented Jul 3, 2026

Context

Serializing a collection list re-resolves collections per record and per relation by walking the decorator stacks. Profiling a 100-row related list (.../relationships/transactions?page[size]=100) with stackprof showed ~54% of request wall-time in:

  • ForestAdminDatasourceToolkit::Decorators::CollectionDecorator#name (46% self) — @child_collection.name re-chains the whole collection-decorator stack on every call (~89k calls over the burst);
  • ForestAdminDatasourceToolkit::Decorators::DatasourceDecorator#get_collection — re-descends the datasource-decorator stack on every call and recomputes collection.name to key its cache.

The JSON:API serializer (ForestSerializer#id, #relationships, #attributes) calls these for every record and every relation, so the cost is O(records × relations × decorator-depth). SQL for the same request takes ~1 ms.
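
To make the cost model above concrete, here is a minimal sketch that counts delegation hops through an un-memoized decorator stack. `BaseCollection`, `NameDecorator`, the stack depth of 10 and the 3 relations per record are all hypothetical stand-ins chosen for illustration; they are not the toolkit's real classes or measured values.

```ruby
# Hypothetical stand-ins for the decorator stack; not the toolkit's real classes.
class BaseCollection
  attr_reader :calls

  def initialize(name)
    @name = name
    @calls = 0
  end

  # Counts how many lookups reach the bottom of the stack.
  def name
    @calls += 1
    @name
  end
end

class NameDecorator
  class << self
    attr_accessor :hops
  end
  self.hops = 0

  def initialize(child)
    @child = child
  end

  # Un-memoized: every call delegates to the child, which delegates again.
  def name
    self.class.hops += 1
    @child.name
  end
end

depth = 10      # assumed decorator-stack depth
records = 100
relations = 3   # assumed relations per record

base = BaseCollection.new('transactions')
top = depth.times.reduce(base) { |child, _| NameDecorator.new(child) }

# One name lookup per record and per relation, as in the serializer loop.
records.times { relations.times { top.name } }

puts "base lookups: #{base.calls}"
puts "delegation hops: #{NameDecorator.hops}"
```

Under these assumptions the base sees 300 lookups and the stack performs 3000 delegation hops, i.e. records × relations × depth, all to return the same string.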

Change

Memoize results that are immutable after the datasource is built:

  • CollectionDecorator#name caches into @memoized_name. (Not @name: the base Collection#initialize(datasource, name, …) sets @name, and CollectionDecorator#initialize(child_collection, datasource) calls super without parens, so @name already holds the datasource object — reusing it makes name return the datasource. A rubocop:disable Naming/MemoizedInstanceVariableName documents this.)
  • DatasourceDecorator#get_collection caches the decorator by the requested name (@collections_by_name), keeping the existing @decorators-by-resolved-name behaviour so a collection reachable under multiple names still gets a single decorator instance.
  • ForestSerializer#id caches primary keys per type (@@primary_keys) instead of resolving the collection and its keys for every record.
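
The `@memoized_name` choice in the first bullet can be reproduced in isolation. The sketch below uses simplified, hypothetical versions of `Collection` and `CollectionDecorator` (two-argument constructors only) plus a hypothetical `NaiveDecorator` that memoizes into `@name`, to show how a bare `super` forwards the arguments positionally and leaves the datasource in `@name`.

```ruby
# Simplified, hypothetical versions of the classes named in the PR.
class Collection
  attr_reader :datasource

  def initialize(datasource, name)
    @datasource = datasource
    @name = name
  end

  def name
    @name
  end
end

class CollectionDecorator < Collection
  def initialize(child_collection, datasource)
    # Bare `super` forwards (child_collection, datasource) positionally,
    # so the parent stores `datasource` in @name.
    super
    @child_collection = child_collection
    @datasource = datasource
  end

  # Dedicated ivar, because @name is already taken (and truthy).
  def name
    @memoized_name ||= @child_collection.name
  end
end

# Hypothetical variant showing what memoizing into @name would do.
class NaiveDecorator < CollectionDecorator
  def name
    @name ||= @child_collection.name
  end
end

datasource = Object.new
leaf = Collection.new(datasource, 'transactions')

decorated = CollectionDecorator.new(leaf, datasource)
naive = NaiveDecorator.new(leaf, datasource)

puts decorated.instance_variable_get(:@name).equal?(datasource) # true
puts decorated.name                                             # transactions
puts naive.name.equal?(datasource)                              # true
```

Because `@name` is already set to a truthy object, `@name ||= ...` never evaluates its right-hand side, which is why the naive variant returns the datasource instead of the collection name.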

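No behavioural change: names and primary keys are fixed after build. The two decorator caches live on the decorator instances, which are rebuilt on datasource reload; the class-level @@primary_keys cache is cleared explicitly by ForestSerializer.reset_cache!, called from AgentFactory.reload!.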
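
The two-level `get_collection` cache described in the list above could look roughly like the following. This is a sketch under stated assumptions: `ChildDatasource`, its alias map, `Decorator` and `CachingDatasourceDecorator` are hypothetical, and only the ivar names `@collections_by_name` and `@decorators` come from the PR description.

```ruby
# Hypothetical child datasource that resolves aliases and counts lookups.
class ChildDatasource
  Entry = Struct.new(:name)

  attr_reader :lookups

  def initialize(aliases)
    @aliases = aliases # requested name => canonical name (assumed mechanism)
    @entries = {}
    @lookups = 0
  end

  def get_collection(name)
    @lookups += 1
    canonical = @aliases.fetch(name, name)
    @entries[canonical] ||= Entry.new(canonical)
  end
end

# Hypothetical wrapper standing in for a collection decorator.
class Decorator
  attr_reader :child

  def initialize(child)
    @child = child
  end
end

class CachingDatasourceDecorator
  def initialize(child_datasource)
    @child_datasource = child_datasource
    @decorators = {}          # keyed by the resolved collection name
    @collections_by_name = {} # keyed by the requested name
  end

  def get_collection(name)
    # Fast path: a hit on the requested name skips the child lookup entirely.
    @collections_by_name[name] ||= begin
      child = @child_datasource.get_collection(name)
      # Slow path keeps one decorator per resolved name.
      @decorators[child.name] ||= Decorator.new(child)
    end
  end
end

child = ChildDatasource.new('txns' => 'transactions')
ds = CachingDatasourceDecorator.new(child)

a = ds.get_collection('transactions')
b = ds.get_collection('txns')
100.times { ds.get_collection('transactions') }

puts a.equal?(b)   # true: both names share one decorator instance
puts child.lookups # 2: one child lookup per distinct requested name
```

Keying the outer cache by the requested name avoids recomputing the resolved name on every call, while the inner cache preserves the single-instance guarantee.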

Measured effect

Bench app (Postgres, 5000 transactions), median of 7, warm. Same JSON output:

| endpoint | before | after |
| --- | --- | --- |
| related list of 100 | 58 ms | 9.6 ms |
| root list of 100 | 203 ms | 27 ms |
| root list of 15 | 38 ms | 10.5 ms |
| deep-relation list of 100 | 119 ms | 16 ms |
| record show | 11 ms | 5.9 ms |

The win applies to all list serialization, not just related lists (these lookups are on the render path of every collection).

Tests

Full package suites green: forest_admin_datasource_toolkit 463/0, forest_admin_agent 719/0, forest_admin_datasource_customizer 689/0. RuboCop clean.

🤖 Generated with Claude Code

Note

Cache collection name and primary-key lookups in datasource serialization

  • Memoizes the collection name in CollectionDecorator using a dedicated @memoized_name variable to avoid clashing with the base class @name.
  • Memoizes get_collection results by name in DatasourceDecorator via @collections_by_name, avoiding repeated child datasource lookups.
  • Caches primary keys per collection type in ForestSerializer using a class-level @@primary_keys hash, with a new reset_cache! method to clear it.
  • Calls ForestSerializer.reset_cache! during AgentFactory.reload! so stale primary-key data is cleared on agent reload.
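
The last two bullets can be illustrated with a small sketch of a class-level cache and its reset hook. `SketchSerializer`, `primary_keys_for` and the `resolver` lambda are hypothetical; only `@@primary_keys` and `reset_cache!` are names taken from the summary, and the real method bodies may differ.

```ruby
# Hypothetical serializer; only @@primary_keys and reset_cache! mirror the PR.
class SketchSerializer
  @@primary_keys = {}

  # Clears the class-level cache, e.g. from a reload hook.
  def self.reset_cache!
    @@primary_keys = {}
  end

  # Resolves primary keys once per type, then serves them from the cache.
  def self.primary_keys_for(type, resolver)
    @@primary_keys[type] ||= resolver.call(type)
  end
end

lookups = 0
schema = { 'transactions' => ['id'] }
resolver = lambda do |type|
  lookups += 1
  schema.fetch(type)
end

100.times { SketchSerializer.primary_keys_for('transactions', resolver) }
puts lookups # 1

# Simulated reload that changes the primary keys (assumed scenario).
schema['transactions'] = %w[account_id reference]

stale = SketchSerializer.primary_keys_for('transactions', resolver)
SketchSerializer.reset_cache!
fresh = SketchSerializer.primary_keys_for('transactions', resolver)

puts stale.inspect # ["id"]
puts fresh.inspect # ["account_id", "reference"]
```

A class variable survives anything that only rebuilds instances, which is why the cache keeps serving the old keys until `reset_cache!` runs.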

Macroscope summarized ba5fa45.

Serializing a list re-resolved collections per record by walking the
decorator stacks. Profiling a 100-row related list showed ~54% of wall
time in CollectionDecorator#name (re-chaining child_collection.name on
every call) and DatasourceDecorator#get_collection (re-descending the
datasource stack on every call), invoked per record and per relation by
the JSON:API serializer.

Memoize the immutable results:
- CollectionDecorator#name caches into @memoized_name (not @name, which
  the base Collection already holds from the mis-ordered super).
- DatasourceDecorator#get_collection caches the decorator by name.
- ForestSerializer#id caches primary keys per type instead of resolving
  the collection and its keys for every record.

No behavioural change. On the bench app, 100-row list serialization drops ~6-7.5x
(related list of 100 rows: 58ms to 9.6ms; root list of 100: 203ms to 27ms).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `@@primary_keys` memo added to `ForestSerializer#id` is a class
variable that outlives `AgentFactory#reload!` (which rebuilds the
datasource via `container_replace`). If a collection's primary keys
change across a reload, records would keep serializing stale JSON:API
`id` values until process restart.

Reset it in `reload!` next to the existing route-cache reset. The
other two memoizations (`CollectionDecorator#name`,
`DatasourceDecorator#get_collection`) live on the decorator instances,
which are rebuilt on reload, so they need no explicit reset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@matthv matthv merged commit ce3b83d into main Jul 3, 2026
48 checks passed
@matthv matthv deleted the perf/serializer-collection-lookup branch July 3, 2026 14:22
forest-bot added a commit that referenced this pull request Jul 3, 2026
## [1.35.2](v1.35.1...v1.35.2) (2026-07-03)

### Performance Improvements

* **datasource-toolkit:** cache collection name & lookups in serialization ([#329](#329)) ([ce3b83d](ce3b83d))
@forest-bot

🎉 This PR is included in version 1.35.2 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
