perf(datasource-toolkit): cache collection name & lookups in serialization #329
Merged
Serializing a list re-resolved collections per record by walking the decorator stacks. Profiling a 100-row related list showed ~54% of wall time in CollectionDecorator#name (re-chaining child_collection.name on every call) and DatasourceDecorator#get_collection (re-descending the datasource stack on every call), invoked per record and per relation by the JSON:API serializer.

Memoize the immutable results:
- CollectionDecorator#name caches into @memoized_name (not @name, which the base Collection already holds from the mis-ordered super).
- DatasourceDecorator#get_collection caches the decorator by name.
- ForestSerializer#id caches primary keys per type instead of resolving the collection and its keys for every record.

No behavioural change. On the bench app, list serialization drops ~7-10x (related list of 100 rows: 58ms to 9.6ms; root list of 100: 203ms to 27ms).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `@@primary_keys` memo added to `ForestSerializer#id` is a class variable that outlives `AgentFactory#reload!` (which rebuilds the datasource via `container_replace`). If a collection's primary keys change across a reload, records would keep serializing stale JSON:API `id` values until process restart. Reset it in `reload!` next to the existing route-cache reset.

The other two memoizations (`CollectionDecorator#name`, `DatasourceDecorator#get_collection`) live on the decorator instances, which are rebuilt on reload, so they need no explicit reset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bexchauveto approved these changes Jul 3, 2026
forest-bot added a commit that referenced this pull request Jul 3, 2026
## [1.35.2](v1.35.1...v1.35.2) (2026-07-03)

### Performance Improvements

* **datasource-toolkit:** cache collection name & lookups in serialization ([#329](#329)) ([ce3b83d](ce3b83d))
🎉 This PR is included in version 1.35.2 🎉

The release is available on GitHub release.

Your semantic-release bot 📦🚀
Context
Serializing a collection list re-resolves collections per record and per relation by walking the decorator stacks. Profiling a 100-row related list (`.../relationships/transactions?page[size]=100`) with stackprof showed ~54% of request wall-time in:

- `ForestAdminDatasourceToolkit::Decorators::CollectionDecorator#name` (46% self): `@child_collection.name` re-chains the whole collection-decorator stack on every call (~89k calls over the burst);
- `ForestAdminDatasourceToolkit::Decorators::DatasourceDecorator#get_collection`: re-descends the datasource-decorator stack on every call and recomputes `collection.name` to key its cache.

The JSON:API serializer (`ForestSerializer#id`, `#relationships`, `#attributes`) calls these once per record, so the cost is O(records × relations × decorator-depth). SQL for the same request is ~1ms.

Change
Memoize results that are immutable after the datasource is built:
- `CollectionDecorator#name` caches into `@memoized_name`. (Not `@name`: the base `Collection#initialize(datasource, name, …)` sets `@name`, and `CollectionDecorator#initialize(child_collection, datasource)` calls `super` without parens, so `@name` already holds the datasource object, and reusing it makes `name` return the datasource. A `rubocop:disable Naming/MemoizedInstanceVariableName` comment documents this.)
- `DatasourceDecorator#get_collection` caches the decorator by the requested name (`@collections_by_name`), keeping the existing `@decorators`-by-resolved-name behaviour so a collection reachable under multiple names still gets a single decorator instance.
- `ForestSerializer#id` caches primary keys per type (`@@primary_keys`) instead of resolving the collection and its keys for every record.

No behavioural change: names and primary keys are fixed after build. The decorator caches live on the decorator instances, which are rebuilt on datasource reload; the class-level `@@primary_keys` cache outlives a reload and is cleared explicitly by `ForestSerializer.reset_cache!` in `AgentFactory.reload!`.
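The reason for the dedicated `@memoized_name` can be reproduced in isolation. The following is a minimal sketch, not the toolkit code: `Collection`, `CollectionDecorator` and `CountingCollection` here are hypothetical stand-ins that only copy the constructor shapes quoted above, and the call counter exists purely for illustration.

```ruby
# Hypothetical stand-in for the base collection; only the argument order
# (datasource, name) is taken from the description above.
class Collection
  attr_reader :datasource, :name

  def initialize(datasource, name)
    @datasource = datasource
    @name = name
  end
end

# Hypothetical stand-in for the decorator.
class CollectionDecorator < Collection
  def initialize(child_collection, datasource)
    # Bare `super` forwards (child_collection, datasource) unchanged, so the
    # base class stores the datasource object in @name.
    super
    @child_collection = child_collection
    @datasource = datasource
  end

  # Memoize into a dedicated ivar. `@name ||= ...` would short-circuit on the
  # datasource object already sitting in @name and return it.
  def name
    @memoized_name ||= @child_collection.name
  end
end

# Illustration-only leaf that counts how often its name is read.
class CountingCollection < Collection
  attr_reader :name_calls

  def initialize(datasource, name)
    super
    @name_calls = 0
  end

  def name
    @name_calls += 1
    @name
  end
end

datasource = Object.new
leaf = CountingCollection.new(datasource, 'transactions')
inner = CollectionDecorator.new(leaf, datasource)
decorated = CollectionDecorator.new(inner, datasource)

# Repeated reads walk the stack once, then hit the memo.
3.times { decorated.name }
```

In this sketch the leaf is read once no matter how many times `decorated.name` is called, while `@name` on the decorator still holds the datasource object, which is why it cannot double as the memo.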
Measured effect
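Bench app (Postgres, 5000 transactions), median of 7, warm. Same JSON output. The commit message reports the related list of 100 rows dropping from 58ms to 9.6ms and the root list of 100 from 203ms to 27ms.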
The win applies to all list serialization, not just related lists (these lookups are on the render path of every collection).
Tests
Full package suites green:
`forest_admin_datasource_toolkit` 463/0, `forest_admin_agent` 719/0, `forest_admin_datasource_customizer` 689/0. RuboCop clean.

🤖 Generated with Claude Code
Note
Cache collection name and primary-key lookups in datasource serialization
- Memoizes `name` in `CollectionDecorator` using a dedicated `@memoized_name` variable to avoid clashing with the base class `@name`.
- Caches `get_collection` results by name in `DatasourceDecorator` via `@collections_by_name`, avoiding repeated child datasource lookups.
- Caches primary keys in `ForestSerializer` using a class-level `@@primary_keys` hash, with a new `reset_cache!` method to clear it.
- Calls `ForestSerializer.reset_cache!` during `AgentFactory.reload!` so stale primary-key data is cleared on agent reload.

Macroscope summarized ba5fa45.
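Why the class-level cache needs an explicit reset can be shown with a small sketch. This is an assumption-laden illustration rather than the agent's serializer: `SketchSerializer`, its schema hash, and the `|` separator for composite keys are hypothetical; only the `@@primary_keys` and `reset_cache!` names come from the PR.

```ruby
# Hypothetical, simplified serializer. The schema hash stands in for
# "resolve the collection and read its primary keys".
class SketchSerializer
  @@primary_keys = {}

  # Clears the class-level memo, as a reload hook would.
  def self.reset_cache!
    @@primary_keys = {}
  end

  def initialize(schema)
    @schema = schema # assumed shape: { 'type' => ['pk_field', ...] }
  end

  # Resolves primary keys once per type, then reuses them for every record.
  def id(type, record)
    keys = (@@primary_keys[type] ||= @schema.fetch(type))
    keys.map { |key| record[key] }.join('|')
  end
end

record = { 'id' => 7, 'ref' => 'ab' }

# First build: primary key is 'id'.
before = SketchSerializer.new({ 'transactions' => ['id'] }).id('transactions', record)

# Simulated reload with a changed primary key but no reset: the class
# variable survives the new instance, so the old key is still used.
stale = SketchSerializer.new({ 'transactions' => ['ref'] }).id('transactions', record)

# With the reset, the next call re-resolves the keys.
SketchSerializer.reset_cache!
fresh = SketchSerializer.new({ 'transactions' => ['ref'] }).id('transactions', record)
```

Instance-level memos would not show this problem in the sketch, since each new instance starts with empty state; that matches the observation that the decorator caches need no explicit reset.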