Add cached find support#894
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a generic ChangesCache-aside helper and find cache-key infrastructure
Sequence Diagram(s)sequenceDiagram
participant Client
participant Database
participant Cache as Cache Adapter
participant Callback
Client->>Database: cachedFind(collection, queries)
Database->>Database: getFindCacheKey()
Database->>Database: getFindCacheField()
Database->>Cache: load(key, hash)
alt Cache hit and value exists
Cache-->>Database: cached document arrays
else Cache miss or false
Cache-->>Database: null
Database->>Callback: invoke callback via withCache()
Callback->>Database: find() results
end
Database->>Database: recreate documents + authorize
Database-->>Client: documents
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR introduces a generic
Confidence Score: 3/5The read path is correct, but purgeQueryCache() drops all hash fields under the shared key, undoing the per-database read isolation that getQueryCacheField() establishes. purgeQueryCache() calls cache->purge($key) without a hash argument, which drops every field under the key regardless of which database wrote it. getQueryCacheField() deliberately includes the database name in the field hash for per-database isolation on reads, but that isolation is broken on every purge. Any caller running purgeQueryCache in one database context silently invalidates cached results for all other databases sharing the same namespace/tenant/hostname/collection. src/Database/Database.php — specifically purgeQueryCache() and getQueryCacheKey() Important Files Changed
Reviews (38): Last reviewed commit: "Rename list cache helpers to query cache" | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 8625-8642: The cache hit path in the conditional block starting
with if ($cached !== null && $cached !== false && \is_array($cached)) only wraps
cached arrays into Document instances using createDocumentInstance, but does not
hydrate relationship fields that may have been serialized as arrays. This
results in different document shapes being returned compared to the cache miss
path which uses find() that returns fully processed documents. Either ensure the
cached hit path recursively hydrates relationship fields to match the shape
returned by find(), or add logic to bypass caching for finds that request
relationship-resolved data to avoid returning inconsistent types between cache
hits and misses.
- Around line 9563-9564: The cache key generation logic in the findCached method
is incorrectly dropping the serialized queries when a $key is provided. Change
the ternary expression that sets the queries field to always call
serializeFindCacheQueries($queries) instead of returning null when $key is set.
The caller key should add an additional cache dimension alongside the query
fingerprint, not replace it, so both the serialized queries and the key must be
included in the cache identifier to prevent different queries with the same key
from returning incorrect cached results.
- Around line 9572-9584: The getFindCacheSchemaHash method currently only
includes collection attributes and indexes in the hash calculation, but the
find() authorization check also validates collection permissions and
documentSecurity settings. When collection authorization policies change, cached
results remain valid incorrectly because the schema hash doesn't reflect these
changes. Modify the getFindCacheSchemaHash method to include the collection's
permissions and documentSecurity flag in the hash payload alongside the existing
attributes and indexes. Additionally, replace the MD5 hash algorithm with a
stronger digest function such as SHA256 to address the static analysis flag
regarding weak hashing.
In `@tests/unit/FindCacheTest.php`:
- Around line 140-144: The `HashMemoryCache::save` method incorrectly prevents
caching of empty arrays because the condition `empty($data)` returns true for
empty arrays and rejects them. Modify the validation logic to only reject empty
strings while allowing empty arrays to be cached as valid results. Change the
condition to specifically check if `$data` is an empty string or if `$key` is
empty, rather than using `empty()` which treats empty arrays the same as empty
strings.
- Around line 83-99: The test testFindCachedTriggersFindEventOnCacheHit does not
actually verify a cache hit because both cache hits and cache misses emit
EVENT_DOCUMENT_FIND. To fix this, add data mutation between the two findCached
calls to force a true cache hit scenario. After the first findCached call on
line 92, modify the seeded project data in the database, then verify that the
second findCached call on line 93 returns the stale cached result from the first
call rather than the modified data, which would definitively prove the second
call was a cache hit.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d0eed82b-9eca-489f-a0fd-19748bc2694c
📒 Files selected for processing (4)
src/Database/Database.phptests/e2e/Adapter/Scopes/DocumentTests.phptests/unit/CacheKeyTest.phptests/unit/FindCacheTest.php
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/Database/Database.php (2)
8611-8617:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winValidate queries before serializing the cache key.
findCached()reachesserializeQueriesForFindCache()beforefind()runscheckQueryTypes(), so malformed query arrays can fail outside the normalQueryExceptionpath on cached calls.Proposed fix
if ($collectionDocument->isEmpty()) { throw new NotFoundException('Collection not found'); } + $this->checkQueryTypes($queries); + [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 8611 - 8617, In the findCached() method, validate the queries before they are serialized for the cache key. Call checkQueryTypes() on the queries parameter before invoking getFindCacheKeys(), which calls serializeQueriesForFindCache(). This ensures malformed query arrays are caught early with proper QueryException handling rather than failing during serialization.
8617-8639:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid serving document-security caches after permission revocation.
The cache hit path returns serialized documents without re-reading current document permissions. If a cached document later loses
$permissions, the same role can still receive it until the find TTL expires. Until permission-changing writes purge affected find variants, bypass caching fordocumentSecuritycollections.Conservative fix
if ($collectionDocument->isEmpty()) { throw new NotFoundException('Collection not found'); } + if ($collectionDocument->getAttribute('documentSecurity', false)) { + return $this->find($collectionDocument->getId(), $queries, $forPermission); + } + [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 8617 - 8639, The current caching logic in the find method returns cached documents without re-validating permissions, which creates a security vulnerability for documentSecurity collections where permissions may have been revoked. Add a check using the $collectionDocument to determine if the collection is a documentSecurity collection, and if so, skip the entire cache loading and cache return logic (the try-catch block for $this->cache->load() and the conditional check for $cached !== null) to force a fresh permission check on every query. This ensures that permission changes are respected immediately rather than serving stale cached results until the TTL expires.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/Database/Database.php`:
- Around line 8611-8617: In the findCached() method, validate the queries before
they are serialized for the cache key. Call checkQueryTypes() on the queries
parameter before invoking getFindCacheKeys(), which calls
serializeQueriesForFindCache(). This ensures malformed query arrays are caught
early with proper QueryException handling rather than failing during
serialization.
- Around line 8617-8639: The current caching logic in the find method returns
cached documents without re-validating permissions, which creates a security
vulnerability for documentSecurity collections where permissions may have been
revoked. Add a check using the $collectionDocument to determine if the
collection is a documentSecurity collection, and if so, skip the entire cache
loading and cache return logic (the try-catch block for $this->cache->load() and
the conditional check for $cached !== null) to force a fresh permission check on
every query. This ensures that permission changes are respected immediately
rather than serving stale cached results until the TTL expires.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 29c2ae6a-250e-4626-8ae6-59effc86c769
📒 Files selected for processing (5)
src/Database/Database.phptests/e2e/Adapter/Base.phptests/e2e/Adapter/RedisTest.phptests/e2e/Adapter/Scopes/DocumentTests.phptests/unit/FindCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/e2e/Adapter/Base.php
- tests/e2e/Adapter/RedisTest.php
- tests/e2e/Adapter/Scopes/DocumentTests.php
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Line 8659: The condition on line 8659 checks `if ($touchOnHit &&
!$hasExpiredDocuments)`, but the `!$hasExpiredDocuments` part is redundant
because the code already guarantees this value through an early return statement
in the preceding lines (8646-8657). Remove the `&& !$hasExpiredDocuments` clause
from the condition, leaving only `if ($touchOnHit)`, to eliminate the
always-true condition that is causing the PHPStan error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 647e586e-bf40-4233-b042-568a6b0b5436
📒 Files selected for processing (4)
src/Database/Database.phptests/e2e/Adapter/Scopes/DocumentTests.phptests/unit/CacheKeyTest.phptests/unit/FindCacheTest.php
💤 Files with no reviewable changes (1)
- tests/unit/CacheKeyTest.php
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/e2e/Adapter/Scopes/DocumentTests.php
- tests/unit/FindCacheTest.php
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/Database/Database.php (1)
9567-9572:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winStrengthen find-cache fingerprint to prevent stale authorization reuse.
getFindCacheField()/getFindCacheSchemaHash()still use MD5 and the schema hash excludes collection policy fields. Since cache hits bypassfind()checks, a collection policy change ($permissions/documentSecurity) can keep an old cache field valid for the same role hash until TTL expiry.Suggested patch
- return \sprintf( - '%s:%s:%s:%s', - $this->getFindCacheSchemaHash($collection), - \md5(\json_encode($roles) ?: ''), - \md5(\json_encode($queryPayload) ?: ''), - $field, - ); + return \sprintf( + '%s:%s:%s:%s', + $this->getFindCacheSchemaHash($collection), + \hash('sha256', \json_encode($roles) ?: ''), + \hash('sha256', \json_encode($queryPayload) ?: ''), + $field, + ); @@ - return \md5( - \json_encode($collection->getAttribute('attributes', [])) - . \json_encode($collection->getAttribute('indexes', [])) - ); + return \hash('sha256', \json_encode([ + 'attributes' => $collection->getAttribute('attributes', []), + 'indexes' => $collection->getAttribute('indexes', []), + 'permissions' => $collection->getAttribute('$permissions', []), + 'documentSecurity' => $collection->getAttribute('documentSecurity', false), + ]) ?: '');Also applies to: 9621-9630
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Database/Database.php` around lines 9567 - 9572, The find-cache fingerprint in getFindCacheField() currently uses weak MD5 hashing and the schema hash from getFindCacheSchemaHash() excludes collection policy fields (permissions and documentSecurity). Replace the MD5 hash algorithm with a stronger alternative like SHA-256 for both the roles and queryPayload hashes in the cache key. Additionally, update the schema hash calculation to include collection policy fields so that changes to permissions or documentSecurity invalidate the cache and prevent stale authorization reuse. Apply these changes consistently across both getFindCacheField() and getFindCacheSchemaHash() methods.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/Database/Database.php`:
- Around line 9567-9572: The find-cache fingerprint in getFindCacheField()
currently uses weak MD5 hashing and the schema hash from
getFindCacheSchemaHash() excludes collection policy fields (permissions and
documentSecurity). Replace the MD5 hash algorithm with a stronger alternative
like SHA-256 for both the roles and queryPayload hashes in the cache key.
Additionally, update the schema hash calculation to include collection policy
fields so that changes to permissions or documentSecurity invalidate the cache
and prevent stale authorization reuse. Apply these changes consistently across
both getFindCacheField() and getFindCacheSchemaHash() methods.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1171458a-1aba-44ff-b664-ab2433babf95
📒 Files selected for processing (3)
src/Database/Database.phptests/unit/CacheKeyTest.phptests/unit/ListCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/CacheKeyTest.php
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/ListCacheTest.php (1)
157-157:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMake hash-field assertion order-insensitive to avoid flaky tests.
$cache->list('key')ordering may vary by adapter implementation; asserting exact sequence can create nondeterministic failures. Prefer canonicalizing comparison or sorting before assert.Suggested test adjustment
- $this->assertSame(['first-field', 'second-field'], $cache->list('key')); + $this->assertEqualsCanonicalizing(['first-field', 'second-field'], $cache->list('key'));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/ListCacheTest.php` at line 157, The assertSame assertion on the $cache->list('key') call is comparing arrays in strict order, which can cause flaky tests since different cache adapter implementations may return hash fields in varying orders. Modify the assertion to be order-insensitive by either sorting both the expected and actual arrays before comparison using sort(), or use an assertion method like assertEqualsCanonicalizing that performs order-insensitive comparison. This ensures the test validates the correct fields are returned regardless of the order in which they appear.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 8588-8603: The cache key hash calculation in the withCache call is
not including the $forPermission parameter, which means different permission
levels (read, update, delete) will share the same cache entry. Update the
getFindCacheField method call to include $forPermission as an additional
parameter in the hash calculation so that the full authorization context is
properly fingerprinted. This ensures that a cached result for read permissions
cannot be incorrectly reused for update or delete permission requests with the
same query and roles.
In `@tests/unit/ListCacheTest.php`:
- Around line 219-221: The test for cachedFind method (lines 219-221 and similar
occurrences at 228-230 and 234-239) assumes deterministic ordering of results
without explicitly specifying a sort order, which can cause instability across
different database adapters. Either add an explicit order clause to the queries
parameter passed to cachedFind (for example, sorting by ID) to ensure consistent
ordering, or refactor the assertions to compare results canonically without
relying on index position (such as extracting and comparing IDs in a set-based
manner rather than checking first[0]).
---
Outside diff comments:
In `@tests/unit/ListCacheTest.php`:
- Line 157: The assertSame assertion on the $cache->list('key') call is
comparing arrays in strict order, which can cause flaky tests since different
cache adapter implementations may return hash fields in varying orders. Modify
the assertion to be order-insensitive by either sorting both the expected and
actual arrays before comparison using sort(), or use an assertion method like
assertEqualsCanonicalizing that performs order-insensitive comparison. This
ensures the test validates the correct fields are returned regardless of the
order in which they appear.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 45e5f395-111c-4d05-82c3-18f6f0cc29eb
📒 Files selected for processing (2)
src/Database/Database.phptests/unit/ListCacheTest.php
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/ListCacheTest.php (1)
515-559:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
testCachedFindFiltersTtlExpiredDocumentsOnCacheHitno longer validates the cache-hit path.At Line 550, the document is already expired before the first
cachedFind(), so it is filtered out on miss and cached as[]. The second call then just reads[], which doesn’t verify hit-time TTL filtering logic. Seed the cache with an expired payload (or otherwise force expiry between miss and hit) to exercise the intended path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/ListCacheTest.php` around lines 515 - 559, The test `testCachedFindFiltersTtlExpiredDocumentsOnCacheHit` is not properly validating the cache-hit filtering path because the document expires before the first `cachedFind()` call, resulting in an empty cached result that the second call just retrieves. To test the intended behavior of TTL filtering on cache hits, modify the test so that the document is not expired when the first `cachedFind()` call is made (so it gets cached with actual data), then force the document to expire before the second `cachedFind()` call (either by advancing time, creating a new document with a later expiration, or directly seeding the cache with an expired payload). This ensures the second call exercises the logic that filters expired documents from a cached result.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@tests/unit/ListCacheTest.php`:
- Around line 515-559: The test
`testCachedFindFiltersTtlExpiredDocumentsOnCacheHit` is not properly validating
the cache-hit filtering path because the document expires before the first
`cachedFind()` call, resulting in an empty cached result that the second call
just retrieves. To test the intended behavior of TTL filtering on cache hits,
modify the test so that the document is not expired when the first
`cachedFind()` call is made (so it gets cached with actual data), then force the
document to expire before the second `cachedFind()` call (either by advancing
time, creating a new document with a later expiration, or directly seeding the
cache with an expired payload). This ensures the second call exercises the logic
that filters expired documents from a cached result.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 57a03272-7eb5-437b-afa1-ef6c4c796370
📒 Files selected for processing (3)
src/Database/Database.phptests/unit/CacheKeyTest.phptests/unit/ListCacheTest.php
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 8608-8615: The cachedFind method caches document payloads without
filtering TTL-expired documents, causing expired documents to remain visible
until the cache entry expires. Filter out TTL-expired documents in two places
within cachedFind: first, in the callback function shown (after the find call
returns documents but before they are array-mapped and stored in cache), and
second, in the cache hit scenario (around lines 8635-8647, before returning the
cached payload). Use the isTtlExpired method to identify and exclude expired
documents from both the cache miss result being saved and the cache hit payload
being returned.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cc6fd113-7a02-40de-8fe9-c1c00576e805
📒 Files selected for processing (3)
src/Database/Database.phptests/unit/CacheKeyTest.phptests/unit/ListCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/CacheKeyTest.php
Summary
Database::withCache()with optional genericencodeanddecodecallbackswithCache()value-agnostic by default: existing callers continue to cache and return values as-isdecode()returnsfalse, so callers can recover from invalid or stale cached values without duplicating purge/refetch logicDatabase::cachedFind()wrapper in favor of explicit caller-ownedwithCache()compositiongetQueryCacheKey()getQueryCacheField()restoreQueryCacheDocuments()purgeQueryCache()Query::orderRandom()cache bypass as caller policy: callers should skipwithCache()for random orderingUsage
Callers cache query-derived values by composing the Database-owned key helpers, generic cache transform hooks, and optional document restore helper:
Invalidate cached query fields for the same collection namespace after writes with:
For scalar values and arrays made only of scalar/list/map data, callers can omit
encodeanddecodeand usewithCache()as before.Cache Shape
Query-cache entries use:
The hash field is:
The field can distinguish cached value types such as
documents,total,count, orsumunder the same query-state hash.Tests
composer formatcomposer lintvendor/bin/phpunit tests/unit/QueryCacheTest.php tests/unit/CacheKeyTest.phpcomposer checkis currently blocked locally by the existingConnection.phpPHPStan issue whereSwoole\Database\DetectsLostConnectionsis not discovered by the local analyzer setup.