fix(FileStorage): harden GC against non-cache files (#45)#86
fix(FileStorage): harden GC against non-cache files (#45)#86OndraDol wants to merge 2 commits intonette:masterfrom
Conversation
…-cache files (nette#45) Replace unguarded unserialize() + assert(is_array()) with @unserialize() and is_array() runtime validation in readMetaAndLock(). Non-cache files (e.g. Latte lock files matching the _* pattern) now return null instead of crashing with "unserialize(): Error at offset". Add defensive @unserialize() in readData() to handle corrupted serialized data from truncated writes.
…tte#45) Add FileStorage.gc-foreign-files.phpt: - Verifies clean() does not crash when non-cache files (lock files, binary files, empty files) matching _* pattern exist in cache dir - Tests both collector mode (skip foreign files) and Cache::All mode (delete everything) Add FileStorage.corrupted.phpt: - Verifies graceful handling of corrupted metadata, corrupted data payload, truncated files, and all-zero files Enhance FileStorage.stress.phpt: - Place foreign files in cache dir during stress test to verify they don't interfere with concurrent cache operations
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8aa08c70a9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| $data === false => null, | ||
| empty($meta[self::MetaSerialized]) => $data, | ||
| default => unserialize($data), | ||
| default => @unserialize($data), // @ - data may be corrupted by a truncated write |
There was a problem hiding this comment.
Treat corrupted serialized payload as cache miss
When MetaSerialized is set, readData() now returns @unserialize($data) directly, so malformed payloads resolve to false instead of a miss. In Cache::load(), only null is treated as missing ($data === null), so a corrupted file can become a persistent cache hit with value false, skipping regeneration and returning incorrect data to callers. Please map unserialize failure to null (while preserving legitimate serialized false, e.g. b:0;).
Useful? React with 👍 / 👎.
Summary
Fix
clean()crashing withunserialize(): Error at offsetwhen non-cache files (e.g. Latte lock files) exist in the cache directory with names matching the_*pattern.Root cause
FileStorage::readMetaAndLock()callsunserialize()on arbitrary file contents without error handling. Theclean()method usesFinder::find('_*')to scan the cache directory, which matches all underscore-prefixed files — not just cache entries. When a non-cache file like_template.php.lockwith content0942e7is encountered:(int) "0942e7"→ non-zero sizestream_get_contents()reads garbage bytesunserialize()fails with "Error at offset X of Y bytes"This was diagnosed by @martinbohmcz in #45 (comment) — after renaming the directory to avoid the underscore prefix, the errors stopped.
Solution
readMetaAndLock()— Replace unguardedunserialize()+assert(is_array())with@unserialize()+is_array()runtime validation. Non-cache files now returnnullinstead of crashing. The@error suppression is consistent with other I/O operations in the same method (e.g.@fopenon line 303).readData()— Add defensive@unserialize()for corrupted serialized data payloads (e.g. from truncated writes after process crashes).What this does NOT change
The write path and locking model remain untouched. The current header-last write strategy (null bytes → data → seek → header) is a deliberate design that uses the header as a commit marker. True atomic writes via temp file + rename would require rethinking the stampede prevention locking model (currently file-handle based), which is a separate concern.
Changes by file
src/Caching/Storages/FileStorage.php— defensive unserialize inreadMetaAndLock()andreadData()tests/Storages/FileStorage.gc-foreign-files.phpt— new: regression test for GC with foreign files (6 scenarios: digit-prefixed content, binary, empty, zero-header, fake meta, nested subdirectory)tests/Storages/FileStorage.corrupted.phpt— new: graceful handling of corrupted metadata, corrupted data payload, truncated files, all-zero filestests/Storages/FileStorage.stress.phpt— enhanced: foreign files placed in cache dir during 1000-iteration stress testTesting
Related issue
Closes #45