fix(files_external): S3 folder mtime updated on every read-only access #59539
Draft
Signed-off-by: Maksim Sukharev <antreesy.web@gmail.com>
Summary
Browsing folders or downloading from a public share backed by S3 external storage incorrectly updated the mtime of the parent folders, even though no write operations reached the storage. This made folders appear modified 'a few seconds ago'.
How to reproduce
Root cause (AI-generated summary)
The bug is a chain of three compounding issues, all in the S3 external storage path.
Background: why the watcher fires on every folder access
When Sabre DAV resolves a path like `subfolder/file.txt`, it calls `View::getFileInfo()` for each component (`''`, `subfolder`, `subfolder/file.txt`). For external storage, each call enters `View::getCacheEntry()`, which runs the watcher chain.

`AmazonS3::hasUpdated()` always returns `true` for directory paths because S3 has no real directory objects, only files and common prefixes. As a result the watcher fires on every folder access, and `propagateChange(path, time())` updates all ancestor folders in the filecache with the current PHP timestamp.

Bug 1: Root path `'.'` vs `''` mismatch in `getDirectoryMetaData`

`AmazonS3::normalizePath('')` returns `'.'` (used as the S3 object key sentinel for the bucket root). When `getDirectoryMetaData('.')` was called, it looked up `getCache()->get('.')`, but the filecache stores the storage root under the key `''`. The lookup always returned `false`, so `getDirectoryMetaData` fell back to returning a fabricated array with `'storage_mtime' => time()`.

This meant `storage_mtime` for the root appeared to change on every scan (a new `time()` on every call), which made `View::getCacheEntry` always see a change and always fire `propagateChange`.

Fix: Use `$cachePath = $path === '.' ? '' : $path` before the cache lookup.

Bug 2: `Common::getMetaData` discards `storage_mtime` for S3 directories

`AmazonS3` does not override `getMetaData`, so the scanner uses `Common::getMetaData`, which reports `storage_mtime` as a copy of `mtime`.

For S3 virtual directories, `mtime` and `storage_mtime` are stored as separate fields in the filecache. `mtime` can be legitimately updated by mtime propagation (e.g. when a child file changes), while `storage_mtime` should only change when the storage itself reports a change. By conflating them, `Common::getMetaData` returns a `storage_mtime` equal to `mtime`. If `mtime` was ever updated by any prior propagation, it no longer matches the cached `storage_mtime`, so the scanner sees a false change, writes the corrupted value back into the cache, and the cycle repeats.

Fix: `AmazonS3` now overrides `getMetaData`. For directory entries it reads `storage_mtime` directly from `stat()` (which reads the filecache) and restores it after `parent::getMetaData` has overwritten it.

Bug 3: Unconditional `propagateChange` in `View::getCacheEntry`

Even when the watcher runs and the scanner finds nothing has changed, `View::getCacheEntry` unconditionally called `propagateChange(internalPath, time())` whenever `watcher->needsUpdate()` returned `true`. For S3, this is every directory access. The propagator updates all ancestor rows in the filecache with `GREATEST(mtime, time())`, stamping every parent folder with the access timestamp.

Fix: Capture `storage_mtime` before and after `watcher->update()`. Call `propagateChange` only when `storageMtimeAfter !== storageMtimeBefore`, i.e. only when the scanner actually detected a change on the backend.
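The three fixes described above can be sketched together in a few lines. This is a minimal, self-contained illustration and not the code from this PR: `$fileCache`, `cacheKeyFor()`, `commonMetaData()`, `s3DirectoryMetaData()` and `shouldPropagate()` are hypothetical stand-ins for the filecache, the root-key translation, the `Common::getMetaData` behaviour as described, the `AmazonS3` override, and the guard in `View::getCacheEntry`. The timestamps are made-up values.

```php
<?php
// Hypothetical stand-in for the filecache (path => row); the storage root is keyed by ''.
// mtime was already bumped by an earlier propagation, storage_mtime was not.
$fileCache = [
    '' => ['mtime' => 1700000500, 'storage_mtime' => 1700000000],
    'subfolder' => ['mtime' => 1700000500, 'storage_mtime' => 1700000000],
];

// Bug 1: translate the S3 root sentinel '.' into the filecache root key ''.
function cacheKeyFor(string $path): string {
    return $path === '.' ? '' : $path;
}

// Models the conflation described for Common::getMetaData: storage_mtime is a copy of mtime.
function commonMetaData(array $row): array {
    return ['mtime' => $row['mtime'], 'storage_mtime' => $row['mtime']];
}

// Bug 2: after the generic metadata is built, restore the cached storage_mtime.
function s3DirectoryMetaData(array $fileCache, string $path): ?array {
    $row = $fileCache[cacheKeyFor($path)] ?? null;
    if ($row === null) {
        return null; // unknown directory: nothing to preserve in this sketch
    }
    $data = commonMetaData($row);
    $data['storage_mtime'] = $row['storage_mtime'];
    return $data;
}

// Bug 3: propagate only when storage_mtime differs across the watcher update.
function shouldPropagate(?int $storageMtimeBefore, ?int $storageMtimeAfter): bool {
    return $storageMtimeAfter !== $storageMtimeBefore;
}

$before = $fileCache[cacheKeyFor('.')]['storage_mtime'];
$after = s3DirectoryMetaData($fileCache, '.')['storage_mtime']; // read-only rescan
var_dump(shouldPropagate($before, $after)); // bool(false): no ancestors are touched
```

Without the restore step in `s3DirectoryMetaData()`, `$after` would equal the propagated `mtime`, the comparison would report a change, and every ancestor would be stamped again.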