
fix(files_external): S3 folder mtime updated on every read-only access #59539

Draft
Antreesy wants to merge 5 commits into master from fix/noid/s3-mtime-debug


@Antreesy Antreesy commented Apr 9, 2026

  • Resolves: #

Summary

Browsing folders or downloading from a public share backed by S3 external storage incorrectly propagated a new mtime to the parent folders, even though nothing was written to storage. As a result, folders appeared to have been modified 'a few seconds ago'.

How to reproduce

  • Connect an S3 external storage and populate it (tested with rustfs S3)
  • Create a public share and verify that the folder mtimes have not changed
  • Reload the page and see the 'Modified' date updated
  • From the public link, open the share, read the folder content, and download a file
  • See the 'Modified' date updated again

Root cause (AI-generated summary)

The bug is a chain of three compounding issues, all in the S3 external storage path.

Background: why the watcher fires on every folder access

When Sabre DAV resolves a path like subfolder/file.txt, it calls View::getFileInfo() for each component ('', subfolder, subfolder/file.txt). For external storage, each call enters View::getCacheEntry(), which runs the watcher chain:

watcher->needsUpdate(path, cachedStorageMtime)
  └── storage->hasUpdated(path, cachedStorageMtime)  ← S3: always true for dirs
watcher->update(path, data)   ← rescans via Scanner::SCAN_SHALLOW
propagateChange(path, time()) ← updates ancestor mtimes in filecache

AmazonS3::hasUpdated() always returns true for directory paths because S3 has no real directory objects — only files and common prefixes. As a result the watcher fires on every folder access, and propagateChange(path, time()) updates all ancestor folders in the filecache with the current PHP timestamp.
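
The effect of that chain can be reproduced with a toy model. This is only a sketch under stated assumptions: the function names (`hasUpdatedS3`, `ancestorPaths`, `readOnlyAccess`) and the array-based cache are hypothetical stand-ins, not the Nextcloud watcher, scanner, or propagator.

```php
<?php
declare(strict_types=1);

// Toy model of the pre-fix behaviour. Every name here is illustrative.

/** S3 has no directory objects, so directories always report "updated". */
function hasUpdatedS3(bool $isDir): bool {
    return $isDir;
}

/** Returns all ancestors of a path, nearest first, ending with the root ''. */
function ancestorPaths(string $path): array {
    $out = [];
    while ($path !== '') {
        $pos = strrpos($path, '/');
        $path = $pos === false ? '' : substr($path, 0, $pos);
        $out[] = $path;
    }
    return $out;
}

/** A read-only directory access that still stamps every ancestor with $now. */
function readOnlyAccess(array &$cache, string $path, bool $isDir, int $now): void {
    if (!hasUpdatedS3($isDir)) {
        return;
    }
    // A shallow rescan would run here; assume it finds nothing new.
    foreach (ancestorPaths($path) as $parent) {
        $cache[$parent]['mtime'] = max($cache[$parent]['mtime'], $now);
    }
}

$cache = [
    '' => ['mtime' => 1000],
    'photos' => ['mtime' => 1000],
    'photos/2024' => ['mtime' => 1000],
];
readOnlyAccess($cache, 'photos/2024', true, 5000);
```

In this model nothing was written, yet the rows for `photos` and the root carry the access time afterwards, which matches the symptom in the summary.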


Bug 1 — Root path '.' vs '' mismatch in getDirectoryMetaData

AmazonS3::normalizePath('') returns '.' (used as the S3 object key sentinel for the bucket root). When getDirectoryMetaData('.') was called, it looked up getCache()->get('.'), but the filecache stores the storage root under the key ''. The lookup always returned false, so getDirectoryMetaData fell back to returning a fabricated array with 'storage_mtime' => time().

This meant storage_mtime for the root appeared to change on every scan (a new time() on every call), which made View::getCacheEntry always see a change and always fire propagateChange.

Fix: Use $cachePath = $path === '.' ? '' : $path before the cache lookup.
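
A minimal sketch of the key mapping, assuming an array stands in for the filecache. `cacheKeyFor` and `directoryMetaData` are hypothetical helpers written for illustration; only the `'.'` to `''` translation comes from the fix described above.

```php
<?php
declare(strict_types=1);

/** Maps the normalized S3 root sentinel '.' to the filecache root key ''. */
function cacheKeyFor(string $normalizedPath): string {
    return $normalizedPath === '.' ? '' : $normalizedPath;
}

/**
 * Returns the cached row if there is one. Only a directory that is truly
 * missing from the cache gets a fabricated storage_mtime.
 */
function directoryMetaData(array $cache, string $normalizedPath, int $now): array {
    $entry = $cache[cacheKeyFor($normalizedPath)] ?? null;
    if ($entry !== null) {
        return $entry;
    }
    return ['storage_mtime' => $now];
}

$cache = ['' => ['storage_mtime' => 1000]];
$first = directoryMetaData($cache, '.', 5000);
$second = directoryMetaData($cache, '.', 6000);
```

With the mapping in place, both calls return the cached root row, so the root's storage_mtime no longer depends on when the lookup happens.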

Bug 2 — Common::getMetaData discards storage_mtime for S3 directories

AmazonS3 does not override getMetaData, so the scanner uses Common::getMetaData, which does:

$data['mtime'] = $this->filemtime($path);   // reads stat()['mtime']
$data['storage_mtime'] = $data['mtime'];    // overwrites storage_mtime with mtime

For S3 virtual directories, mtime and storage_mtime are stored as separate fields in the filecache. mtime can be legitimately updated by mtime propagation (e.g. when a child file changes), while storage_mtime should only change when the storage itself reports a change. By conflating them, Common::getMetaData returns a storage_mtime equal to mtime. If mtime was ever updated by any prior propagation, it no longer matches the cached storage_mtime, so the scanner sees a false change, writes the corrupted value back into the cache, and the cycle repeats.

Fix: AmazonS3 now overrides getMetaData. For directory entries it reads storage_mtime directly from stat() (which reads the filecache) and restores it after parent::getMetaData has overwritten it.
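
The override can be sketched as follows. The two classes are simplified, hypothetical stand-ins (`CommonLikeStorage`, `S3LikeStorage`) and the `is_dir` flag is an assumption made for the example; the real getMetaData returns more fields and determines the entry type differently.

```php
<?php
declare(strict_types=1);

/** Stand-in for the base storage: copies mtime into storage_mtime. */
class CommonLikeStorage {
    /** @param array<string, array{mtime:int, storage_mtime:int, is_dir:bool}> $rows */
    public function __construct(protected array $rows) {
    }

    /** In this sketch stat() simply reads the cached row. */
    public function stat(string $path): array {
        return $this->rows[$path];
    }

    public function getMetaData(string $path): array {
        $data = [];
        $data['mtime'] = $this->stat($path)['mtime'];
        $data['storage_mtime'] = $data['mtime']; // conflates the two fields
        return $data;
    }
}

/** Stand-in for the S3 storage: restores storage_mtime for directories. */
class S3LikeStorage extends CommonLikeStorage {
    public function getMetaData(string $path): array {
        $data = parent::getMetaData($path);
        $stat = $this->stat($path);
        if ($stat['is_dir']) {
            // Undo the overwrite done by the parent implementation.
            $data['storage_mtime'] = $stat['storage_mtime'];
        }
        return $data;
    }
}

// A directory whose mtime was bumped by propagation but whose storage is unchanged.
$rows = ['photos' => ['mtime' => 5000, 'storage_mtime' => 1000, 'is_dir' => true]];
$common = (new CommonLikeStorage($rows))->getMetaData('photos');
$s3 = (new S3LikeStorage($rows))->getMetaData('photos');
```

Here the base class reports a storage_mtime that differs from the cached one, which the scanner would read as a change, while the subclass reports the cached value and the scan stays a no-op.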

Bug 3 — Unconditional propagateChange in View::getCacheEntry

Even when the watcher ran and the scanner found that nothing had changed, View::getCacheEntry unconditionally called propagateChange(internalPath, time()) whenever watcher->needsUpdate() returned true. For S3, that is every directory access. The propagator updates all ancestor rows in filecache with GREATEST(mtime, time()), stamping every parent folder with the access timestamp.

Fix: Capture storage_mtime before and after watcher->update(). Call propagateChange only when storageMtimeAfter !== storageMtimeBefore — i.e., only when the scanner actually detected a change on the backend.
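
A sketch of the guarded propagation, again on an array-based cache. `updateAndMaybePropagate` and the `$rescan` callable are hypothetical; the before/after comparison and the GREATEST-style update are the parts taken from the description above.

```php
<?php
declare(strict_types=1);

/**
 * Rescans $path and propagates to ancestors only if storage_mtime changed.
 * Returns true when propagation happened.
 */
function updateAndMaybePropagate(array &$cache, string $path, callable $rescan, int $now): bool {
    $before = $cache[$path]['storage_mtime'] ?? null;
    $cache[$path] = $rescan($path, $cache[$path] ?? []);
    $after = $cache[$path]['storage_mtime'] ?? null;

    if ($after === $before) {
        return false; // the backend reported nothing new
    }

    // Walk up to the root '' and apply GREATEST(mtime, $now) to each ancestor.
    $parent = $path;
    while ($parent !== '') {
        $pos = strrpos($parent, '/');
        $parent = $pos === false ? '' : substr($parent, 0, $pos);
        $cache[$parent]['mtime'] = max($cache[$parent]['mtime'] ?? 0, $now);
    }
    return true;
}

$cache = [
    '' => ['mtime' => 1000, 'storage_mtime' => 1000],
    'photos' => ['mtime' => 1000, 'storage_mtime' => 1000],
];

// A rescan that finds the same data leaves the root untouched.
$unchanged = updateAndMaybePropagate($cache, 'photos', fn (string $p, array $row): array => $row, 5000);
$rootAfterNoop = $cache['']['mtime'];

// A rescan that reports a newer storage_mtime triggers propagation.
$changed = updateAndMaybePropagate(
    $cache,
    'photos',
    fn (string $p, array $row): array => ['storage_mtime' => 2000] + $row,
    5000
);
```

The comparison is strict, mirroring the `!==` check named in the fix, so the model treats any differing value as a real backend change.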


Screenshots: Before / After (images not included)

TODO

  • ...

Checklist

AI (if applicable)

  • The content of this PR was partly or fully generated using AI

@Antreesy Antreesy added this to the Nextcloud 34 milestone Apr 9, 2026
@Antreesy Antreesy self-assigned this Apr 9, 2026
Antreesy added 4 commits April 9, 2026 16:51
…he key

Signed-off-by: Maksim Sukharev <antreesy.web@gmail.com>
Signed-off-by: Maksim Sukharev <antreesy.web@gmail.com>
… from stat

Signed-off-by: Maksim Sukharev <antreesy.web@gmail.com>
Signed-off-by: Maksim Sukharev <antreesy.web@gmail.com>
@Antreesy Antreesy force-pushed the fix/noid/s3-mtime-debug branch from 33f416a to 4930285 April 9, 2026 14:51
Signed-off-by: Maksim Sukharev <antreesy.web@gmail.com>