fix: reconcile missing zoekt shards on startup - Fixes #1210 #1246
Shashank200345 wants to merge 2 commits into
No actionable comments were generated in the recent review.
Walkthrough
RepoIndexManager.startScheduler() now calls a new reconcileMissingShards() during startup. It reads on-disk shard filenames, queries the DB for all repos marked indexed, and resets indexedAt to null for repos missing shards, so that stale DB state no longer causes re-indexing to be skipped.
Changes: Missing shard reconciliation
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/repoIndexManager.ts`:
- Around line 784-793: reconcileMissingShards currently unconditionally clears
indexedAt for DB entries missing shard files, which loops when a repo
legitimately produced zero shards; modify the logic so you only clear indexedAt
when the missing-shard condition is truly stale by (a) adding a durable
marker/enum on the repo (e.g., lastIndexStatus or zeroShardFlag) that
indexGitRepository sets to "success_no_shard" when it completes with zero
emitted shards, or (b) requiring that the last successful indexedAt is older
than a configurable TTL and there is no recorded successful index result; update
reconcileMissingShards to check that marker/TTL before nulling indexedAt (and
update indexGitRepository to set the marker when it finishes with zero shards).
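The check described above could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the `RepoIndexState` shape, the `lastIndexStatus` values, the `shouldResetIndexedAt` name, and the `ttlMs` parameter are all hypothetical, and the sketch combines option (a) and option (b) into one predicate purely for demonstration.

```typescript
// Hypothetical marker values; the real schema is not shown in this PR.
type LastIndexStatus = "success" | "success_no_shard" | "failed" | null;

// Hypothetical view of the repo fields the reconciliation would need.
interface RepoIndexState {
  id: number;
  indexedAt: Date | null;
  lastIndexStatus: LastIndexStatus; // assumed to be written by indexGitRepository
}

// Returns true only when a missing shard looks like stale DB state.
function shouldResetIndexedAt(
  repo: RepoIndexState,
  hasShardOnDisk: boolean,
  now: Date,
  ttlMs: number, // assumed configurable TTL, in milliseconds
): boolean {
  // Nothing to reset: the scheduler will already pick this repo up.
  if (repo.indexedAt === null) return false;
  // A shard exists, so the DB state matches the disk.
  if (hasShardOnDisk) return false;
  // Option (a): the last run legitimately produced zero shards.
  if (repo.lastIndexStatus === "success_no_shard") return false;
  // Option (b): only treat the entry as stale once it is older than the TTL.
  return now.getTime() - repo.indexedAt.getTime() > ttlMs;
}
```

Keeping the decision in a pure function like this makes the zero-shard case easy to unit test without touching the disk or the database.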
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 83a82f54-566f-4f76-bd71-59bf60874d00
📒 Files selected for processing (1)
packages/backend/src/repoIndexManager.ts
Hi, I've addressed the CodeRabbit feedback by adding a TTL check
When the zoekt index directory gets wiped — say during a pod restart
with ephemeral storage — the database still holds onto the old
indexedAt timestamps. The scheduler sees those timestamps and assumes
everything is fine, so it never triggers a re-index. Search returns
nothing but the UI still shows the repo as Completed. No error, no
warning, just silently broken.
cleanupOrphanedDiskResources() already handles one side of this —
it cleans up shard files on disk that have no matching DB record.
But nobody was checking the other direction: a repo marked as indexed
in the DB with no actual shard file on disk.
Added reconcileMissingShards() which runs at startup right after
cleanupOrphanedDiskResources(). It reads what shard files actually
exist on disk, cross-checks them against repos the DB thinks are indexed,
and clears indexedAt for anything that is missing a shard. The
existing scheduler then picks those repos up on its next poll and
re-indexes them automatically.
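The cross-check itself can be illustrated with a small pure function. This is only a sketch under stated assumptions: `parseRepoId` is a hypothetical stand-in for getRepoIdFromShardFileName(), the `<repoId>_...` filename pattern is invented for the example and may not match the real shard naming, and `findReposMissingShards` is not a name from the PR.

```typescript
// Hypothetical stand-in for getRepoIdFromShardFileName(). The real shard
// naming scheme is not shown in this PR; a leading "<repoId>_" is assumed.
function parseRepoId(fileName: string): number | undefined {
  const match = /^(\d+)_/.exec(fileName);
  return match ? Number(match[1]) : undefined;
}

// Given the shard filenames found on disk and the IDs of repos the DB marks
// as indexed, return the IDs that have no shard on disk.
function findReposMissingShards(
  shardFileNames: string[],
  indexedRepoIds: number[],
): number[] {
  const idsOnDisk = new Set<number>();
  for (const name of shardFileNames) {
    const id = parseRepoId(name);
    // Skip files that do not look like shards for a known repo ID.
    if (id !== undefined) idsOnDisk.add(id);
  }
  return indexedRepoIds.filter((id) => !idsOnDisk.has(id));
}

// Example with made-up filenames: repo 2 is indexed in the DB but has no shard.
const missing = findReposMissingShards(
  ["1_example.zoekt", "3_other.zoekt", "notes.txt"],
  [1, 2, 3],
);
console.log(missing);
```

In the flow described above, the returned IDs would be the repos whose indexedAt is cleared, leaving the actual re-indexing to the existing scheduler.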
Only one file changed: packages/backend/src/repoIndexManager.ts
Reuses the existing getRepoIdFromShardFileName() utility so the
shard parsing logic stays consistent with the rest of the file.
Logs on restart — fix detects the missing shard and resets the repo:
DB right after restart — indexedAt cleared to null:
DB after the scheduler runs — fresh timestamp, repo re-indexed automatically:
Summary by CodeRabbit