fix: reconcile missing zoekt shards on startup - Fixes #1210 #1246

Open
Shashank200345 wants to merge 2 commits into sourcebot-dev:main from Shashank200345:fix/rebuild-missing-zoekt-shards

Conversation

Shashank200345 commented May 29, 2026

  1. What was the problem

When the zoekt index directory gets wiped — say during a pod restart
with ephemeral storage — the database still holds onto the old
indexedAt timestamps. The scheduler sees those timestamps and assumes
everything is fine, so it never triggers a re-index. Search returns
nothing but the UI still shows the repo as Completed. No error, no
warning, just silently broken.

  2. Why it happened

cleanupOrphanedDiskResources() already handles one side of this —
it cleans up shard files on disk that have no matching DB record.
But nobody was checking the other direction: a repo marked as indexed
in the DB with no actual shard file on disk.
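
The two directions of drift described above can be sketched as a pair of set differences. This is a minimal illustration, not the code from the PR: `findDrift`, the `Drift` type, and the use of plain numeric repo IDs are all assumptions made for the example.

```typescript
// Hypothetical model: repo IDs recovered from shard files on disk, all repo
// IDs known to the DB, and the subset of those the DB marks as indexed.
interface Drift {
    orphanedOnDisk: number[]; // shard on disk, no DB record (the direction already handled)
    missingOnDisk: number[];  // DB says indexed, no shard on disk (the unchecked direction)
}

function findDrift(
    diskRepoIds: number[],
    dbRepoIds: number[],
    dbIndexedRepoIds: number[],
): Drift {
    // A repo may have several shard files, so deduplicate the disk side.
    const onDisk = new Set(diskRepoIds);
    const inDb = new Set(dbRepoIds);
    return {
        orphanedOnDisk: Array.from(onDisk).filter((id) => !inDb.has(id)),
        missingOnDisk: dbIndexedRepoIds.filter((id) => !onDisk.has(id)),
    };
}
```

Before this change, only the first list was acted on at startup; the second list is what the PR sets out to cover.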

  3. What I changed

Added reconcileMissingShards() which runs at startup right after
cleanupOrphanedDiskResources(). It reads what shard files actually
exist on disk, cross-checks against repos the DB thinks are indexed,
and clears indexedAt for anything that is missing a shard. The
existing scheduler then picks those repos up on its next poll and
re-indexes them automatically.

Only one file changed: packages/backend/src/repoIndexManager.ts

Reuses the existing getRepoIdFromShardFileName() utility so the
shard parsing logic stays consistent with the rest of the file.
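
For readers without the diff at hand, the core of the reconciliation step might look roughly like the sketch below. It is written as a pure function over in-memory data so it can run on its own. `parseRepoIdFromShardFileName` is a hypothetical stand-in for getRepoIdFromShardFileName(), the `<orgId>_<repoId>_v<version>.<shardNumber>.zoekt` naming scheme is an assumption, and `RepoRow` is an invented type; the real method would read INDEX_CACHE_DIR and update the DB instead.

```typescript
// Invented row shape for illustration; the real DB model has more fields.
interface RepoRow {
    id: number;
    indexedAt: Date | null;
}

// Hypothetical stand-in for getRepoIdFromShardFileName(). The file naming
// scheme "<orgId>_<repoId>_v<version>.<shardNumber>.zoekt" is an assumption.
function parseRepoIdFromShardFileName(fileName: string): number | undefined {
    const match = /^\d+_(\d+)_v\d+\.\d+\.zoekt$/.exec(fileName);
    return match ? Number(match[1]) : undefined;
}

// Returns the IDs of repos the DB marks as indexed but which have no shard
// file on disk. The caller would then clear indexedAt for these IDs.
function findReposMissingShards(shardFileNames: string[], repos: RepoRow[]): number[] {
    const repoIdsOnDisk = new Set<number>();
    for (const name of shardFileNames) {
        const id = parseRepoIdFromShardFileName(name);
        if (id !== undefined) {
            repoIdsOnDisk.add(id);
        }
    }
    return repos
        .filter((repo) => repo.indexedAt !== null && !repoIdsOnDisk.has(repo.id))
        .map((repo) => repo.id);
}
```

Repos whose indexedAt is already null are skipped, since the scheduler is described as picking those up anyway.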

  4. Proof

Logs on restart — fix detects the missing shard and resets the repo:

[Screenshot 2026-05-30 002144]

DB right after restart — indexedAt cleared to null:

[Screenshot 2026-05-30 005447]

DB after the scheduler runs — fresh timestamp, repo re-indexed automatically:

[Screenshot 2026-05-30 003659]

Summary by CodeRabbit

  • Bug Fixes
    • Fixed an issue where repositories marked as indexed could be skipped when on-disk index data was missing. The scheduler now detects missing index files, resets the indexed state for affected repositories, and ensures they are re-indexed so search and retrieval remain accurate.

coderabbitai Bot commented May 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 430eeb7e-dbe0-4582-a011-274d48a8d0b3

📥 Commits

Reviewing files that changed from the base of the PR and between cbf9b3c and e8f6b01.

📒 Files selected for processing (1)
  • packages/backend/src/repoIndexManager.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/backend/src/repoIndexManager.ts

Walkthrough

RepoIndexManager.startScheduler() now calls a new reconcileMissingShards() during startup. It reads on-disk shard filenames, queries the DB for all repos marked indexed, and resets indexedAt to null for repos missing shards to avoid stale DB state skipping re-indexing.

Changes

Missing shard reconciliation

| Layer / File(s) | Summary |
|---|---|
| Missing shard reconciliation implementation (packages/backend/src/repoIndexManager.ts) | startScheduler() calls reconcileMissingShards() after disk cleanup. The new private method scans INDEX_CACHE_DIR for existing shard files, queries the DB for repos with indexedAt set, and for each DB-indexed repo missing its on-disk shard, resets indexedAt to null with warning and completion logging. Minor formatting added between cleanup sections. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • sourcebot-dev/sourcebot#305: Aligns with changes to indexedAt/syncedAt semantics around failed indexing and sync attempts.
  • sourcebot-dev/sourcebot#973: Both PRs modify startScheduler() during startup disk/DB consistency, with this PR adding reconcileMissingShards() immediately after the cleanupOrphanedDiskResources() introduced in that PR.

Suggested reviewers

  • msukkari
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: adding logic to reconcile missing zoekt shards during startup. It is concise, actionable, and directly related to the core fix in the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/backend/src/repoIndexManager.ts`:
- Around line 784-793: reconcileMissingShards currently unconditionally clears
indexedAt for DB entries missing shard files, which loops when a repo
legitimately produced zero shards; modify the logic so you only clear indexedAt
when the missing-shard condition is truly stale by (a) adding a durable
marker/enum on the repo (e.g., lastIndexStatus or zeroShardFlag) that
indexGitRepository sets to "success_no_shard" when it completes with zero
emitted shards, or (b) requiring that the last successful indexedAt is older
than a configurable TTL and there is no recorded successful index result; update
reconcileMissingShards to check that marker/TTL before nulling indexedAt (and
update indexGitRepository to set the marker when it finishes with zero shards).
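
Option (a) from the comment above could be sketched as follows. This is an illustration only: apart from "success_no_shard", which the comment names, the status values, the `lastIndexStatus` field, and the `IndexedRepo` shape are assumptions, not part of the existing schema.

```typescript
// Hypothetical status values; only "success_no_shard" is named in the review.
type LastIndexStatus = "success" | "success_no_shard" | "failed";

// Invented row shape; lastIndexStatus is an assumed new column that the
// indexing job would set when it finishes.
interface IndexedRepo {
    id: number;
    indexedAt: Date | null;
    lastIndexStatus: LastIndexStatus | null;
}

// Decide whether reconciliation should clear indexedAt for this repo.
function shouldResetIndexedAt(repo: IndexedRepo, hasShardOnDisk: boolean): boolean {
    if (repo.indexedAt === null) {
        return false; // nothing to reset; the repo is not marked as indexed
    }
    if (hasShardOnDisk) {
        return false; // DB and disk agree
    }
    // A repo that legitimately produced zero shards is not stale, so leave
    // it alone to avoid a reset/re-index loop on every startup.
    return repo.lastIndexStatus !== "success_no_shard";
}
```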

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 83a82f54-566f-4f76-bd71-59bf60874d00

📥 Commits

Reviewing files that changed from the base of the PR and between 96af7e8 and cbf9b3c.

📒 Files selected for processing (1)
  • packages/backend/src/repoIndexManager.ts

Comment thread packages/backend/src/repoIndexManager.ts
Shashank200345 (Author) commented

Hi, I've addressed the CodeRabbit feedback by adding a TTL check
using reindexIntervalMs to avoid resetting repos that legitimately
produce zero shards. Happy to make any other changes if needed!
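
The updated diff is not shown in this conversation, so the following is only one plausible reading of such a TTL check, modelled on option (b) of the review comment: a repo with no shard on disk is reset only when its indexedAt is older than reindexIntervalMs. The function names and the explicit `now` parameter are assumptions made for the sketch.

```typescript
// True when more than reindexIntervalMs has elapsed since indexedAt.
function isOlderThanTtl(indexedAt: Date, now: Date, reindexIntervalMs: number): boolean {
    return now.getTime() - indexedAt.getTime() > reindexIntervalMs;
}

// Hypothetical guard: reset only repos that are marked indexed, have no
// shard on disk, and whose last index is older than the TTL.
function shouldResetWithTtl(
    indexedAt: Date | null,
    hasShardOnDisk: boolean,
    now: Date,
    reindexIntervalMs: number,
): boolean {
    if (indexedAt === null || hasShardOnDisk) {
        return false;
    }
    return isOlderThanTtl(indexedAt, now, reindexIntervalMs);
}
```

Passing `now` in explicitly, rather than calling `new Date()` inside, keeps the check deterministic and easy to unit test.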
