fix(memos-local-plugin): two-phase migration to prevent crash-loop on large databases#1789
Open
chiefmojo wants to merge 1 commit into
… large databases
Migration 007 (namespace-visibility) runs UPDATE ... SET share_scope
and CREATE INDEX on the traces table inside a single db.tx(). On
databases larger than ~500MB, this exceeds the host gateway kill
timeout, SQLite rolls back the entire transaction (including the
schema_migrations INSERT), and the bridge restarts into the same
hang forever.
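
The failure mode above can be reproduced in miniature. The following is a minimal sketch using Python's built-in sqlite3 module rather than the plugin's own db.tx() helper, whose code is not shown here. The table layout, the function name single_transaction_migration, and the simulate_kill flag are assumptions made for illustration; the raised exception stands in for the host gateway killing the process before COMMIT.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so
# BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE schema_migrations (version INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE traces (id INTEGER PRIMARY KEY, share_scope TEXT)")
conn.executemany("INSERT INTO traces (id) VALUES (?)", [(i,) for i in range(1000)])


def single_transaction_migration(conn, simulate_kill):
    """Hypothetical stand-in for a migration that does everything in one transaction."""
    conn.execute("BEGIN")
    try:
        # The migration record and the heavy backfill share one transaction.
        conn.execute("INSERT INTO schema_migrations (version) VALUES (7)")
        conn.execute("UPDATE traces SET share_scope = 'private'")
        if simulate_kill:
            # Stands in for the process being killed before COMMIT.
            raise RuntimeError("killed before COMMIT")
        conn.execute("COMMIT")
    except RuntimeError:
        # SQLite discards everything done since BEGIN.
        conn.execute("ROLLBACK")


single_transaction_migration(conn, simulate_kill=True)

recorded = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]
backfilled = conn.execute(
    "SELECT COUNT(*) FROM traces WHERE share_scope IS NOT NULL"
).fetchone()[0]
```

After the simulated kill, both the version record and the backfill are gone, so the next start would attempt the same work from scratch.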
This splits the migration into two phases:
Phase 1 (inside transaction, ms): ADD COLUMN only on 12 namespace
tables plus DROP INDEX uq_skills_name. The schema_migrations
record commits here.
Phase 2 (after migration loop, outside any transaction): Batched
UPDATE in 2,000-row chunks (each its own implicit transaction)
for share_scope backfill, then CREATE INDEX IF NOT EXISTS for
all 18 owner/share indexes. Phase 2 also calls
ensureNamespaceColumns unconditionally so new tables added to
the namespace list get their columns on every boot.
Restart-safe: if the bridge is killed during Phase 2, the v7
schema_migrations record survives (Phase 1 committed). Next boot
skips Phase 1 entirely and resumes Phase 2 where it left off.
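
The two phases and the restart behaviour can be sketched as follows. This is an illustration in Python's sqlite3 module, not the plugin's implementation: the function names (phase1, backfill_batch, phase2), the index name idx_traces_share, the share_scope IS NULL predicate, and the max_batches parameter used to simulate a kill are all assumptions. Only the 2,000-row chunk size, the 'private' value, and the overall ordering come from the description above.

```python
import sqlite3

BATCH_SIZE = 2000  # chunk size quoted in the description


def phase1(conn):
    """Fast schema-only step; the version record commits in the same transaction."""
    done = conn.execute(
        "SELECT 1 FROM schema_migrations WHERE version = 7"
    ).fetchone()
    if done:
        return False  # already recorded: skipped on later boots
    conn.execute("BEGIN")
    conn.execute("ALTER TABLE traces ADD COLUMN share_scope TEXT")
    conn.execute("INSERT INTO schema_migrations (version) VALUES (7)")
    conn.execute("COMMIT")
    return True


def backfill_batch(conn):
    """Backfill at most BATCH_SIZE rows; in autocommit mode each call commits alone."""
    cur = conn.execute(
        "UPDATE traces SET share_scope = 'private' "
        "WHERE id IN (SELECT id FROM traces WHERE share_scope IS NULL LIMIT ?)",
        (BATCH_SIZE,),
    )
    return cur.rowcount


def phase2(conn, max_batches=None):
    """Idempotent backfill plus index creation; returns True once finished."""
    batches = 0
    while max_batches is None or batches < max_batches:
        if backfill_batch(conn) == 0:
            # Nothing left to backfill: build the index (a no-op if it exists).
            conn.execute(
                "CREATE INDEX IF NOT EXISTS idx_traces_share ON traces (share_scope)"
            )
            return True
        batches += 1
    return False  # stopped early (simulated kill); rows done so far stay committed


# Autocommit connection: every statement outside BEGIN/COMMIT is its own transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE schema_migrations (version INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE traces (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO traces (id) VALUES (?)", [(i,) for i in range(5000)])

first_boot_ran_phase1 = phase1(conn)
first_boot_finished = phase2(conn, max_batches=2)  # "killed" after two batches
pending_after_kill = conn.execute(
    "SELECT COUNT(*) FROM traces WHERE share_scope IS NULL"
).fetchone()[0]

second_boot_ran_phase1 = phase1(conn)  # version 7 already recorded
second_boot_finished = phase2(conn)    # picks up the remaining rows
pending_after_resume = conn.execute(
    "SELECT COUNT(*) FROM traces WHERE share_scope IS NULL"
).fetchone()[0]
```

Because the backfill selects only rows that still lack a value, a second run needs no bookmark to resume: it simply finds fewer rows to update.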
Closes MemTensor#1787
This was referenced May 26, 2026
Problem
Migration 007 (namespace-visibility) in v2.0.2–v2.0.5 runs UPDATE … SET share_scope='private' plus CREATE INDEX on the traces table inside a single db.tx(). On databases larger than ~500MB, this exceeds the host gateway's kill timeout. SQLite rolls back the entire transaction (including the schema_migrations INSERT), so migration 007 is never recorded and the bridge restarts into the same hang forever.

Small databases (~43MB) complete within the timeout, which is why this only manifests on larger installs. Tested against a 687MB database with ~98,000 traces.
Fix
Two-phase migration:
Phase 1 (inside transaction, ms): ADD COLUMN only on all 12 namespace tables, plus DROP INDEX uq_skills_name. The schema_migrations record for v7 commits here.
Phase 2 (after migration loop, outside any transaction): Batched UPDATE in 2,000-row chunks (each its own implicit transaction) for share_scope backfill, then CREATE INDEX IF NOT EXISTS for all 18 owner/share indexes. ensureNamespaceColumns is called unconditionally on every boot so new tables in the namespace list get their columns.

Restart safety: If the bridge is killed during Phase 2, the v7 schema_migrations record survives (Phase 1 committed). Next boot skips Phase 1 entirely and resumes Phase 2 where it left off. The crash-loop is broken.
Verification
pipeline.ready.
Related
Closes #1787