fix: bq depsdev deadlock and a couple of fixes and optimizations #4238
themarolt wants to merge 22 commits into
Conversation
Signed-off-by: Uroš Marolt <uros@marolt.me>
Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.
Please add a Jira issue key to your PR title.
⚠️ Not ready to approve
There are a few concrete issues in the changed code (notably workflow-run detection in scripts/cli and a missing maintenance_work_mem application that the code comments rely on) that should be corrected before merge.
Pull request overview
This PR hardens and speeds up the deps.dev BigQuery → Postgres bootstrap pipeline at very large scale (≈1B package_dependencies rows), addressing deadlock risk, premature “done” marking, missing version_constraint backfills, and improving operational visibility (job-step tracking, monitor UI fixes, deploy script workflow monitoring).
Changes:
- Adds step/meta tracking to ingest jobs (`meta:step`, `meta:ecosystems`) and improves the terminal monitor UX (stuck detection, scroll, column sizing, truncation fixes).
- Improves full-load correctness + performance (parallel index builds, parallel dedup, defers `isFinal: true` until after rebuild/validation; adds `--fill-constraints` backfill mode).
- Adds operational tooling: deploy workflow monitoring in `scripts/cli` and a one-shot `dedup-package-deps` recovery script.
File summaries
| File | Description |
|---|---|
| services/libs/data-access-layer/src/osspckgs/ingestJobs.ts | Expands tableRowCounts typing and adds helper to merge display/meta keys into table_row_counts. |
| services/apps/packages_worker/src/scripts/triggerBootstrap.ts | Adds --fill-constraints flag and passes it into the Temporal bootstrap workflow. |
| services/apps/packages_worker/src/scripts/monitorOsspckgs.ts | Enhances job list display (step labels, stuck highlighting, scrolling, truncation behavior, ecosystem extraction). |
| services/apps/packages_worker/src/scripts/exportToBucket.ts | Adjusts deps export SQL generation to use the new deps SQL API + default ecosystems. |
| services/apps/packages_worker/src/scripts/dedupPackageDeps.ts | New standalone script to dedup cross-chunk duplicates and rebuild the UNIQUE constraint on package_dependencies. |
| services/apps/packages_worker/src/scorecard/workflows/ingestScorecard.ts | Updates merge SQL to use ordered row locking to avoid deadlocks with concurrent repos updates. |
| services/apps/packages_worker/src/deps-dev/workflows/ingestVersions.ts | Defers “done” until after index/constraint rebuild, adds step tracking, and tweaks BQ max-bytes for full loads. |
| services/apps/packages_worker/src/deps-dev/workflows/ingestRepos.ts | Propagates ecosystems context into export activity metadata. |
| services/apps/packages_worker/src/deps-dev/workflows/ingestPackages.ts | Propagates ecosystems context into export activity metadata. |
| services/apps/packages_worker/src/deps-dev/workflows/ingestDependentCounts.ts | Adds step tracking before guard checks. |
| services/apps/packages_worker/src/deps-dev/workflows/ingestDependencies.ts | Adds --fill-constraints mode, extends timeouts, defers finalization for full-load rebuild phases, and adds step tracking. |
| services/apps/packages_worker/src/deps-dev/workflows/ingestAdvisories.ts | Adjusts BQ max-bytes and propagates ecosystems context into export activity metadata. |
| services/apps/packages_worker/src/deps-dev/workflows/bootstrapOsspckgs.ts | Wires fillConstraints option through the top-level bootstrap workflow. |
| services/apps/packages_worker/src/deps-dev/queries/depsSql.ts | Refactors deps SQL generation to support GO/NUGET ecosystem-specific tables + new full/incremental builders. |
| services/apps/packages_worker/src/deps-dev/activities/setJobStep.ts | New activity to write step state into ingest job metadata (meta:step). |
| services/apps/packages_worker/src/deps-dev/activities/manageVersionsIndexes.ts | Builds secondary indexes in parallel (per-connection) and performs dedup + UNIQUE rebuild. |
| services/apps/packages_worker/src/deps-dev/activities/managePackageDepsIndexes.ts | Builds secondary indexes in parallel and runs parallel partitioned dedup to speed up UNIQUE rebuild. |
| services/apps/packages_worker/src/deps-dev/activities/index.ts | Exports the new setJobStep activity. |
| services/apps/packages_worker/src/deps-dev/activities/bqExportToGcs.ts | Adds optional ecosystems metadata to ingest-job table_row_counts for monitor display/diagnostics. |
| services/apps/packages_worker/src/criticality/activities.ts | Marks the ingest job status as merging before ranking merge. |
| services/apps/packages_worker/package.json | Adds dedup-package-deps scripts (and local variant). |
| scripts/cli | Adds deploy workflow-run monitoring via gh run watch and conclusion reporting. |
| docs/adr/README.md | Updates ADR-0003 title/summary to reflect GO + NUGET trigger condition. |
| docs/adr/0003-deps-bq-table-selection.md | Updates ADR-0003 wording to include GO as well as NUGET in the decision/trigger language. |
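
The ordered row locking mentioned for `ingestScorecard.ts` can be illustrated with a small sketch. This is not the PR's actual merge SQL: the `id` column and the `buildOrderedRepoLock` helper are assumptions made for illustration, and only the `repos` table name comes from the summary above. The idea is that when every writer requests its row locks in the same sorted order, two transactions are much less likely to each hold a lock the other is waiting for.

```typescript
// Sketch only: `repos` comes from the PR text; the `id` column and this
// helper name are hypothetical.
export interface LockQuery {
  text: string
  values: [string[]]
}

export function buildOrderedRepoLock(ids: readonly string[]): LockQuery {
  // Dedupe so each row is requested once.
  const unique = Array.from(new Set(ids))
  return {
    // ORDER BY makes concurrent transactions request row locks in the same sequence.
    text: 'SELECT id FROM repos WHERE id = ANY($1) ORDER BY id FOR UPDATE',
    values: [unique],
  }
}
```

The resulting query would be run at the start of the transaction, before the update itself, so the rows are already locked in a consistent order when the write happens.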
Copilot's findings
- Files reviewed: 24/24 changed files
- Comments generated: 4
⚠️ Not ready to approve
The new dedup-package-deps recovery script can hang or silently no-op with invalid/missing --concurrency, and there are correctness/consistency fixes needed in the new DAL JSONB merge helper and the versions-index rebuild performance changes.
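
One way to address the `--concurrency` concern is to validate the flag before any work starts. The sketch below is a hypothetical helper, not code from `dedupPackageDeps.ts`; the default of 8 and the upper bound of 32 are assumptions chosen for illustration.

```typescript
// Hypothetical helper; the fallback (8) and max (32) values are assumptions.
export function parseConcurrency(raw: string | undefined, fallback = 8, max = 32): number {
  // Missing or empty flag: use the default instead of NaN or 0 workers.
  if (raw === undefined || raw.trim() === '') return fallback
  const n = Number(raw)
  // Reject NaN, fractions, zero, negatives and oversized values up front,
  // so the script fails fast rather than hanging or doing nothing.
  if (!Number.isInteger(n) || n < 1 || n > max) {
    throw new Error(`--concurrency must be an integer between 1 and ${max}, got "${raw}"`)
  }
  return n
}
```

Failing loudly on bad input keeps a recovery script from reporting success after spawning zero workers.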
Copilot's findings
- Files reviewed: 24/24 changed files
- Comments generated: 3
⚠️ Human review recommended
It changes core billion-row ingest/index/constraint rebuild behavior and includes at least one confirmed SQL/DB execution bug (transaction-scoped SET LOCAL) that should be fixed and revalidated before approval.
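
For context on the `SET LOCAL` issue: in Postgres, `SET LOCAL` only lasts until the end of the current transaction, and outside a transaction block it emits a warning and has no effect. The sketch below shows the two statement sequences that do apply a setting. The helper name, the setting values, and the example statements are all assumptions for illustration, not the PR's code.

```typescript
// Hypothetical helper; names and values are illustrative and not from the PR.
const SETTING_NAME = /^[a-z_][a-z0-9_.]*$/

export function planWithSetting(
  setting: string,
  value: string,
  statement: string,
  transactional: boolean,
): string[] {
  if (!SETTING_NAME.test(setting)) {
    throw new Error(`invalid setting name: ${setting}`)
  }
  // Quote the value as a SQL string literal.
  const literal = `'${value.replace(/'/g, "''")}'`
  if (transactional) {
    // SET LOCAL lasts until COMMIT/ROLLBACK, so it needs an explicit transaction.
    return ['BEGIN', `SET LOCAL ${setting} = ${literal}`, statement, 'COMMIT']
  }
  // For statements that cannot run inside a transaction block
  // (e.g. CREATE INDEX CONCURRENTLY): session-level SET, then RESET.
  return [`SET ${setting} = ${literal}`, statement, `RESET ${setting}`]
}
```

Either sequence has to run on a single connection. With node-postgres, `pool.query` may use a different client for each call, so the statements would need a client checked out via `pool.connect()`, with the `RESET` (or `ROLLBACK`) issued in a `finally` block if the main statement fails.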
Copilot's findings
- Files reviewed: 24/24 changed files
- Comments generated: 2
⚠️ Human review recommended
It changes multiple high-risk billion-row ingestion and Postgres maintenance paths and still has open correctness/operability concerns (job metadata consistency, fill-constraints update semantics, and memory-pressure tuning for parallel maintenance).
Copilot's findings
- Files reviewed: 24/24 changed files
- Comments generated: 6
⚠️ Human review recommended
It changes high-impact ingestion/DDL behavior at 1B-row scale and includes at least one correctness guard that should be addressed before merge.
Copilot's findings
- Files reviewed: 24/24 changed files
- Comments generated: 1
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 26f0680.
⚠️ Human review recommended
Changes alter production-scale ingestion/maintenance behavior (parallel index/dedup, job finalization semantics, and new fill mode) and include operational-risk knobs that should be validated and tuned via follow-up fixes.
Copilot's findings
- Files reviewed: 24/24 changed files
- Comments generated: 5

Summary
Fixes several issues discovered during the first full bootstrap of `package_dependencies` at scale (~1B rows): a deadlock risk in index management, premature job completion before indexes finished rebuilding, monitor display bugs, and missing `version_constraint` data for ecosystems loaded via the cheaper BQ table (option B). Also adds workflow monitoring to the deploy scripts and a one-shot dedup script for manual recovery.
Note
High Risk
Changes affect production-scale package dependency loads, constraint rebuilds, and concurrent DB writes on `package_dependencies`/`repos`; mis-timed job completion or dedup could leave tables without constraints or with bad data until recovery scripts run.

Overview
Hardens ~1B-row `package_dependencies` full bootstrap: parallel secondary index builds and 8-way partition dedup (with higher `work_mem`), 24h rebuild activity timeouts, and jobs that stay non-terminal until index/constraint rebuild finishes, not on the last merge chunk. Adds `--fill-constraints` / `MERGE_SQL_FILL_CONSTRAINTS` to backfill `version_constraint` after Option B loads, plus `dedup-package-deps` for manual dedup + UNIQUE rebuild.

Deps BQ SQL now unions ecosystem-specific sources: graph/edges tables for NPM/MAVEN/PYPI/CARGO and `GoRequirements*`/`NuGetRequirements*` for GO/NUGET; ecosystem arrays flow through exports as `meta:ecosystems`. `setJobStep` + `mergeJobTableRowCounts` expose phase labels in `monitor:osspckgs` (stuck-job detection, scroll). Scorecard repo updates use ordered `FOR UPDATE` to avoid deadlocks with the GitHub enricher. Deploy staging/production scripts run `gh run watch` after the workflow trigger. ADR-0003 documents GO/NUGET absent from graph tables.

Reviewed by Cursor Bugbot for commit 26f0680. Bugbot is set up for automated code reviews on this repo.
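
As a rough illustration of the 8-way partition dedup idea: each worker can be handed a disjoint slice of the table, as long as all duplicates of a given row fall into the same slice. How the PR actually partitions is not stated here, so the sketch below is an assumption. It uses a hypothetical non-negative integer column that is part of the UNIQUE key and splits on `mod(column, N)`.

```typescript
// Assumption: `keyColumn` is a non-negative integer column that is part of the
// UNIQUE key, so all duplicates of a row share one partition. The column name
// used in the usage note is hypothetical.
export function partitionPredicates(keyColumn: string, partitions: number): string[] {
  if (!/^[a-z_][a-z0-9_]*$/.test(keyColumn)) {
    throw new Error(`invalid column name: ${keyColumn}`)
  }
  if (!Number.isInteger(partitions) || partitions < 1) {
    throw new Error(`partitions must be a positive integer, got ${partitions}`)
  }
  const predicates: string[] = []
  for (let i = 0; i < partitions; i++) {
    // Each predicate selects a disjoint slice; together they cover every row.
    predicates.push(`mod(${keyColumn}, ${partitions}) = ${i}`)
  }
  return predicates
}
```

For example, `partitionPredicates('version_id', 8)` yields eight `WHERE` fragments, one per worker connection, each of which could be appended to the dedup `DELETE` for that worker.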