-
Notifications
You must be signed in to change notification settings - Fork 1
Refactor CI: pull chroma from Docker image, track SHA via Heroku conf… #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -3,7 +3,7 @@ name: Weekly Index | |||||||||||||||||||||||||||||
| on: | ||||||||||||||||||||||||||||||
| schedule: | ||||||||||||||||||||||||||||||
| - cron: '0 3 * * 1' # Every Monday at 03:00 UTC | ||||||||||||||||||||||||||||||
| workflow_dispatch: # Manual trigger via GitHub UI | ||||||||||||||||||||||||||||||
| workflow_dispatch: | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| jobs: | ||||||||||||||||||||||||||||||
| index: | ||||||||||||||||||||||||||||||
|
|
@@ -27,26 +27,31 @@ jobs: | |||||||||||||||||||||||||||||
| with: | ||||||||||||||||||||||||||||||
| ref: main | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| # Add the heroku remote (it doesn't exist in a fresh Actions checkout) | ||||||||||||||||||||||||||||||
| - name: Add Heroku remote | ||||||||||||||||||||||||||||||
| - name: Install Heroku CLI | ||||||||||||||||||||||||||||||
| run: curl https://cli-assets.heroku.com/install.sh | sh | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| # Pull chroma/ from the live Docker image so the pipeline runs incrementally | ||||||||||||||||||||||||||||||
| - name: Extract ChromaDB from current Heroku image | ||||||||||||||||||||||||||||||
| env: | ||||||||||||||||||||||||||||||
| HEROKU_API_KEY: ${{ secrets.HEROKU_API_KEY }} | ||||||||||||||||||||||||||||||
| run: | | ||||||||||||||||||||||||||||||
| git remote add heroku "https://heroku:$HEROKU_API_KEY@git.heroku.com/physlibsearch.git" | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| # Restore chroma/ and the SHA tracker from the heroku git remote | ||||||||||||||||||||||||||||||
| - name: Restore ChromaDB and SHA tracker | ||||||||||||||||||||||||||||||
| run: | | ||||||||||||||||||||||||||||||
| git fetch heroku main | ||||||||||||||||||||||||||||||
| git checkout heroku/main -- chroma/ .last-physlib-sha 2>/dev/null \ | ||||||||||||||||||||||||||||||
| || echo "No prior index on heroku/main — starting fresh." | ||||||||||||||||||||||||||||||
| heroku container:login | ||||||||||||||||||||||||||||||
| docker pull registry.heroku.com/physlibsearch/web || echo "No existing image — starting fresh." | ||||||||||||||||||||||||||||||
| CID=$(docker create registry.heroku.com/physlibsearch/web 2>/dev/null) || true | ||||||||||||||||||||||||||||||
|
Comment on lines
+39
to
+40
|
||||||||||||||||||||||||||||||
| docker pull registry.heroku.com/physlibsearch/web || echo "No existing image — starting fresh." | |
| CID=$(docker create registry.heroku.com/physlibsearch/web 2>/dev/null) || true | |
| CID="" | |
| PULL_OUTPUT=$(docker pull registry.heroku.com/physlibsearch/web 2>&1) | |
| PULL_STATUS=$? | |
| if [ "$PULL_STATUS" -eq 0 ]; then | |
| echo "$PULL_OUTPUT" | |
| CID=$(docker create registry.heroku.com/physlibsearch/web 2>/dev/null) || true | |
| elif printf '%s\n' "$PULL_OUTPUT" | grep -qiE 'manifest unknown|not found'; then | |
| echo "No existing image — starting fresh." | |
| else | |
| echo "$PULL_OUTPUT" | |
| exit "$PULL_STATUS" | |
| fi |
Copilot
AI
Apr 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ChromaDB extraction step runs unconditionally, even when PhysLib hasn't changed and the rest of the job is skipped. This adds a potentially large docker pull to every scheduled run; consider moving the PhysLib change check earlier (or gating the extraction step on has_changes) so the workflow can exit quickly when there's nothing to update.
Copilot
AI
Apr 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CURRENT_SHA=$(git ls-remote ... | cut -f1) can silently produce an empty SHA if git ls-remote fails, because without pipefail the pipeline exit status is from cut. That can make the workflow treat a transient network failure as a change and later persist an empty/incorrect LAST_PHYSLIB_SHA. Add set -o pipefail (or avoid the pipeline) and explicitly fail if CURRENT_SHA is empty.
| run: | | |
| CURRENT_SHA=$(git ls-remote "$PHYSLIB_REPO" HEAD | cut -f1) | |
| run: | | |
| set -o pipefail | |
| CURRENT_SHA=$(git ls-remote "$PHYSLIB_REPO" HEAD | cut -f1) | |
| if [ -z "$CURRENT_SHA" ]; then | |
| echo "Failed to resolve current PhysLib HEAD SHA from $PHYSLIB_REPO" >&2 | |
| exit 1 | |
| fi |
Copilot
AI
Apr 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heroku config:get ... 2>/dev/null || echo "" suppresses all errors (including auth/permission issues) and treats them as "no prior SHA", which can trigger an unnecessary full reindex and then overwrite the stored SHA. Consider only defaulting to empty when the config var is genuinely missing, and otherwise failing the job so misconfiguration doesn't silently reset state.
| LAST_SHA=$(heroku config:get LAST_PHYSLIB_SHA --app physlibsearch 2>/dev/null || echo "") | |
| LAST_SHA_OUTPUT=$(heroku config:get LAST_PHYSLIB_SHA --app physlibsearch 2>/dev/null) | |
| HEROKU_CONFIG_GET_STATUS=$? | |
| if [ "$HEROKU_CONFIG_GET_STATUS" -ne 0 ]; then | |
| echo "Failed to read LAST_PHYSLIB_SHA from Heroku app physlibsearch." >&2 | |
| exit "$HEROKU_CONFIG_GET_STATUS" | |
| fi | |
| LAST_SHA=$LAST_SHA_OUTPUT |
Copilot
AI
Apr 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Heroku app name (physlibsearch) is hard-coded in several commands (registry path, config:get, container:*). To reduce duplication and the chance of inconsistent edits later, consider defining it once as an env var (e.g., HEROKU_APP) and reusing it across steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Installing the Heroku CLI via
curl ... | shexecutes a remote script without verification, which is a supply-chain risk. Prefer a pinned install method (e.g., package manager with version pinning, or downloading a specific release artifact and verifying its checksum/signature).