chore: add SurfSense operations runbook and ops scripts#1387
chore: add SurfSense operations runbook and ops scripts#1387thadavinciminds wants to merge 2 commits into
Conversation
|
Someone is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughAdds a day-2 operations package: Docker Compose deployment, health-check script, DB backup/restore scripts with safety gates, an update automation script using a health gate, an OPERATIONS runbook, and a small Composio SDK call change. ChangesDay-2 Operations Automation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Updated PR description with a clean summary + validation checklist.\n\nCurrent CI status from my side:\n- Vercel: failing due to project authorization requirement (external/integration-side)\n- recurseml/analysis: failed with analysis error (bot-side)\n- CodeRabbit: pending\n\nGitHub Actions runs for this PR are in |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
surfsense/scripts/update-surfsense.sh (1)
22-45:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHealth-check failure path should rollback or print deterministic rollback steps.
With
set -e, a failing health check exits immediately, so completion/rollback guidance isn’t shown and.envstays pinned to the new version.Suggested patch
-cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak" +BACKUP_FILE=".env.pre-update.$(date +%Y%m%d-%H%M%S).bak" +cp .env "$BACKUP_FILE" @@ echo "Running health check..." -bash "$ROOT_DIR/scripts/health-check.sh" +bash "$ROOT_DIR/scripts/health-check.sh" || { + echo "Health check failed. Rolling back .env from $BACKUP_FILE" >&2 + cp "$BACKUP_FILE" .env + docker compose up -d + exit 2 +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@surfsense/scripts/update-surfsense.sh` around lines 22 - 45, The script currently exits on health-check failure and leaves .env pinned to the new version; add a failure handler (trap on ERR or an explicit conditional after health-check) that restores the pre-update .env backup (the file created by cp .env ".env.pre-update.$(date... ).bak" or, if easier, rewrites SURFSENSE_VERSION=$CURRENT_VERSION into .env) and then runs docker compose up -d to rollback services, and always print deterministic rollback instructions that reference CURRENT_VERSION, TARGET_VERSION, and the backup filename pattern; update the block around the health check and the end-of-script messages to ensure the restore/rollback is performed or at minimum printed on failure (use ROOT_DIR/scripts/health-check.sh, CURRENT_VERSION, and the .env.pre-update backup name to locate the code to change).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@surfsense/OPERATIONS.md`:
- Around line 15-25: Remove the host-specific "/home/vitrus/..." path and fix
capitalization and consistency in the Docker commands: replace the absolute path
with a generic instruction like "cd to the repository root", use lowercase
"docker compose" for all commands, and correct the stop and restart examples to
consistent forms (e.g., "docker compose down" and "docker compose restart
backend frontend zero-cache gateway") so the OPERATIONS.md commands are portable
and valid across shells.
In `@surfsense/scripts/backup-postgres.sh`:
- Around line 13-23: The backup file currently inherits the process umask and
may be too permissive; update backup-postgres.sh to ensure OUT_FILE is created
with restrictive permissions by setting a tight umask (e.g., umask 077) before
running the pg_dump pipeline or by explicitly applying chmod 600 to "$OUT_FILE"
immediately after creation (the pipeline that writes to OUT_FILE is the line
using docker compose exec -T db pg_dump ... | gzip -9 > "$OUT_FILE"); ensure any
mkdir -p of BACKUP_DIR preserves safe directory perms as well (or set umask
before mkdir) so BACKUP_DIR and OUT_FILE are both restricted.
In `@surfsense/scripts/health-check.sh`:
- Around line 21-24: The script currently only checks for the docker binary;
change the preflight to verify the compose command/plugin exists by running a
probe for the compose subcommand (e.g. test `docker compose version` or `docker
compose ps` success) and if that fails fallback to checking for the legacy
`docker-compose` binary; update the error message to reflect which compose
option is missing and ensure the early exit still happens when neither `docker
compose` nor `docker-compose` is available so the later `docker compose ps` call
won't fail unexpectedly.
In `@surfsense/scripts/restore-postgres.sh`:
- Around line 26-30: Enable strict failure handling and validate the gzip before
dropping the schema: add set -euo pipefail at the top, run gunzip -t
"$BACKUP_FILE" and exit non-zero if it fails, then run the schema recreation and
restore using psql with ON_ERROR_STOP enabled (e.g., docker compose exec -T db
psql -U "$DB_USER" -d "$DB_NAME" --set=ON_ERROR_STOP=on -c "DROP SCHEMA public
CASCADE; CREATE SCHEMA public;") and similarly pipe the backup into docker
compose exec -T db psql --set=ON_ERROR_STOP=on so any SQL error causes a
non-zero exit; ensure you check and propagate the exit codes after each command.
In `@surfsense/scripts/update-surfsense.sh`:
- Around line 13-34: The script writes TARGET_VERSION verbatim into .env which
allows newlines or malicious characters to corrupt the file; update the logic
around TARGET_VERSION and the inline Python (referencing TARGET_VERSION and the
inline python block) to first sanitize and validate the value (trim
whitespace/newlines, reject values containing newline, =, or shell/escape
characters) and only accept a strict pattern such as semantic versioning (e.g.,
^v?\d+\.\d+\.\d+$) or the literal "latest"; if validation fails, print an error
and exit non‑zero instead of writing; implement the checks either in shell
before invoking the python block or inside the python script (strip and validate
sys.argv[1]) and only perform the .env substitution when validation passes.
---
Outside diff comments:
In `@surfsense/scripts/update-surfsense.sh`:
- Around line 22-45: The script currently exits on health-check failure and
leaves .env pinned to the new version; add a failure handler (trap on ERR or an
explicit conditional after health-check) that restores the pre-update .env
backup (the file created by cp .env ".env.pre-update.$(date... ).bak" or, if
easier, rewrites SURFSENSE_VERSION=$CURRENT_VERSION into .env) and then runs
docker compose up -d to rollback services, and always print deterministic
rollback instructions that reference CURRENT_VERSION, TARGET_VERSION, and the
backup filename pattern; update the block around the health check and the
end-of-script messages to ensure the restore/rollback is performed or at minimum
printed on failure (use ROOT_DIR/scripts/health-check.sh, CURRENT_VERSION, and
the .env.pre-update backup name to locate the code to change).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e287b101-2049-4568-a4d9-30badc160aa4
📒 Files selected for processing (5)
surfsense/OPERATIONS.mdsurfsense/scripts/backup-postgres.shsurfsense/scripts/health-check.shsurfsense/scripts/restore-postgres.shsurfsense/scripts/update-surfsense.sh
| cd /home/vitrus/work/SurfSense/surfsense | ||
|
|
||
| # Start | ||
| docker compose up -d | ||
|
|
||
| # Stop | ||
| Docker compose down | ||
|
|
||
| # Restart only app-tier services (keeps DB/Redis running) | ||
| Docker compose restart backend frontend zero-cache gateway | ||
| ``` |
There was a problem hiding this comment.
Fix non-portable and invalid command examples.
/home/vitrus/... is host-specific, and Docker compose won’t run on Linux shells (docker is lowercase).
Suggested patch
-cd /home/vitrus/work/SurfSense/surfsense
+cd /path/to/SurfSense/surfsense
@@
-Docker compose down
+docker compose down
@@
-Docker compose restart backend frontend zero-cache gateway
+docker compose restart backend frontend zero-cache gateway📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cd /home/vitrus/work/SurfSense/surfsense | |
| # Start | |
| docker compose up -d | |
| # Stop | |
| Docker compose down | |
| # Restart only app-tier services (keeps DB/Redis running) | |
| Docker compose restart backend frontend zero-cache gateway | |
| ``` | |
| cd /path/to/SurfSense/surfsense | |
| # Start | |
| docker compose up -d | |
| # Stop | |
| docker compose down | |
| # Restart only app-tier services (keeps DB/Redis running) | |
| docker compose restart backend frontend zero-cache gateway |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/OPERATIONS.md` around lines 15 - 25, Remove the host-specific
"/home/vitrus/..." path and fix capitalization and consistency in the Docker
commands: replace the absolute path with a generic instruction like "cd to the
repository root", use lowercase "docker compose" for all commands, and correct
the stop and restart examples to consistent forms (e.g., "docker compose down"
and "docker compose restart backend frontend zero-cache gateway") so the
OPERATIONS.md commands are portable and valid across shells.
| BACKUP_DIR="${1:-$ROOT_DIR/backups}" | ||
| mkdir -p "$BACKUP_DIR" | ||
|
|
||
| STAMP="$(date +%Y%m%d-%H%M%S)" | ||
| OUT_FILE="$BACKUP_DIR/surfsense-db-$STAMP.sql.gz" | ||
|
|
||
| echo "Creating Postgres backup: $OUT_FILE" | ||
| docker compose exec -T db pg_dump -U "$DB_USER" "$DB_NAME" | gzip -9 > "$OUT_FILE" | ||
|
|
||
| echo "Backup complete: $OUT_FILE" | ||
| ls -lh "$OUT_FILE" |
There was a problem hiding this comment.
Restrict backup file permissions by default.
DB dumps can include sensitive data; current file mode depends on process umask and may be too permissive.
Suggested patch
DB_USER="${DB_USER:-surfsense}"
DB_NAME="${DB_NAME:-surfsense}"
BACKUP_DIR="${1:-$ROOT_DIR/backups}"
+umask 077
mkdir -p "$BACKUP_DIR"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| BACKUP_DIR="${1:-$ROOT_DIR/backups}" | |
| mkdir -p "$BACKUP_DIR" | |
| STAMP="$(date +%Y%m%d-%H%M%S)" | |
| OUT_FILE="$BACKUP_DIR/surfsense-db-$STAMP.sql.gz" | |
| echo "Creating Postgres backup: $OUT_FILE" | |
| docker compose exec -T db pg_dump -U "$DB_USER" "$DB_NAME" | gzip -9 > "$OUT_FILE" | |
| echo "Backup complete: $OUT_FILE" | |
| ls -lh "$OUT_FILE" | |
| BACKUP_DIR="${1:-$ROOT_DIR/backups}" | |
| umask 077 | |
| mkdir -p "$BACKUP_DIR" | |
| STAMP="$(date +%Y%m%d-%H%M%S)" | |
| OUT_FILE="$BACKUP_DIR/surfsense-db-$STAMP.sql.gz" | |
| echo "Creating Postgres backup: $OUT_FILE" | |
| docker compose exec -T db pg_dump -U "$DB_USER" "$DB_NAME" | gzip -9 > "$OUT_FILE" | |
| echo "Backup complete: $OUT_FILE" | |
| ls -lh "$OUT_FILE" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/scripts/backup-postgres.sh` around lines 13 - 23, The backup file
currently inherits the process umask and may be too permissive; update
backup-postgres.sh to ensure OUT_FILE is created with restrictive permissions by
setting a tight umask (e.g., umask 077) before running the pg_dump pipeline or
by explicitly applying chmod 600 to "$OUT_FILE" immediately after creation (the
pipeline that writes to OUT_FILE is the line using docker compose exec -T db
pg_dump ... | gzip -9 > "$OUT_FILE"); ensure any mkdir -p of BACKUP_DIR
preserves safe directory perms as well (or set umask before mkdir) so BACKUP_DIR
and OUT_FILE are both restricted.
| command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; } | ||
|
|
||
| echo "\n=== Docker Compose Services ===" | ||
| docker compose ps |
There was a problem hiding this comment.
Preflight should verify docker compose availability.
The script checks docker but not the compose subcommand/plugin, so it can fail later with a less clear error path.
Suggested patch
command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; }
+docker compose version >/dev/null 2>&1 || { echo "[ERROR] docker compose not available" >&2; exit 1; }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; } | |
| echo "\n=== Docker Compose Services ===" | |
| docker compose ps | |
| command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; } | |
| docker compose version >/dev/null 2>&1 || { echo "[ERROR] docker compose not available" >&2; exit 1; } | |
| echo "\n=== Docker Compose Services ===" | |
| docker compose ps |
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 23-23: echo may not expand escape sequences. Use printf.
(SC2028)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/scripts/health-check.sh` around lines 21 - 24, The script currently
only checks for the docker binary; change the preflight to verify the compose
command/plugin exists by running a probe for the compose subcommand (e.g. test
`docker compose version` or `docker compose ps` success) and if that fails
fallback to checking for the legacy `docker-compose` binary; update the error
message to reflect which compose option is missing and ensure the early exit
still happens when neither `docker compose` nor `docker-compose` is available so
the later `docker compose ps` call won't fail unexpectedly.
| echo "Recreating public schema..." | ||
| docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;" | ||
|
|
||
| echo "Restoring from $BACKUP_FILE ..." | ||
| gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For PostgreSQL psql CLI: what is the default error behavior in non-interactive mode, and how does -v ON_ERROR_STOP=1 affect exit codes?
💡 Result:
Default error behavior (non-interactive mode): - By default, psql continues processing after an error in non-interactive scripts (i.e., it does not stop immediately). [1] - psql’s exit status categories include: exit code 0 on normal completion; 1 for a fatal error of its own (e.g., out of memory, file not found); 2 if the connection to the server went bad and the session was not interactive; and 3 if an error occurred in a script and ON_ERROR_STOP was set. [1] How -v ON_ERROR_STOP=1 affects exit codes: - -v ON_ERROR_STOP=1 (or on) makes psql stop immediately on the first error while processing scripts in non-interactive mode. [2][1] - When this happens in a non-interactive script run via -f (or other script execution), psql exits with error code 3 (to distinguish this from fatal errors, which use code 1). [2][1]
Citations:
- 1: https://www.postgresql.org/docs/current/app-psql.html
- 2: http://www.postgresql.org/docs/9.5/app-psql.html
🏁 Script executed:
find . -name "restore-postgres.sh" -type f 2>/dev/nullRepository: MODSetter/SurfSense
Length of output: 103
🏁 Script executed:
cat -n ./surfsense/scripts/restore-postgres.shRepository: MODSetter/SurfSense
Length of output: 1200
Add error-stopping to psql and validate gzip before restoration.
The script's set -e only catches command exit codes, not SQL errors. By default, psql continues processing and exits 0 even after SQL failures. If the DROP SCHEMA or restore query fails, the script will report success.
Additionally, validate the gzip archive's integrity before dropping the schema, as a corrupted file could fail partway through the pipe without stopping the process.
Suggested patch
+gunzip -t "$BACKUP_FILE"
+
echo "Recreating public schema..."
-docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
+docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
echo "Restoring from $BACKUP_FILE ..."
-gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME"
+gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo "Recreating public schema..." | |
| docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;" | |
| echo "Restoring from $BACKUP_FILE ..." | |
| gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" | |
| gunzip -t "$BACKUP_FILE" | |
| echo "Recreating public schema..." | |
| docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;" | |
| echo "Restoring from $BACKUP_FILE ..." | |
| gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/scripts/restore-postgres.sh` around lines 26 - 30, Enable strict
failure handling and validate the gzip before dropping the schema: add set -euo
pipefail at the top, run gunzip -t "$BACKUP_FILE" and exit non-zero if it fails,
then run the schema recreation and restore using psql with ON_ERROR_STOP enabled
(e.g., docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME"
--set=ON_ERROR_STOP=on -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;")
and similarly pipe the backup into docker compose exec -T db psql
--set=ON_ERROR_STOP=on so any SQL error causes a non-zero exit; ensure you check
and propagate the exit codes after each command.
| TARGET_VERSION="$1" | ||
| [[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; } | ||
|
|
||
| CURRENT_VERSION="$(grep -E '^SURFSENSE_VERSION=' .env | cut -d= -f2- || true)" | ||
| CURRENT_VERSION="${CURRENT_VERSION:-latest}" | ||
|
|
||
| echo "Current version: $CURRENT_VERSION" | ||
| echo "Target version: $TARGET_VERSION" | ||
|
|
||
| cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak" | ||
|
|
||
| python3 - <<'PY' "$TARGET_VERSION" | ||
| import pathlib,sys,re | ||
| version=sys.argv[1] | ||
| p=pathlib.Path('.env') | ||
| text=p.read_text() | ||
| if re.search(r'^SURFSENSE_VERSION=.*$', text, flags=re.M): | ||
| text=re.sub(r'^SURFSENSE_VERSION=.*$', f'SURFSENSE_VERSION={version}', text, flags=re.M) | ||
| else: | ||
| text += f'\nSURFSENSE_VERSION={version}\n' | ||
| p.write_text(text) | ||
| PY |
There was a problem hiding this comment.
Validate <version-tag> before writing to .env.
TARGET_VERSION is written verbatim; newline or invalid characters can corrupt .env and alter unrelated settings.
Suggested patch
TARGET_VERSION="$1"
[[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; }
+[[ "$TARGET_VERSION" =~ ^[A-Za-z0-9._-]+$ ]] || {
+ echo "Invalid version-tag: $TARGET_VERSION" >&2
+ exit 1
+}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| TARGET_VERSION="$1" | |
| [[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; } | |
| CURRENT_VERSION="$(grep -E '^SURFSENSE_VERSION=' .env | cut -d= -f2- || true)" | |
| CURRENT_VERSION="${CURRENT_VERSION:-latest}" | |
| echo "Current version: $CURRENT_VERSION" | |
| echo "Target version: $TARGET_VERSION" | |
| cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak" | |
| python3 - <<'PY' "$TARGET_VERSION" | |
| import pathlib,sys,re | |
| version=sys.argv[1] | |
| p=pathlib.Path('.env') | |
| text=p.read_text() | |
| if re.search(r'^SURFSENSE_VERSION=.*$', text, flags=re.M): | |
| text=re.sub(r'^SURFSENSE_VERSION=.*$', f'SURFSENSE_VERSION={version}', text, flags=re.M) | |
| else: | |
| text += f'\nSURFSENSE_VERSION={version}\n' | |
| p.write_text(text) | |
| PY | |
| TARGET_VERSION="$1" | |
| [[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; } | |
| [[ "$TARGET_VERSION" =~ ^[A-Za-z0-9._-]+$ ]] || { | |
| echo "Invalid version-tag: $TARGET_VERSION" >&2 | |
| exit 1 | |
| } | |
| CURRENT_VERSION="$(grep -E '^SURFSENSE_VERSION=' .env | cut -d= -f2- || true)" | |
| CURRENT_VERSION="${CURRENT_VERSION:-latest}" | |
| echo "Current version: $CURRENT_VERSION" | |
| echo "Target version: $TARGET_VERSION" | |
| cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak" | |
| python3 - <<'PY' "$TARGET_VERSION" | |
| import pathlib,sys,re | |
| version=sys.argv[1] | |
| p=pathlib.Path('.env') | |
| text=p.read_text() | |
| if re.search(r'^SURFSENSE_VERSION=.*$', text, flags=re.M): | |
| text=re.sub(r'^SURFSENSE_VERSION=.*$', f'SURFSENSE_VERSION={version}', text, flags=re.M) | |
| else: | |
| text += f'\nSURFSENSE_VERSION={version}\n' | |
| p.write_text(text) | |
| PY |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/scripts/update-surfsense.sh` around lines 13 - 34, The script
writes TARGET_VERSION verbatim into .env which allows newlines or malicious
characters to corrupt the file; update the logic around TARGET_VERSION and the
inline Python (referencing TARGET_VERSION and the inline python block) to first
sanitize and validate the value (trim whitespace/newlines, reject values
containing newline, =, or shell/escape characters) and only accept a strict
pattern such as semantic versioning (e.g., ^v?\d+\.\d+\.\d+$) or the literal
"latest"; if validation fails, print an error and exit non‑zero instead of
writing; implement the checks either in shell before invoking the python block
or inside the python script (strip and validate sys.argv[1]) and only perform
the .env substitution when validation passes.
|
Parking this PR for now per maintainer direction.\n\nNo immediate action required; we can resume CI/review follow-up later when ready. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@surfsense/docker-compose.yml`:
- Line 14: The Compose file references images from non-authorized registries
(e.g., image: pgvector/pgvector:pg17 and other occurrences for redis,
searxng/searxng, rocirp/zero, nginx); update each service to pull from the
approved production registry (replace image: pgvector/pgvector:pg17 with the
equivalent ghcr.io/... image and do the same for redis, searxng/searxng,
rocirp/zero, nginx) so no service uses non-authorized registries or build
contexts and all image: lines point to the authorized ghcr.io (or other
approved) registry tags.
- Line 56: The docker-compose image entries use an unsafe default tag by falling
back to latest (e.g. image:
ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}); change these
to require or supply a pinned tag instead of defaulting to latest: either remove
the ":-latest" fallback so deployments fail when SURFSENSE_VERSION is unset, or
replace it with a fixed, immutable tag (semantic version or digest) and ensure
your CI/helm/compose pipeline injects SURFSENSE_VERSION; update all occurrences
referencing SURFSENSE_VERSION (and equivalent variables at lines noted) so no
service uses a latest fallback.
- Around line 61-64: The docker-compose service is bind-mounting host files like
"../surfsense_backend/app/config/global_llm_config.yaml" and
"../surfsense_backend/app/services/docling_service.py" into the container which
makes production deployments fragile; replace these bind-mounts by baking the
files into the image (add COPY lines in the relevant Dockerfile to copy
global_llm_config.yaml, document_tasks.py, docling_service.py,
composio_service.py into the container build) and then remove the corresponding
volume entries from docker-compose.yml (also apply the same change for the other
identical host mount entries in the file); alternatively, if these are
runtime-managed configs, switch to a managed volume/secret or config resource
instead of a relative host path.
- Line 20: The compose file currently uses insecure hardcoded fallback values
(e.g., POSTGRES_PASSWORD: ${DB_PASSWORD:-surfsense} and the similar fallbacks
for SEARXNG_SECRET and ZERO_ADMIN_PASSWORD) — change these to hard-fail when the
env var is missing by replacing the default-fallback form (${VAR:-default}) with
the mandatory substitution (${VAR:?error message}) for each variable (e.g.,
POSTGRES_PASSWORD -> ${DB_PASSWORD:?DB_PASSWORD must be set}, SEARXNG_SECRET ->
${SEARXNG_SECRET:?SEARXNG_SECRET must be set}, ZERO_ADMIN_PASSWORD ->
${ZERO_ADMIN_PASSWORD:?ZERO_ADMIN_PASSWORD must be set}) so deployments fail
loudly instead of using insecure defaults.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1368aa19-07c0-401b-8e7c-55a724f675d7
📒 Files selected for processing (2)
surfsense/docker-compose.ymlsurfsense_backend/app/services/composio_service.py
|
|
||
| services: | ||
| db: | ||
| image: pgvector/pgvector:pg17 |
There was a problem hiding this comment.
Use authorized registry images for all production services.
This production Compose file still pulls several images from non-authorized registries (pgvector/pgvector, redis, searxng/searxng, rocicorp/zero, nginx). That violates the production registry policy and should be fixed before release.
As per coding guidelines, "In production Docker Compose files, do not use build contexts or mix build and image; all services must pull images from authorized registries (e.g., use image: ghcr.io/... for every service)."
Also applies to: 31-31, 43-43, 176-176, 224-224
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/docker-compose.yml` at line 14, The Compose file references images
from non-authorized registries (e.g., image: pgvector/pgvector:pg17 and other
occurrences for redis, searxng/searxng, rocirp/zero, nginx); update each service
to pull from the approved production registry (replace image:
pgvector/pgvector:pg17 with the equivalent ghcr.io/... image and do the same for
redis, searxng/searxng, rocirp/zero, nginx) so no service uses non-authorized
registries or build contexts and all image: lines point to the authorized
ghcr.io (or other approved) registry tags.
| - ./postgresql.conf:/etc/postgresql/postgresql.conf:ro | ||
| environment: | ||
| POSTGRES_USER: ${DB_USER:-surfsense} | ||
| POSTGRES_PASSWORD: ${DB_PASSWORD:-surfsense} |
There was a problem hiding this comment.
Remove insecure default secret fallbacks in production env vars.
Using hardcoded fallback secrets (surfsense, surfsense-searxng-secret, surfsense-zero-admin) makes accidental insecure deployments very likely.
Suggested hard-fail pattern
- POSTGRES_PASSWORD: ${DB_PASSWORD:-surfsense}
+ POSTGRES_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD is required}
- SEARXNG_SECRET: ${SEARXNG_SECRET:-surfsense-searxng-secret}
+ SEARXNG_SECRET: ${SEARXNG_SECRET:?SEARXNG_SECRET is required}
- ZERO_ADMIN_PASSWORD: ${ZERO_ADMIN_PASSWORD:-surfsense-zero-admin}
+ ZERO_ADMIN_PASSWORD: ${ZERO_ADMIN_PASSWORD:?ZERO_ADMIN_PASSWORD is required}Also applies to: 47-47, 186-186
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/docker-compose.yml` at line 20, The compose file currently uses
insecure hardcoded fallback values (e.g., POSTGRES_PASSWORD:
${DB_PASSWORD:-surfsense} and the similar fallbacks for SEARXNG_SECRET and
ZERO_ADMIN_PASSWORD) — change these to hard-fail when the env var is missing by
replacing the default-fallback form (${VAR:-default}) with the mandatory
substitution (${VAR:?error message}) for each variable (e.g., POSTGRES_PASSWORD
-> ${DB_PASSWORD:?DB_PASSWORD must be set}, SEARXNG_SECRET ->
${SEARXNG_SECRET:?SEARXNG_SECRET must be set}, ZERO_ADMIN_PASSWORD ->
${ZERO_ADMIN_PASSWORD:?ZERO_ADMIN_PASSWORD must be set}) so deployments fail
loudly instead of using insecure defaults.
| retries: 5 | ||
|
|
||
| backend: | ||
| image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest} |
There was a problem hiding this comment.
Avoid latest as the production image fallback tag.
Falling back to latest makes deploys non-reproducible and weakens rollback guarantees (especially with Watchtower enabled).
Suggested tag hardening
- image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}
+ image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:?SURFSENSE_VERSION is required}
...
- image: ghcr.io/modsetter/surfsense-web:${SURFSENSE_VERSION:-latest}
+ image: ghcr.io/modsetter/surfsense-web:${SURFSENSE_VERSION:?SURFSENSE_VERSION is required}Also applies to: 104-104, 136-136, 206-206
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/docker-compose.yml` at line 56, The docker-compose image entries
use an unsafe default tag by falling back to latest (e.g. image:
ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}); change these
to require or supply a pinned tag instead of defaulting to latest: either remove
the ":-latest" fallback so deployments fail when SURFSENSE_VERSION is unset, or
replace it with a fixed, immutable tag (semantic version or digest) and ensure
your CI/helm/compose pipeline injects SURFSENSE_VERSION; update all occurrences
referencing SURFSENSE_VERSION (and equivalent variables at lines noted) so no
service uses a latest fallback.
| - ../surfsense_backend/app/config/global_llm_config.yaml:/app/app/config/global_llm_config.yaml:ro | ||
| - ../surfsense_backend/app/tasks/celery_tasks/document_tasks.py:/app/app/tasks/celery_tasks/document_tasks.py:ro | ||
| - ../surfsense_backend/app/services/docling_service.py:/app/app/services/docling_service.py:ro | ||
| - ../surfsense_backend/app/services/composio_service.py:/app/app/services/composio_service.py:ro |
There was a problem hiding this comment.
Host source bind-mounts make this production stack fragile.
Mounting ../surfsense_backend/... files into containers couples runtime behavior to host repo layout and can break clean server deploys. For production, bake these assets into the image (or distribute as managed config artifacts) instead of relying on relative host paths.
Also applies to: 107-110, 138-138
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@surfsense/docker-compose.yml` around lines 61 - 64, The docker-compose
service is bind-mounting host files like
"../surfsense_backend/app/config/global_llm_config.yaml" and
"../surfsense_backend/app/services/docling_service.py" into the container which
makes production deployments fragile; replace these bind-mounts by baking the
files into the image (add COPY lines in the relevant Dockerfile to copy
global_llm_config.yaml, document_tasks.py, docling_service.py,
composio_service.py into the container build) and then remove the corresponding
volume entries from docker-compose.yml (also apply the same change for the other
identical host mount entries in the file); alternatively, if these are
runtime-managed configs, switch to a managed volume/secret or config resource
instead of a relative host path.
|
@thadavinciminds Thanks for this really appreciate it. Can you please re-raise this PR on dev branch. |
Summary
Why
This establishes a repeatable operational baseline so self-hosted SurfSense instances are easier to run, troubleshoot, and update safely.
Included files
surfsense/OPERATIONS.mdsurfsense/scripts/health-check.shsurfsense/scripts/backup-postgres.shsurfsense/scripts/restore-postgres.shsurfsense/scripts/update-surfsense.shValidation
bash -n) for all new scripts.health-check.shexecuted against a live stack (all checks passed)..sql.gzartifact.RESTORE).Operator impact
Rollback
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes