Skip to content

chore: add SurfSense operations runbook and ops scripts#1387

Closed
thadavinciminds wants to merge 2 commits into
MODSetter:mainfrom
thadavinciminds:chore/surfsense-ops-runbook
Closed

chore: add SurfSense operations runbook and ops scripts#1387
thadavinciminds wants to merge 2 commits into
MODSetter:mainfrom
thadavinciminds:chore/surfsense-ops-runbook

Conversation

@thadavinciminds
Copy link
Copy Markdown

@thadavinciminds thadavinciminds commented May 12, 2026

Summary

  • Add a practical operations runbook for day-2 SurfSense management.
  • Add a one-command health check script for container + HTTP readiness.
  • Add Postgres backup/restore scripts for safer maintenance.
  • Add a safe update script that enforces version pinning and post-update validation.

Why

This establishes a repeatable operational baseline so self-hosted SurfSense instances are easier to run, troubleshoot, and update safely.

Included files

  • surfsense/OPERATIONS.md
  • surfsense/scripts/health-check.sh
  • surfsense/scripts/backup-postgres.sh
  • surfsense/scripts/restore-postgres.sh
  • surfsense/scripts/update-surfsense.sh

Validation

  • Script syntax checked (bash -n) for all new scripts.
  • health-check.sh executed against a live stack (all checks passed).
  • Backup script verified to produce timestamped .sql.gz artifact.
  • Restore script includes explicit destructive confirmation gate (RESTORE).
  • Update script validates tag input, backs up env, redeploys, and runs health checks.

Operator impact

  • No application runtime code paths changed.
  • Adds operational tooling + documentation only.

Rollback

  • Revert this PR (scripts/runbook only), or restore previous operational workflow.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added complete Docker-based self-hosted deployment stack with production-ready configuration
    • Introduced automated health check script to verify service status and configuration
    • Added database backup and restore functionality for disaster recovery
    • Implemented one-command update workflow with automatic health verification and rollback support
    • Included comprehensive operational runbook for day-2 operations
  • Bug Fixes

    • Updated OAuth connection handling to align with current SDK version

Review Change Stack

@vercel
Copy link
Copy Markdown

vercel Bot commented May 12, 2026

Someone is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

Walkthrough

Adds a day-2 operations package: Docker Compose deployment, health-check script, DB backup/restore scripts with safety gates, an update automation script using a health gate, an OPERATIONS runbook, and a small Composio SDK call change.

Changes

Day-2 Operations Automation

Layer / File(s) Summary
Docker Compose deployment
surfsense/docker-compose.yml
Declares services (db, redis, searxng, backend, celery_worker, celery_beat, zero-cache, frontend, gateway) and named volumes with healthchecks, environment defaults, mounts, and startup gating.
Health Check Framework
surfsense/scripts/health-check.sh
Verifies Docker availability, prints compose service status, performs curl probes for frontend/backend/zero-cache (optionally searxng), inspects container health, and exits 0 on success or 2 on failure for automation gates.
Database Backup and Restore Operations
surfsense/scripts/backup-postgres.sh, surfsense/scripts/restore-postgres.sh
backup-postgres.sh produces timestamped compressed PostgreSQL dumps via `docker compose exec pg_dump
Service Update Workflow
surfsense/scripts/update-surfsense.sh
Validates target version argument, backs up .env, updates SURFSENSE_VERSION via inline Python (replace-or-append), runs docker compose pull + docker compose up -d, executes scripts/health-check.sh, and prints rollback instructions referencing the .env backup.
Comprehensive Operations Runbook
surfsense/OPERATIONS.md
Documents conventions, lifecycle commands (start/stop/app-tier-only restart), a one-command health check, logging/diagnostics, backup/restore steps, safe update and rollback strategy, security baseline (secret rotation, worker limits), and operator cadence.
Composio SDK change
surfsense_backend/app/services/composio_service.py
Replaces connected_accounts.initiate(..., allow_multiple=True) with connected_accounts.link(...) when creating managed OAuth connections and adapts to the returned connection_request fields.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

A rabbit hops through ops so keen,
I script the backups, checks, and scene,
Compose brings services into view,
Health gates guard updates through and through,
Restore says "type RESTORE" — then we're green! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding operational documentation (OPERATIONS.md) and operational scripts (health-check, backup, restore, update utilities) to support day-2 operations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@thadavinciminds
Copy link
Copy Markdown
Author

Updated PR description with a clean summary + validation checklist.\n\nCurrent CI status from my side:\n- Vercel: failing due to project authorization requirement (external/integration-side)\n- recurseml/analysis: failed with analysis error (bot-side)\n- CodeRabbit: pending\n\nGitHub Actions runs for this PR are in action_required state with no job logs visible from contributor side, which usually means a maintainer needs to approve fork-based workflow execution before they can run.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
surfsense/scripts/update-surfsense.sh (1)

22-45: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Health-check failure path should rollback or print deterministic rollback steps.

With set -e, a failing health check exits immediately, so completion/rollback guidance isn’t shown and .env stays pinned to the new version.

Suggested patch
-cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak"
+BACKUP_FILE=".env.pre-update.$(date +%Y%m%d-%H%M%S).bak"
+cp .env "$BACKUP_FILE"
@@
 echo "Running health check..."
-bash "$ROOT_DIR/scripts/health-check.sh"
+bash "$ROOT_DIR/scripts/health-check.sh" || {
+  echo "Health check failed. Rolling back .env from $BACKUP_FILE" >&2
+  cp "$BACKUP_FILE" .env
+  docker compose up -d
+  exit 2
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/scripts/update-surfsense.sh` around lines 22 - 45, The script
currently exits on health-check failure and leaves .env pinned to the new
version; add a failure handler (trap on ERR or an explicit conditional after
health-check) that restores the pre-update .env backup (the file created by cp
.env ".env.pre-update.$(date... ).bak" or, if easier, rewrites
SURFSENSE_VERSION=$CURRENT_VERSION into .env) and then runs docker compose up -d
to rollback services, and always print deterministic rollback instructions that
reference CURRENT_VERSION, TARGET_VERSION, and the backup filename pattern;
update the block around the health check and the end-of-script messages to
ensure the restore/rollback is performed or at minimum printed on failure (use
ROOT_DIR/scripts/health-check.sh, CURRENT_VERSION, and the .env.pre-update
backup name to locate the code to change).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@surfsense/OPERATIONS.md`:
- Around line 15-25: Remove the host-specific "/home/vitrus/..." path and fix
capitalization and consistency in the Docker commands: replace the absolute path
with a generic instruction like "cd to the repository root", use lowercase
"docker compose" for all commands, and correct the stop and restart examples to
consistent forms (e.g., "docker compose down" and "docker compose restart
backend frontend zero-cache gateway") so the OPERATIONS.md commands are portable
and valid across shells.

In `@surfsense/scripts/backup-postgres.sh`:
- Around line 13-23: The backup file currently inherits the process umask and
may be too permissive; update backup-postgres.sh to ensure OUT_FILE is created
with restrictive permissions by setting a tight umask (e.g., umask 077) before
running the pg_dump pipeline or by explicitly applying chmod 600 to "$OUT_FILE"
immediately after creation (the pipeline that writes to OUT_FILE is the line
using docker compose exec -T db pg_dump ... | gzip -9 > "$OUT_FILE"); ensure any
mkdir -p of BACKUP_DIR preserves safe directory perms as well (or set umask
before mkdir) so BACKUP_DIR and OUT_FILE are both restricted.

In `@surfsense/scripts/health-check.sh`:
- Around line 21-24: The script currently only checks for the docker binary;
change the preflight to verify the compose command/plugin exists by running a
probe for the compose subcommand (e.g. test `docker compose version` or `docker
compose ps` success) and if that fails fallback to checking for the legacy
`docker-compose` binary; update the error message to reflect which compose
option is missing and ensure the early exit still happens when neither `docker
compose` nor `docker-compose` is available so the later `docker compose ps` call
won't fail unexpectedly.

In `@surfsense/scripts/restore-postgres.sh`:
- Around line 26-30: Enable strict failure handling and validate the gzip before
dropping the schema: add set -euo pipefail at the top, run gunzip -t
"$BACKUP_FILE" and exit non-zero if it fails, then run the schema recreation and
restore using psql with ON_ERROR_STOP enabled (e.g., docker compose exec -T db
psql -U "$DB_USER" -d "$DB_NAME" --set=ON_ERROR_STOP=on -c "DROP SCHEMA public
CASCADE; CREATE SCHEMA public;") and similarly pipe the backup into docker
compose exec -T db psql --set=ON_ERROR_STOP=on so any SQL error causes a
non-zero exit; ensure you check and propagate the exit codes after each command.

In `@surfsense/scripts/update-surfsense.sh`:
- Around line 13-34: The script writes TARGET_VERSION verbatim into .env which
allows newlines or malicious characters to corrupt the file; update the logic
around TARGET_VERSION and the inline Python (referencing TARGET_VERSION and the
inline python block) to first sanitize and validate the value (trim
whitespace/newlines, reject values containing newline, =, or shell/escape
characters) and only accept a strict pattern such as semantic versioning (e.g.,
^v?\d+\.\d+\.\d+$) or the literal "latest"; if validation fails, print an error
and exit non‑zero instead of writing; implement the checks either in shell
before invoking the python block or inside the python script (strip and validate
sys.argv[1]) and only perform the .env substitution when validation passes.

---

Outside diff comments:
In `@surfsense/scripts/update-surfsense.sh`:
- Around line 22-45: The script currently exits on health-check failure and
leaves .env pinned to the new version; add a failure handler (trap on ERR or an
explicit conditional after health-check) that restores the pre-update .env
backup (the file created by cp .env ".env.pre-update.$(date... ).bak" or, if
easier, rewrites SURFSENSE_VERSION=$CURRENT_VERSION into .env) and then runs
docker compose up -d to rollback services, and always print deterministic
rollback instructions that reference CURRENT_VERSION, TARGET_VERSION, and the
backup filename pattern; update the block around the health check and the
end-of-script messages to ensure the restore/rollback is performed or at minimum
printed on failure (use ROOT_DIR/scripts/health-check.sh, CURRENT_VERSION, and
the .env.pre-update backup name to locate the code to change).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e287b101-2049-4568-a4d9-30badc160aa4

📥 Commits

Reviewing files that changed from the base of the PR and between d6618b8 and 7d249e7.

📒 Files selected for processing (5)
  • surfsense/OPERATIONS.md
  • surfsense/scripts/backup-postgres.sh
  • surfsense/scripts/health-check.sh
  • surfsense/scripts/restore-postgres.sh
  • surfsense/scripts/update-surfsense.sh

Comment thread surfsense/OPERATIONS.md
Comment on lines +15 to +25
cd /home/vitrus/work/SurfSense/surfsense

# Start
docker compose up -d

# Stop
Docker compose down

# Restart only app-tier services (keeps DB/Redis running)
Docker compose restart backend frontend zero-cache gateway
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix non-portable and invalid command examples.

/home/vitrus/... is host-specific, and Docker compose won’t run on Linux shells (docker is lowercase).

Suggested patch
-cd /home/vitrus/work/SurfSense/surfsense
+cd /path/to/SurfSense/surfsense
@@
-Docker compose down
+docker compose down
@@
-Docker compose restart backend frontend zero-cache gateway
+docker compose restart backend frontend zero-cache gateway
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cd /home/vitrus/work/SurfSense/surfsense
# Start
docker compose up -d
# Stop
Docker compose down
# Restart only app-tier services (keeps DB/Redis running)
Docker compose restart backend frontend zero-cache gateway
```
cd /path/to/SurfSense/surfsense
# Start
docker compose up -d
# Stop
docker compose down
# Restart only app-tier services (keeps DB/Redis running)
docker compose restart backend frontend zero-cache gateway
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/OPERATIONS.md` around lines 15 - 25, Remove the host-specific
"/home/vitrus/..." path and fix capitalization and consistency in the Docker
commands: replace the absolute path with a generic instruction like "cd to the
repository root", use lowercase "docker compose" for all commands, and correct
the stop and restart examples to consistent forms (e.g., "docker compose down"
and "docker compose restart backend frontend zero-cache gateway") so the
OPERATIONS.md commands are portable and valid across shells.

Comment on lines +13 to +23
BACKUP_DIR="${1:-$ROOT_DIR/backups}"
mkdir -p "$BACKUP_DIR"

STAMP="$(date +%Y%m%d-%H%M%S)"
OUT_FILE="$BACKUP_DIR/surfsense-db-$STAMP.sql.gz"

echo "Creating Postgres backup: $OUT_FILE"
docker compose exec -T db pg_dump -U "$DB_USER" "$DB_NAME" | gzip -9 > "$OUT_FILE"

echo "Backup complete: $OUT_FILE"
ls -lh "$OUT_FILE"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict backup file permissions by default.

DB dumps can include sensitive data; current file mode depends on process umask and may be too permissive.

Suggested patch
 DB_USER="${DB_USER:-surfsense}"
 DB_NAME="${DB_NAME:-surfsense}"
 BACKUP_DIR="${1:-$ROOT_DIR/backups}"
+umask 077
 mkdir -p "$BACKUP_DIR"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
BACKUP_DIR="${1:-$ROOT_DIR/backups}"
mkdir -p "$BACKUP_DIR"
STAMP="$(date +%Y%m%d-%H%M%S)"
OUT_FILE="$BACKUP_DIR/surfsense-db-$STAMP.sql.gz"
echo "Creating Postgres backup: $OUT_FILE"
docker compose exec -T db pg_dump -U "$DB_USER" "$DB_NAME" | gzip -9 > "$OUT_FILE"
echo "Backup complete: $OUT_FILE"
ls -lh "$OUT_FILE"
BACKUP_DIR="${1:-$ROOT_DIR/backups}"
umask 077
mkdir -p "$BACKUP_DIR"
STAMP="$(date +%Y%m%d-%H%M%S)"
OUT_FILE="$BACKUP_DIR/surfsense-db-$STAMP.sql.gz"
echo "Creating Postgres backup: $OUT_FILE"
docker compose exec -T db pg_dump -U "$DB_USER" "$DB_NAME" | gzip -9 > "$OUT_FILE"
echo "Backup complete: $OUT_FILE"
ls -lh "$OUT_FILE"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/scripts/backup-postgres.sh` around lines 13 - 23, The backup file
currently inherits the process umask and may be too permissive; update
backup-postgres.sh to ensure OUT_FILE is created with restrictive permissions by
setting a tight umask (e.g., umask 077) before running the pg_dump pipeline or
by explicitly applying chmod 600 to "$OUT_FILE" immediately after creation (the
pipeline that writes to OUT_FILE is the line using docker compose exec -T db
pg_dump ... | gzip -9 > "$OUT_FILE"); ensure any mkdir -p of BACKUP_DIR
preserves safe directory perms as well (or set umask before mkdir) so BACKUP_DIR
and OUT_FILE are both restricted.

Comment on lines +21 to +24
command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; }

echo "\n=== Docker Compose Services ==="
docker compose ps
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preflight should verify docker compose availability.

The script checks docker but not the compose subcommand/plugin, so it can fail later with a less clear error path.

Suggested patch
 command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; }
+docker compose version >/dev/null 2>&1 || { echo "[ERROR] docker compose not available" >&2; exit 1; }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; }
echo "\n=== Docker Compose Services ==="
docker compose ps
command -v docker >/dev/null 2>&1 || { echo "[ERROR] docker not installed" >&2; exit 1; }
docker compose version >/dev/null 2>&1 || { echo "[ERROR] docker compose not available" >&2; exit 1; }
echo "\n=== Docker Compose Services ==="
docker compose ps
🧰 Tools
🪛 Shellcheck (0.11.0)

[info] 23-23: echo may not expand escape sequences. Use printf.

(SC2028)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/scripts/health-check.sh` around lines 21 - 24, The script currently
only checks for the docker binary; change the preflight to verify the compose
command/plugin exists by running a probe for the compose subcommand (e.g. test
`docker compose version` or `docker compose ps` success) and if that fails
fallback to checking for the legacy `docker-compose` binary; update the error
message to reflect which compose option is missing and ensure the early exit
still happens when neither `docker compose` nor `docker-compose` is available so
the later `docker compose ps` call won't fail unexpectedly.

Comment on lines +26 to +30
echo "Recreating public schema..."
docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"

echo "Restoring from $BACKUP_FILE ..."
gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

For PostgreSQL psql CLI: what is the default error behavior in non-interactive mode, and how does -v ON_ERROR_STOP=1 affect exit codes?

💡 Result:

Default error behavior (non-interactive mode): - By default, psql continues processing after an error in non-interactive scripts (i.e., it does not stop immediately). [1] - psql’s exit status categories include: exit code 0 on normal completion; 1 for a fatal error of its own (e.g., out of memory, file not found); 2 if the connection to the server went bad and the session was not interactive; and 3 if an error occurred in a script and ON_ERROR_STOP was set. [1] How -v ON_ERROR_STOP=1 affects exit codes: - -v ON_ERROR_STOP=1 (or on) makes psql stop immediately on the first error while processing scripts in non-interactive mode. [2][1] - When this happens in a non-interactive script run via -f (or other script execution), psql exits with error code 3 (to distinguish this from fatal errors, which use code 1). [2][1]

Citations:


🏁 Script executed:

find . -name "restore-postgres.sh" -type f 2>/dev/null

Repository: MODSetter/SurfSense

Length of output: 103


🏁 Script executed:

cat -n ./surfsense/scripts/restore-postgres.sh

Repository: MODSetter/SurfSense

Length of output: 1200


Add error-stopping to psql and validate gzip before restoration.

The script's set -e only catches command exit codes, not SQL errors. By default, psql continues processing and exits 0 even after SQL failures. If the DROP SCHEMA or restore query fails, the script will report success.

Additionally, validate the gzip archive's integrity before dropping the schema, as a corrupted file could fail partway through the pipe without stopping the process.

Suggested patch
+gunzip -t "$BACKUP_FILE"
+
 echo "Recreating public schema..."
-docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
+docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
 
 echo "Restoring from $BACKUP_FILE ..."
-gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME"
+gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
echo "Recreating public schema..."
docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
echo "Restoring from $BACKUP_FILE ..."
gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME"
gunzip -t "$BACKUP_FILE"
echo "Recreating public schema..."
docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
echo "Restoring from $BACKUP_FILE ..."
gunzip -c "$BACKUP_FILE" | docker compose exec -T db psql -v ON_ERROR_STOP=1 -U "$DB_USER" -d "$DB_NAME"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/scripts/restore-postgres.sh` around lines 26 - 30, Enable strict
failure handling and validate the gzip before dropping the schema: add set -euo
pipefail at the top, run gunzip -t "$BACKUP_FILE" and exit non-zero if it fails,
then run the schema recreation and restore using psql with ON_ERROR_STOP enabled
(e.g., docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME"
--set=ON_ERROR_STOP=on -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;")
and similarly pipe the backup into docker compose exec -T db psql
--set=ON_ERROR_STOP=on so any SQL error causes a non-zero exit; ensure you check
and propagate the exit codes after each command.

Comment on lines +13 to +34
TARGET_VERSION="$1"
[[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; }

CURRENT_VERSION="$(grep -E '^SURFSENSE_VERSION=' .env | cut -d= -f2- || true)"
CURRENT_VERSION="${CURRENT_VERSION:-latest}"

echo "Current version: $CURRENT_VERSION"
echo "Target version: $TARGET_VERSION"

cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak"

python3 - <<'PY' "$TARGET_VERSION"
import pathlib,sys,re
version=sys.argv[1]
p=pathlib.Path('.env')
text=p.read_text()
if re.search(r'^SURFSENSE_VERSION=.*$', text, flags=re.M):
text=re.sub(r'^SURFSENSE_VERSION=.*$', f'SURFSENSE_VERSION={version}', text, flags=re.M)
else:
text += f'\nSURFSENSE_VERSION={version}\n'
p.write_text(text)
PY
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate <version-tag> before writing to .env.

TARGET_VERSION is written verbatim; newline or invalid characters can corrupt .env and alter unrelated settings.

Suggested patch
 TARGET_VERSION="$1"
 [[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; }
+[[ "$TARGET_VERSION" =~ ^[A-Za-z0-9._-]+$ ]] || {
+  echo "Invalid version-tag: $TARGET_VERSION" >&2
+  exit 1
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
TARGET_VERSION="$1"
[[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; }
CURRENT_VERSION="$(grep -E '^SURFSENSE_VERSION=' .env | cut -d= -f2- || true)"
CURRENT_VERSION="${CURRENT_VERSION:-latest}"
echo "Current version: $CURRENT_VERSION"
echo "Target version: $TARGET_VERSION"
cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak"
python3 - <<'PY' "$TARGET_VERSION"
import pathlib,sys,re
version=sys.argv[1]
p=pathlib.Path('.env')
text=p.read_text()
if re.search(r'^SURFSENSE_VERSION=.*$', text, flags=re.M):
text=re.sub(r'^SURFSENSE_VERSION=.*$', f'SURFSENSE_VERSION={version}', text, flags=re.M)
else:
text += f'\nSURFSENSE_VERSION={version}\n'
p.write_text(text)
PY
TARGET_VERSION="$1"
[[ -f .env ]] || { echo ".env missing in $ROOT_DIR" >&2; exit 1; }
[[ "$TARGET_VERSION" =~ ^[A-Za-z0-9._-]+$ ]] || {
echo "Invalid version-tag: $TARGET_VERSION" >&2
exit 1
}
CURRENT_VERSION="$(grep -E '^SURFSENSE_VERSION=' .env | cut -d= -f2- || true)"
CURRENT_VERSION="${CURRENT_VERSION:-latest}"
echo "Current version: $CURRENT_VERSION"
echo "Target version: $TARGET_VERSION"
cp .env ".env.pre-update.$(date +%Y%m%d-%H%M%S).bak"
python3 - <<'PY' "$TARGET_VERSION"
import pathlib,sys,re
version=sys.argv[1]
p=pathlib.Path('.env')
text=p.read_text()
if re.search(r'^SURFSENSE_VERSION=.*$', text, flags=re.M):
text=re.sub(r'^SURFSENSE_VERSION=.*$', f'SURFSENSE_VERSION={version}', text, flags=re.M)
else:
text += f'\nSURFSENSE_VERSION={version}\n'
p.write_text(text)
PY
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/scripts/update-surfsense.sh` around lines 13 - 34, The script
writes TARGET_VERSION verbatim into .env which allows newlines or malicious
characters to corrupt the file; update the logic around TARGET_VERSION and the
inline Python (referencing TARGET_VERSION and the inline python block) to first
sanitize and validate the value (trim whitespace/newlines, reject values
containing newline, =, or shell/escape characters) and only accept a strict
pattern such as semantic versioning (e.g., ^v?\d+\.\d+\.\d+$) or the literal
"latest"; if validation fails, print an error and exit non‑zero instead of
writing; implement the checks either in shell before invoking the python block
or inside the python script (strip and validate sys.argv[1]) and only perform
the .env substitution when validation passes.

@thadavinciminds
Copy link
Copy Markdown
Author

Parking this PR for now per maintainer direction.\n\nNo immediate action required; we can resume CI/review follow-up later when ready.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@surfsense/docker-compose.yml`:
- Line 14: The Compose file references images from non-authorized registries
(e.g., image: pgvector/pgvector:pg17 and other occurrences for redis,
searxng/searxng, rocirp/zero, nginx); update each service to pull from the
approved production registry (replace image: pgvector/pgvector:pg17 with the
equivalent ghcr.io/... image and do the same for redis, searxng/searxng,
rocirp/zero, nginx) so no service uses non-authorized registries or build
contexts and all image: lines point to the authorized ghcr.io (or other
approved) registry tags.
- Line 56: The docker-compose image entries use an unsafe default tag by falling
back to latest (e.g. image:
ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}); change these
to require or supply a pinned tag instead of defaulting to latest: either remove
the ":-latest" fallback so deployments fail when SURFSENSE_VERSION is unset, or
replace it with a fixed, immutable tag (semantic version or digest) and ensure
your CI/helm/compose pipeline injects SURFSENSE_VERSION; update all occurrences
referencing SURFSENSE_VERSION (and equivalent variables at lines noted) so no
service uses a latest fallback.
- Around line 61-64: The docker-compose service is bind-mounting host files like
"../surfsense_backend/app/config/global_llm_config.yaml" and
"../surfsense_backend/app/services/docling_service.py" into the container which
makes production deployments fragile; replace these bind-mounts by baking the
files into the image (add COPY lines in the relevant Dockerfile to copy
global_llm_config.yaml, document_tasks.py, docling_service.py,
composio_service.py into the container build) and then remove the corresponding
volume entries from docker-compose.yml (also apply the same change for the other
identical host mount entries in the file); alternatively, if these are
runtime-managed configs, switch to a managed volume/secret or config resource
instead of a relative host path.
- Line 20: The compose file currently uses insecure hardcoded fallback values
(e.g., POSTGRES_PASSWORD: ${DB_PASSWORD:-surfsense} and the similar fallbacks
for SEARXNG_SECRET and ZERO_ADMIN_PASSWORD) — change these to hard-fail when the
env var is missing by replacing the default-fallback form (${VAR:-default}) with
the mandatory substitution (${VAR:?error message}) for each variable (e.g.,
POSTGRES_PASSWORD -> ${DB_PASSWORD:?DB_PASSWORD must be set}, SEARXNG_SECRET ->
${SEARXNG_SECRET:?SEARXNG_SECRET must be set}, ZERO_ADMIN_PASSWORD ->
${ZERO_ADMIN_PASSWORD:?ZERO_ADMIN_PASSWORD must be set}) so deployments fail
loudly instead of using insecure defaults.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1368aa19-07c0-401b-8e7c-55a724f675d7

📥 Commits

Reviewing files that changed from the base of the PR and between 7d249e7 and 98fc28a.

📒 Files selected for processing (2)
  • surfsense/docker-compose.yml
  • surfsense_backend/app/services/composio_service.py


services:
db:
image: pgvector/pgvector:pg17
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Use authorized registry images for all production services.

This production Compose file still pulls several images from non-authorized registries (pgvector/pgvector, redis, searxng/searxng, rocicorp/zero, nginx). That violates the production registry policy and should be fixed before release.

As per coding guidelines, "In production Docker Compose files, do not use build contexts or mix build and image; all services must pull images from authorized registries (e.g., use image: ghcr.io/... for every service)."

Also applies to: 31-31, 43-43, 176-176, 224-224

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/docker-compose.yml` at line 14, The Compose file references images
from non-authorized registries (e.g., image: pgvector/pgvector:pg17 and other
occurrences for redis, searxng/searxng, rocirp/zero, nginx); update each service
to pull from the approved production registry (replace image:
pgvector/pgvector:pg17 with the equivalent ghcr.io/... image and do the same for
redis, searxng/searxng, rocirp/zero, nginx) so no service uses non-authorized
registries or build contexts and all image: lines point to the authorized
ghcr.io (or other approved) registry tags.

- ./postgresql.conf:/etc/postgresql/postgresql.conf:ro
environment:
POSTGRES_USER: ${DB_USER:-surfsense}
POSTGRES_PASSWORD: ${DB_PASSWORD:-surfsense}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove insecure default secret fallbacks in production env vars.

Using hardcoded fallback secrets (surfsense, surfsense-searxng-secret, surfsense-zero-admin) makes accidental insecure deployments very likely.

Suggested hard-fail pattern
-      POSTGRES_PASSWORD: ${DB_PASSWORD:-surfsense}
+      POSTGRES_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD is required}

-      SEARXNG_SECRET: ${SEARXNG_SECRET:-surfsense-searxng-secret}
+      SEARXNG_SECRET: ${SEARXNG_SECRET:?SEARXNG_SECRET is required}

-      ZERO_ADMIN_PASSWORD: ${ZERO_ADMIN_PASSWORD:-surfsense-zero-admin}
+      ZERO_ADMIN_PASSWORD: ${ZERO_ADMIN_PASSWORD:?ZERO_ADMIN_PASSWORD is required}

Also applies to: 47-47, 186-186

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/docker-compose.yml` at line 20, The compose file currently uses
insecure hardcoded fallback values (e.g., POSTGRES_PASSWORD:
${DB_PASSWORD:-surfsense} and the similar fallbacks for SEARXNG_SECRET and
ZERO_ADMIN_PASSWORD) — change these to hard-fail when the env var is missing by
replacing the default-fallback form (${VAR:-default}) with the mandatory
substitution (${VAR:?error message}) for each variable (e.g., POSTGRES_PASSWORD
-> ${DB_PASSWORD:?DB_PASSWORD must be set}, SEARXNG_SECRET ->
${SEARXNG_SECRET:?SEARXNG_SECRET must be set}, ZERO_ADMIN_PASSWORD ->
${ZERO_ADMIN_PASSWORD:?ZERO_ADMIN_PASSWORD must be set}) so deployments fail
loudly instead of using insecure defaults.

retries: 5

backend:
image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid latest as the production image fallback tag.

Falling back to latest makes deploys non-reproducible and weakens rollback guarantees (especially with Watchtower enabled).

Suggested tag hardening
-    image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}
+    image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:?SURFSENSE_VERSION is required}
...
-    image: ghcr.io/modsetter/surfsense-web:${SURFSENSE_VERSION:-latest}
+    image: ghcr.io/modsetter/surfsense-web:${SURFSENSE_VERSION:?SURFSENSE_VERSION is required}

Also applies to: 104-104, 136-136, 206-206

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/docker-compose.yml` at line 56, The docker-compose image entries
use an unsafe default tag by falling back to latest (e.g. image:
ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}); change these
to require or supply a pinned tag instead of defaulting to latest: either remove
the ":-latest" fallback so deployments fail when SURFSENSE_VERSION is unset, or
replace it with a fixed, immutable tag (semantic version or digest) and ensure
your CI/helm/compose pipeline injects SURFSENSE_VERSION; update all occurrences
referencing SURFSENSE_VERSION (and equivalent variables at lines noted) so no
service uses a latest fallback.

Comment on lines +61 to +64
- ../surfsense_backend/app/config/global_llm_config.yaml:/app/app/config/global_llm_config.yaml:ro
- ../surfsense_backend/app/tasks/celery_tasks/document_tasks.py:/app/app/tasks/celery_tasks/document_tasks.py:ro
- ../surfsense_backend/app/services/docling_service.py:/app/app/services/docling_service.py:ro
- ../surfsense_backend/app/services/composio_service.py:/app/app/services/composio_service.py:ro
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Host source bind-mounts make this production stack fragile.

Mounting ../surfsense_backend/... files into containers couples runtime behavior to host repo layout and can break clean server deploys. For production, bake these assets into the image (or distribute as managed config artifacts) instead of relying on relative host paths.

Also applies to: 107-110, 138-138

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@surfsense/docker-compose.yml` around lines 61 - 64, The docker-compose
service is bind-mounting host files like
"../surfsense_backend/app/config/global_llm_config.yaml" and
"../surfsense_backend/app/services/docling_service.py" into the container which
makes production deployments fragile; replace these bind-mounts by baking the
files into the image (add COPY lines in the relevant Dockerfile to copy
global_llm_config.yaml, document_tasks.py, docling_service.py,
composio_service.py into the container build) and then remove the corresponding
volume entries from docker-compose.yml (also apply the same change for the other
identical host mount entries in the file); alternatively, if these are
runtime-managed configs, switch to a managed volume/secret or config resource
instead of a relative host path.

@MODSetter
Copy link
Copy Markdown
Owner

@thadavinciminds Thanks for this really appreciate it. Can you please re-raise this PR on dev branch.

@MODSetter MODSetter closed this May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants