feat: add security hardening and monitoring updates#4708
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f89a8a8ac4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Что: - Усилил authorization boundaries в router/service/WSS слоях: service access, assigned-server access, organization scoping, permission checks и safer deploy-source ownership checks. - Добавил переиспользуемые access helpers для Docker/local server, monitoring targets, deploy-source credentials, destination binding, backup/schedule targets и provider ownership. - Усилил command/path safety: shell argument quoting, Git clone/build commands, custom compose command validation, Dockerfile/build command boundaries, bind/restore path normalization и SSRF-aware URL checks. - Добавил secret redaction для API responses, nested service reads, notification/error paths, Git providers, SSH keys, registry data, deployment/rollback context и logs. - Закрыл Git provider/webhook gaps: GitHub App OAuth state signing, provider read/update access checks, webhook payload binding и safer custom Git URL handling. - Закрепил schedule/deployment worker jobs signed metadata: scoped queue payloads, immutable job claims, stale/foreign job rejection и explicit package exports for signed schedule/deployment helpers. - Улучшил monitoring UX: resource breakdown by container, process parsing, swap reporting, Docker Disk Usage details, image/container/volume context, visible detail limits и responsive layout. - Добавил safe application env upsert: dry-run metadata, revision checks, secret-aware result summaries и conflict protection. - Обновил DB/schema/tooling surface для SSO/Gitea scoping, refreshed dependencies, TypeScript/Next build path и server package export switch scripts with stable final newline output. - После Codex Review restored historical migration `0169` to the upstream form and moved host-level schedule cleanup into new forward-only migration `0175_guard_host_schedule_scope`. - Добавил и актуализировал regression coverage across auth, backup, deploy, Docker, Git providers, monitoring, schedule/deployment signing, permissions, redaction, safe paths, webhooks, package exports and migration placement. Зачем: - Для владельца Dokploy это снижает риск IDOR, cross-server access, SSRF, command injection, unsafe path access, leaked secrets и stale worker job execution. - Для maintainer review это превращает большой набор security/UX gaps в проверяемые helpers, negative fixtures, focused tests и повторяемые package/build boundaries. - Для операторов Monitoring становится понятнее: видно, какие containers/images/volumes/processes используют ресурсы, а Docker Disk Usage раскрывается без потери контекста. - Для integrations safe env upsert дает менее рискованный путь точечно обновлять application environment variables без silent overwrite. - Для PR Dokploy#4708 review comments это закрывает runtime package-export gap for `@dokploy/server/utils/deployments/signed-job` and keeps already-applied migrations immutable. Риски: - PR commit остается большим и squashed; maintainer может попросить разделить security, monitoring и dependency/tooling parts на несколько PR. - Dependency/tooling refresh широкий, поэтому upstream CI после push остается финальным source of truth. - Migration `0175_guard_host_schedule_scope` fail-closes host-level schedules for organizations touched by ambiguous multi-org owners, because exact original `schedule.userId` provenance is no longer recoverable after historical `0169`; affected users may need to re-enable reviewed schedules. - Полный локальный `pnpm test` after review fixes failed only in real deploy tests under `__test__/deploy/application.real.test.ts`; 151 files passed and 1328 tests passed, but 4 shell clone/build deploy cases failed through local `execAsync`. - Browser smoke был локальным unauthenticated boot/render smoke: login page, `/` HTTP 200 и screenshot; authenticated dashboard flows локально не проверялись. - Локальная среда использовала Node v24.18.0 при `.nvmrc` 24.4.0; engine range допускает 24.x, но это отличается от exact project recommendation. Проверки: - Команды и результаты: `git diff --exit-code upstream/canary -- apps/dokploy/drizzle/0169_parched_johnny_storm.sql` passed; `cmp -s apps/dokploy/drizzle/meta/0174_snapshot.json apps/dokploy/drizzle/meta/0175_snapshot.json` passed; `jq` checks for `_journal.json` and `0175_snapshot.json` passed; `CI=true corepack pnpm --dir apps/dokploy exec vitest --config __test__/vitest.config.ts run __test__/deployment/signed-job-scope.test.ts __test__/security/high-findings-regression.test.ts` passed twice, final run 2 files / 16 tests; `git diff --check` and `git diff --cached --check` passed; `CI=true corepack pnpm exec biome check apps/dokploy/__test__/deployment/signed-job-scope.test.ts apps/dokploy/__test__/security/high-findings-regression.test.ts packages/server/scripts/switchToDist.js packages/server/scripts/switchToSrc.js packages/server/package.json` passed; `CI=true corepack pnpm run format-and-lint` passed with existing 37 warnings and 26 infos; `CI=true corepack pnpm run typecheck` passed for server, dokploy, api and schedules; `CI=true corepack pnpm run server:build` passed; `test -f packages/server/dist/utils/deployments/signed-job.js` passed; `corepack pnpm run server:script` restored source exports; `CI=true corepack pnpm run build` passed with build-only `BETTER_AUTH_SECRET` warnings; `corepack pnpm run server:script` restored source exports after build; `CI=true corepack pnpm run test` failed only in `__test__/deploy/application.real.test.ts`, with 151 files passed / 4 failed real deploy tests / 1328 tests passed / 1 skipped; independent Agent Flow reviewer.qa returned pass-with-risks with no blocker for amend/push. - Ограничения: upstream CI, maintainer review, authenticated browser flows, production deployment and a real DB migration fixture for `0175` were not run locally. What: - Strengthened authorization boundaries across router/service/WSS layers: service access, assigned-server access, organization scoping, permission checks, and safer deploy-source ownership checks. - Added reusable access helpers for Docker/local server, monitoring targets, deploy-source credentials, destination binding, backup/schedule targets, and provider ownership. - Hardened command/path safety: shell argument quoting, Git clone/build commands, custom compose command validation, Dockerfile/build command boundaries, bind/restore path normalization, and SSRF-aware URL checks. - Added secret redaction for API responses, nested service reads, notification/error paths, Git providers, SSH keys, registry data, deployment/rollback context, and logs. - Closed Git provider/webhook gaps: GitHub App OAuth state signing, provider read/update access checks, webhook payload binding, and safer custom Git URL handling. - Bound schedule/deployment worker jobs with signed metadata: scoped queue payloads, immutable job claims, stale/foreign job rejection, and explicit package exports for signed schedule/deployment helpers. - Improved monitoring UX: resource breakdown by container, process parsing, swap reporting, Docker Disk Usage details, image/container/volume context, visible detail limits, and responsive layout. - Added safe application env upsert with dry-run metadata, revision checks, secret-aware result summaries, and conflict protection. - Updated DB/schema/tooling surface for SSO/Gitea scoping, refreshed dependencies, TypeScript/Next build path, and server package export switch scripts with stable final-newline output. - After Codex Review, restored historical migration `0169` to the upstream form and moved host-level schedule cleanup into the new forward-only migration `0175_guard_host_schedule_scope`. - Added and updated regression coverage across auth, backup, deploy, Docker, Git providers, monitoring, schedule/deployment signing, permissions, redaction, safe paths, webhooks, package exports, and migration placement. Why: - For the Dokploy owner, this lowers the risk of IDOR, cross-server access, SSRF, command injection, unsafe path access, leaked secrets, and stale worker job execution. - For maintainer review, this turns a large set of security and UX gaps into reviewable helpers, negative fixtures, focused tests, and repeatable package/build boundaries. - For operators, Monitoring becomes clearer: containers/images/volumes/processes show resource ownership, and Docker Disk Usage can expand without losing context. - For integrations, safe env upsert provides a lower-risk way to patch application environment variables without silent overwrite. - For PR Dokploy#4708 review comments, this closes the runtime package-export gap for `@dokploy/server/utils/deployments/signed-job` and keeps already-applied migrations immutable. Risks: - The PR commit remains large and squashed; the maintainer may ask to split security, monitoring, and dependency/tooling parts into separate PRs. - The dependency/tooling refresh is broad, so upstream CI after push remains the final source of truth. - Migration `0175_guard_host_schedule_scope` fail-closes host-level schedules for organizations touched by ambiguous multi-org owners because exact original `schedule.userId` provenance is no longer recoverable after historical `0169`; affected users may need to re-enable reviewed schedules. - Full local `pnpm test` after review fixes failed only in real deploy tests under `__test__/deploy/application.real.test.ts`; 151 files passed and 1328 tests passed, but 4 shell clone/build deploy cases failed through local `execAsync`. - Browser smoke was a local unauthenticated boot/render smoke: login page, `/` HTTP 200, and screenshot; authenticated dashboard flows were not checked locally. - The local environment used Node v24.18.0 while `.nvmrc` is 24.4.0; the engine range accepts 24.x, but it differs from the exact project recommendation. Checks: - Commands and results: `git diff --exit-code upstream/canary -- apps/dokploy/drizzle/0169_parched_johnny_storm.sql` passed; `cmp -s apps/dokploy/drizzle/meta/0174_snapshot.json apps/dokploy/drizzle/meta/0175_snapshot.json` passed; `jq` checks for `_journal.json` and `0175_snapshot.json` passed; `CI=true corepack pnpm --dir apps/dokploy exec vitest --config __test__/vitest.config.ts run __test__/deployment/signed-job-scope.test.ts __test__/security/high-findings-regression.test.ts` passed twice, final run 2 files / 16 tests; `git diff --check` and `git diff --cached --check` passed; `CI=true corepack pnpm exec biome check apps/dokploy/__test__/deployment/signed-job-scope.test.ts apps/dokploy/__test__/security/high-findings-regression.test.ts packages/server/scripts/switchToDist.js packages/server/scripts/switchToSrc.js packages/server/package.json` passed; `CI=true corepack pnpm run format-and-lint` passed with existing 37 warnings and 26 infos; `CI=true corepack pnpm run typecheck` passed for server, dokploy, api, and schedules; `CI=true corepack pnpm run server:build` passed; `test -f packages/server/dist/utils/deployments/signed-job.js` passed; `corepack pnpm run server:script` restored source exports; `CI=true corepack pnpm run build` passed with build-only `BETTER_AUTH_SECRET` warnings; `corepack pnpm run server:script` restored source exports after build; `CI=true corepack pnpm run test` failed only in `__test__/deploy/application.real.test.ts`, with 151 files passed / 4 failed real deploy tests / 1328 tests passed / 1 skipped; independent Agent Flow reviewer.qa returned pass-with-risks with no blocker for amend/push. - Limitations: upstream CI, maintainer review, authenticated browser flows, production deployment, and a real DB migration fixture for `0175` were not run locally.
|
@codex review |
|
Local verification update for
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 635a8048ba
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ...redactDeployableServiceSecrets( | ||
| redactGitProviderSecrets(application), | ||
| ), |
There was a problem hiding this comment.
Preserve application secrets before returning edit data
When an existing application has env, build args/secrets, or Docker registry credentials, this endpoint now returns redacted placeholders through redactDeployableServiceSecrets. The application Environment and provider forms reset from data.env, data.buildArgs, data.buildSecrets, data.password, etc. and the existing save mutations write those values back verbatim, so opening an edit tab and saving any change replaces the real deployment secrets with __DOKPLOY_REDACTED_SECRET__, breaking subsequent builds/deploys. Either avoid redacting fields consumed by edit forms or preserve the placeholder values on the save path.
Useful? React with 👍 / 👎.
| } | ||
|
|
||
| return backup; | ||
| return redactBackupScheduleSecrets(backup); |
There was a problem hiding this comment.
Preserve backup metadata secrets on schedule edits
For existing compose/database backup schedules, redactBackupScheduleSecrets replaces metadata credentials such as MariaDB/Mongo passwords and MySQL root passwords before the edit dialog receives them. handle-backup.tsx resets the form from backup.metadata and update persists data.metadata directly, so saving a schedule-only change stores the redaction sentinel as the backup credential and causes later backups/restores to use the wrong password. The update path needs the same redacted-value preservation used for other secret-bearing settings.
Useful? React with 👍 / 👎.
What is this PR about?
This PR collects security hardening, monitoring UX, safe env update, and regression coverage work into one upstream review branch for
Dokploy/dokploy:canary.Fresh readiness pass:
upstream/canary8b6481501e6e379b9ce32c4da4201fcb7a65364af89a8a8ac446993b7d55b1d1327eba36ce5177b1canaryoutput/playwright/agenthits-pr-ready-smoke/home-1440x900.pngChecklist
Before submitting this PR, please make sure that:
canarybranch.Issues related (if applicable)
Not linked.
Screenshots (if applicable)
No generated screenshot files are added to this upstream diff. A local readable browser-smoke screenshot was captured at
output/playwright/agenthits-pr-ready-smoke/home-1440x900.pngand shows the Dokploy sign-in page rendered from the built app.Detailed Changes
Что:
Зачем:
Проверки:
git diff --check $(git merge-base upstream/canary HEAD)..HEADpassed.CI=true corepack pnpm run format-and-lintpassed with existing 37 warnings and 26 infos.CI=true corepack pnpm run typecheckpassed for server, dokploy, api and schedules.CI=true corepack pnpm run server:buildpassed;corepack pnpm run server:scriptrestored source exports.WasmHash._updateWithBuffer; after movingapps/dokploy/.nextto/tmp/dokploy-next-cache-rebase-20260630095440,CI=true corepack pnpm --dir apps/dokploy run build-nextpassed and fullCI=true corepack pnpm run buildpassed.CI=true corepack pnpm run testpassed: 152 files / 1331 tests passed / 1 skipped.HEAD /andGET /200, and saved a readable 1440x900 login-page screenshot.Риски:
.nvmrcis 24.4.0; the project engine range accepts 24.x, but this differs from the exact project recommendation.What:
Why:
Review follow-up
Addressed Codex Review P1/P2 in
635a8048b:@dokploy/server/utils/deployments/signed-jobpackage export and kept source/dist switch scripts in sync;0169to the upstream form and moved corrective host schedule cleanup into forward-only migration0175_guard_host_schedule_scope.Migration note:
0175intentionally fail-closes ambiguousdokploy-serverschedules by disabling schedules for organizations touched by multi-org owners, plus null organization scopes. After0169dropsschedule.userId, exact original schedule provenance is not recoverable, so affected operators may need to re-enable reviewed schedules.Latest local check note: after starting the existing Docker Desktop runtime and using
/Applications/Docker.app/Contents/Resources/bininPATH, the focused real deploy suite passed (application.real.test.ts: 5 passed / 1 skipped), and fullCI=true corepack pnpm run testpassed (152 files,1332 tests,1 skipped).