From 98f30689d51753928998cdb683139cfec947f7b2 Mon Sep 17 00:00:00 2001 From: "Yoshiaki Ueda (bootjp)" Date: Mon, 27 Apr 2026 20:58:17 +0900 Subject: [PATCH 1/2] docs(design): refresh admin dashboard status table for P2 + P4 shipped MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The status table was last updated when the rename to `_partial_` landed (PR #675). Since then: - P2 write paths shipped — #669 (slice 2a, S3 bucket admin endpoints) and #673 (slice 2b, AdminForward integration). The read-only slice was already in via #658. Status flips from 🟡 partial → ✅ shipped. - P4 finished — operator doc in #674, deployment runbook in #669, rolling-update.sh admin wiring in #669 + #678. Status flips from 🟡 mostly → ✅ shipped. Also added the AdminDeleteBucket TOCTOU to Outstanding open items. coderabbitai flagged it during PR #669 review (🔴 / 🟠) — the empty-bucket probe scans `ObjectManifestPrefixForBucket` at readTS but the BucketMetaKey delete carries only that single point key in `ReadKeys`, so a concurrent PutObject inserting a manifest key between readTS and commit will not conflict and the object becomes orphaned. Pre-existing race that `adapter/s3.go:deleteBucket` inherits as well. Tracked here for the future fix (either bump BucketGenerationKey on every PutObject, or add ReadRanges to OperationGroup); the operator-side workaround documented in docs/admin_deployment.md is to pause writes before admin delete. Doc stays at `_partial_` rather than `_implemented_` because two outstanding items remain: AdminForward criterion 5 (rolling- upgrade compatibility flag, explicitly deferred) and the AdminDeleteBucket TOCTOU. Updated the closing sentence accordingly so the rename trigger covers both. No design changes — this is a state refresh. --- docs/design/2026_04_24_partial_admin_dashboard.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/design/2026_04_24_partial_admin_dashboard.md b/docs/design/2026_04_24_partial_admin_dashboard.md index 46e6753d..b160b0ba 100644 --- a/docs/design/2026_04_24_partial_admin_dashboard.md +++ b/docs/design/2026_04_24_partial_admin_dashboard.md @@ -1,26 +1,27 @@ # elastickv Admin Dashboard Design -**Status:** Partial — P1 and P3 have shipped in full; P2 has shipped its read-only slice with the write path still in flight; P4 has shipped TLS / role / CSRF and lands its operator documentation alongside this rename. See the status table below for the per-phase breakdown. +**Status:** Partial — every phase of the original P1–P4 plan has shipped. The doc stays at `_partial_` (rather than `_implemented_`) because AdminForward acceptance criterion 5 (rolling-upgrade compatibility flag) is explicitly deferred and the AdminDeleteBucket TOCTOU caught during PR #669 review is tracked here as a pre-existing limitation. See the status table for the per-phase breakdown and Outstanding open items below. **Author:** bootjp **Date:** 2026-04-24 -**Last updated:** 2026-04-26 (renamed from `_proposed_` to `_partial_` after P1, P3, and the read-only slice of P2 landed) +**Last updated:** 2026-04-27 (P2 write paths + P4 operator doc landed; status table refreshed) -## Implementation status (as of 2026-04-26) +## Implementation status (as of 2026-04-27) | Phase | Status | Landed via | |---|---|---| | **P1** — `internal/admin/` skeleton, auth, DynamoDB list/create/describe/delete, AdminForward (Section 3.3 acceptance criteria 1–4 + 6; criterion 5 deferred — see outstanding items) | ✅ shipped | #634, #635, #644, #648 | -| **P2** — S3 bucket list/create/delete/ACL, DescribeTable | 🟡 partial — read-only slice 1 landed in #658; write paths (slice 2a, #669) and AdminForward integration (slice 2b, #673) are still in flight | +| **P2** — S3 bucket list/create/delete/ACL, DescribeTable | ✅ shipped | #658 (read-only slice 1) + #669 (writes, slice 2a) + #673 (AdminForward integration, slice 2b) | | **P3** — React SPA + embed | ✅ shipped | #649, #650 | -| **P4** — TLS, read-only role, CSRF, `docs/admin.md` | 🟡 mostly shipped — TLS / role / CSRF are live in P1, operator doc in #674 | +| **P4** — TLS, read-only role, CSRF, `docs/admin.md`, deployment runbook + `scripts/rolling-update.sh` admin support | ✅ shipped | TLS / role / CSRF live in P1; operator doc + runbook + script wiring in #674 / #669 / #678 | Outstanding open items (kept here so future readers know what is still owed against the original proposal): - **AdminForward acceptance criterion 5** — rolling-upgrade compatibility flag (`admin.leader_forward_v2`). Deferred behind a cluster-version bump; not blocking dashboard usability today because every node forwards through the same `pb.AdminOperation` enum. +- **AdminDeleteBucket TOCTOU** — coderabbitai 🔴 / 🟠 on PR #669 caught that AdminDeleteBucket scans `ObjectManifestPrefixForBucket` at readTS but the BucketMetaKey delete carries only that single point key in `ReadKeys`; a concurrent PutObject inserting a manifest key in the scanned prefix between readTS and commit will not conflict and the object is orphaned. Pre-existing race — `adapter/s3.go:deleteBucket` (the SigV4 path) inherits the same shape. Closing the gap requires either (a) bumping `BucketGenerationKey` on every PutObject so it can serve as an OCC token in this read set, or (b) extending `OperationGroup` with `ReadRanges` and teaching the FSM to validate range emptiness atomically with commit. Recorded as a code comment on `AdminDeleteBucket` and tracked here for the future fix; the operator-side workaround documented in `docs/admin_deployment.md` is to pause writes against the target bucket before issuing the admin delete. - **S3 object browser** — explicitly called out as "next phase" in Section 2 Non-goals; no work item yet. - **Operator-visible TLS cert reload** — out of scope; restart-to-rotate is the documented model in `docs/admin.md`. -When the rolling-upgrade flag lands, this doc is renamed `2026_04_24_implemented_admin_dashboard.md` per `docs/design/README.md`'s lifecycle convention. +When the rolling-upgrade flag and the TOCTOU are both addressed, this doc is renamed `2026_04_24_implemented_admin_dashboard.md` per `docs/design/README.md`'s lifecycle convention. --- From fb39e3c041cf3f8d911e500b134e284a1688a838 Mon Sep 17 00:00:00 2001 From: "Yoshiaki Ueda (bootjp)" Date: Mon, 27 Apr 2026 21:02:35 +0900 Subject: [PATCH 2/2] Update docs/design/2026_04_24_partial_admin_dashboard.md Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- docs/design/2026_04_24_partial_admin_dashboard.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/design/2026_04_24_partial_admin_dashboard.md b/docs/design/2026_04_24_partial_admin_dashboard.md index b160b0ba..7015810a 100644 --- a/docs/design/2026_04_24_partial_admin_dashboard.md +++ b/docs/design/2026_04_24_partial_admin_dashboard.md @@ -17,7 +17,7 @@ Outstanding open items (kept here so future readers know what is still owed against the original proposal): - **AdminForward acceptance criterion 5** — rolling-upgrade compatibility flag (`admin.leader_forward_v2`). Deferred behind a cluster-version bump; not blocking dashboard usability today because every node forwards through the same `pb.AdminOperation` enum. -- **AdminDeleteBucket TOCTOU** — coderabbitai 🔴 / 🟠 on PR #669 caught that AdminDeleteBucket scans `ObjectManifestPrefixForBucket` at readTS but the BucketMetaKey delete carries only that single point key in `ReadKeys`; a concurrent PutObject inserting a manifest key in the scanned prefix between readTS and commit will not conflict and the object is orphaned. Pre-existing race — `adapter/s3.go:deleteBucket` (the SigV4 path) inherits the same shape. Closing the gap requires either (a) bumping `BucketGenerationKey` on every PutObject so it can serve as an OCC token in this read set, or (b) extending `OperationGroup` with `ReadRanges` and teaching the FSM to validate range emptiness atomically with commit. Recorded as a code comment on `AdminDeleteBucket` and tracked here for the future fix; the operator-side workaround documented in `docs/admin_deployment.md` is to pause writes against the target bucket before issuing the admin delete. +- AdminDeleteBucket TOCTOU — A race condition exists where AdminDeleteBucket scans ObjectManifestPrefixForBucket at readTS, but the transaction only includes the BucketMetaKey in its read set. A concurrent PutObject inserting a manifest key in the scanned prefix between readTS and commitTS will not trigger a conflict, leading to orphaned objects. This pre-existing race is also present in the SigV4 path (adapter/s3.go:deleteBucket). Potential fixes include (a) using a bucket-level version key as an OCC token (noting the significant performance trade-off for write-heavy buckets), or (b) extending OperationGroup with ReadRanges for atomic range validation at commit time. This is tracked for a future fix; while the current operator-side workaround is to pause writes, the design should investigate mitigation strategies like a temporary proxy or bridge mode to avoid service interruption during this state. - **S3 object browser** — explicitly called out as "next phase" in Section 2 Non-goals; no work item yet. - **Operator-visible TLS cert reload** — out of scope; restart-to-rotate is the documented model in `docs/admin.md`.