
feat(gdpr): author erasure (PR5 of #6701) #7550

Open
JohnMcLear wants to merge 8 commits into develop from feat-gdpr-author-erasure

@JohnMcLear
Member

Summary

  • New authorManager.anonymizeAuthor(authorID) zeroes the display identity on globalAuthor:<id> (keeps the record as an opaque stub so existing changeset references still resolve), deletes every token2author:* and mapper2author:* binding that points at the author, and nulls authorId on chat messages they posted. Pad content, revisions, and attribute pool are kept intact.
  • New REST endpoint POST /api/1.3.1/anonymizeAuthor?authorID=… — admin-auth via the existing apikey/JWT pipeline.
  • Idempotent: a second call for the same authorID returns zero counters.
  • doc/privacy.md explains exactly what the call does and does not do.
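A minimal sketch of how an operator script might assemble the call described above. The endpoint path, the `authorID` parameter, and the four counter names come from this PR; the helper names, the `baseUrl` argument, and the `Authorization: Bearer …` header are assumptions for illustration, so check the instance's actual auth configuration before relying on them.

```typescript
// Counters returned by the erasure call, as named in the PR's AuthorManager code.
interface AnonymizeResult {
  affectedPads: number;
  removedTokenMappings: number;
  removedExternalMappings: number;
  clearedChatMessages: number;
}

// Hypothetical description of the HTTP request; pass it to any HTTP client.
interface ApiRequest {
  method: 'POST';
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds the request for POST /api/1.3.1/anonymizeAuthor.
// The bearer-token header is an assumption about how the JWT is supplied.
function buildAnonymizeAuthorRequest(
    baseUrl: string, authorID: string, token: string): ApiRequest {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    method: 'POST',
    url: `${base}/api/1.3.1/anonymizeAuthor?authorID=${encodeURIComponent(authorID)}`,
    headers: {Authorization: `Bearer ${token}`},
  };
}

// True when every counter is zero, which is what a repeated (idempotent)
// call or a call for an unknown authorID is described as returning.
function isNoOp(result: AnonymizeResult): boolean {
  return result.affectedPads === 0 &&
      result.removedTokenMappings === 0 &&
      result.removedExternalMappings === 0 &&
      result.clearedChatMessages === 0;
}
```

A caller could treat `isNoOp(result)` on a first call as a hint that the authorID was unknown or already erased.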

Final PR of the #6701 GDPR work. PR1 #7546 (deletion controls), PR2 #7547 (IP/privacy audit), PR3 #7548 (HttpOnly author cookie), PR4 #7549 (privacy banner) complete the set.

Design: docs/superpowers/specs/2026-04-19-gdpr-pr5-author-erasure-design.md
Plan: docs/superpowers/plans/2026-04-19-gdpr-pr5-author-erasure.md

Test plan

  • ts-check
  • AuthorManager unit (identity zeroing / mapping removal / idempotence / unknown authorID) — 4 passing
  • REST integration (successful erasure + missing-authorID error) — 2 passing
  • api.ts regression — passes

@qodo-free-for-open-source-projects

Review Summary by Qodo

GDPR Art. 17 author anonymization with REST API endpoint

✨ Enhancement


Walkthroughs

Description
• Implement GDPR Art. 17 right-to-erasure via anonymizeAuthor(authorID) function
  - Zeroes display identity (name, colorId) on globalAuthor:<id> record
  - Deletes all token2author:* and mapper2author:* bindings pointing to author
  - Nulls authorId on chat messages posted by the author
  - Preserves pad content, revisions, and attribute pools for data integrity
• Add REST endpoint POST /api/1.3.1/anonymizeAuthor?authorID=… with admin JWT/apikey auth
• Implement idempotent erasure with zero-counter returns on subsequent calls
• Add comprehensive unit and integration tests covering identity zeroing, mapping removal,
  idempotence, and error handling
• Document erasure behavior and limitations in doc/privacy.md
Diagram
flowchart LR
  A["Author Request<br/>authorID"] -->|POST /api/1.3.1/anonymizeAuthor| B["API Handler"]
  B -->|validates authorID| C["AuthorManager.anonymizeAuthor"]
  C -->|1. Delete token/mapper bindings| D["token2author:*<br/>mapper2author:*"]
  C -->|2. Zero identity| E["globalAuthor:id<br/>name=null, colorId=0"]
  C -->|3. Null chat authorId| F["pad:id:chat:n<br/>authorId=null"]
  C -->|Returns counters| G["Response<br/>affectedPads, removedMappings"]
  D -.->|Removed| H["DB"]
  E -.->|Updated| H
  F -.->|Updated| H


File Changes

1. src/node/db/AuthorManager.ts ✨ Enhancement +87/-0

Add anonymizeAuthor function for GDPR erasure

• Added anonymizeAuthor(authorID) function implementing GDPR Art. 17 erasure
• Zeroes name and colorId on author record while preserving padIDs for maintenance
• Deletes all token2author:* and mapper2author:* bindings pointing to the author
• Iterates through author's pads and nulls authorId on chat messages they posted
• Returns counters for affected pads, removed token/external mappings, and cleared chat messages
• Implements idempotent behavior via erased: true flag to short-circuit on subsequent calls

src/node/db/AuthorManager.ts


2. src/node/db/API.ts ✨ Enhancement +14/-0

Expose anonymizeAuthor on programmatic API

• Exposed anonymizeAuthor as a programmatic API export
• Added validation to ensure authorID parameter is a non-empty string
• Throws CustomError with 'apierror' code if validation fails
• Delegates to authorManager.anonymizeAuthor() for core erasure logic

src/node/db/API.ts
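The validation-then-delegate shape described for API.ts could look roughly like the sketch below. `ApiError` is a local stand-in for the project's `CustomError` (whose constructor is not shown in this PR), and the error message text and the `erase` callback are assumptions; only the non-empty-string check and the `'apierror'` code are taken from the description.

```typescript
// Stand-in for the project's CustomError; the real class is not shown here.
class ApiError extends Error {
  readonly code: string;
  constructor(message: string, code: string) {
    super(message);
    this.code = code;
  }
}

// Rejects anything that is not a non-empty string, mirroring the described check.
// The message wording is hypothetical.
function assertAuthorID(authorID: unknown): string {
  if (typeof authorID !== 'string' || authorID === '') {
    throw new ApiError('authorID is required', 'apierror');
  }
  return authorID;
}

// Hypothetical wrapper: validate first, then hand off to the core erasure
// routine (passed in as `erase` so the sketch stays self-contained).
function anonymizeAuthorApi<T>(authorID: unknown, erase: (id: string) => T): T {
  return erase(assertAuthorID(authorID));
}
```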


3. src/node/handler/APIHandler.ts ✨ Enhancement +6/-1

Register anonymizeAuthor in API version 1.3.1

• Created new API version 1.3.1 extending 1.3.0 with anonymizeAuthor endpoint
• Added anonymizeAuthor: ['authorID'] parameter specification to version map
• Updated latestApiVersion from '1.3.0' to '1.3.1'

src/node/handler/APIHandler.ts


4. src/tests/backend/specs/anonymizeAuthor.ts 🧪 Tests +74/-0

Add AuthorManager.anonymizeAuthor unit tests

• Created unit test suite for AuthorManager.anonymizeAuthor() with 4 test cases
• Tests identity zeroing: verifies name=null, colorId=0, erased=true flags
• Tests mapping removal: confirms token2author:* and mapper2author:* bindings are deleted
• Tests idempotence: second call returns zero counters for all metrics
• Tests unknown authorID: returns zero counters without errors

src/tests/backend/specs/anonymizeAuthor.ts


5. src/tests/backend/specs/api/anonymizeAuthor.ts 🧪 Tests +51/-0

Add REST anonymizeAuthor integration tests

• Created REST integration test suite for anonymizeAuthor endpoint with 2 test cases
• Tests successful erasure: creates author, calls endpoint, verifies name becomes null
• Tests error handling: missing authorID parameter returns error code 1 with required message
• Uses JWT admin token authentication via common.generateJWTToken()
• Validates response structure includes affectedPads counter

src/tests/backend/specs/api/anonymizeAuthor.ts


6. doc/privacy.md 📝 Documentation +37/-0

Add privacy documentation with erasure section

• Created new privacy documentation file
• Added "Right to erasure (GDPR Art. 17)" section explaining anonymization approach
• Documented what the call does: zeroes identity, deletes bindings, nulls chat authorship
• Documented what it does not do: preserves pad content, revisions, attribute pools
• Provided curl example for operators to trigger erasure via REST API
• Noted idempotent behavior and reference to pad-deletion flow for complete erasure

doc/privacy.md


7. docs/superpowers/specs/2026-04-19-gdpr-pr5-author-erasure-design.md 📝 Documentation +222/-0

Add GDPR PR5 author erasure design specification

• Created comprehensive design specification for GDPR author erasure feature
• Documented audit summary of personal data links in database records
• Specified goals: anonymize identity, delete bindings, keep pad content intact
• Provided detailed pseudocode for anonymizeAuthor implementation
• Outlined REST API integration via version map and OpenAPI auto-generation
• Included testing strategy covering unit, integration, and chat regression scenarios
• Documented risk mitigation and migration considerations

docs/superpowers/specs/2026-04-19-gdpr-pr5-author-erasure-design.md


8. docs/superpowers/plans/2026-04-19-gdpr-pr5-author-erasure.md 📝 Documentation +510/-0

Add GDPR PR5 author erasure implementation plan

• Created detailed implementation plan with 6 sequential tasks
• Task 1: Implement anonymizeAuthor on AuthorManager with full pseudocode
• Task 2: Write unit tests covering identity zeroing, mapping removal, idempotence
• Task 3: Expose on REST API via API.ts and register version 1.3.1
• Task 4: Create REST integration tests with JWT authentication
• Task 5: Document erasure behavior in doc/privacy.md
• Task 6: Verification steps including type-check, full test sweep, and PR creation
• Included self-review checklist mapping spec sections to implementation tasks

docs/superpowers/plans/2026-04-19-gdpr-pr5-author-erasure.md



@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects bot commented Apr 19, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (1) 📎 Requirement gaps (0)



Action required

1. anonymizeAuthor lacks feature flag 📘 Rule violation ☼ Reliability
Description
The new anonymizeAuthor REST/API surface is registered unconditionally and becomes available by
default, without any enable/disable mechanism. This violates the requirement that new features be
gated behind a feature flag and disabled by default.
Code

src/node/handler/APIHandler.ts[R146-152]

+version['1.3.1'] = {
+  ...version['1.3.0'],
+  anonymizeAuthor: ['authorID'],
+};
+
// set the latest available API version here
-exports.latestApiVersion = '1.3.0';
+exports.latestApiVersion = '1.3.1';
Evidence
PR Compliance ID 5 requires new features to be behind a feature flag and disabled by default. The PR
registers anonymizeAuthor in the API version map and sets latestApiVersion to 1.3.1 without
any conditional gating, and also exports the new API function directly.

src/node/handler/APIHandler.ts[146-152]
src/node/db/API.ts[65-77]
Best Practice: Repository guidelines

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A new feature (`anonymizeAuthor` API/REST endpoint) is enabled by default and has no feature-flag gating.
## Issue Context
Compliance requires new features to be behind a feature flag and disabled by default.
## Fix Focus Areas
- src/node/handler/APIHandler.ts[146-152]
- src/node/db/API.ts[65-77]
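One way the gating could be approached is to register version 1.3.1 only when a setting is switched on. This is a sketch under assumptions: the flag name `enableAuthorErasure`, the `VersionMap` type, and both helper functions are hypothetical and not part of the PR; only the `anonymizeAuthor: ['authorID']` entry and the 1.3.0 to 1.3.1 extension are taken from the diff.

```typescript
// Maps an API version to its functions and their parameter names.
type VersionMap = Record<string, Record<string, string[]>>;

// Hypothetical builder: adds 1.3.1 (with anonymizeAuthor) only when the
// flag is enabled, so the endpoint stays unavailable by default.
function buildVersionMap(base: VersionMap, enableAuthorErasure: boolean): VersionMap {
  const version: VersionMap = {...base};
  if (enableAuthorErasure) {
    version['1.3.1'] = {
      ...version['1.3.0'],
      anonymizeAuthor: ['authorID'],
    };
  }
  return version;
}

// Reports 1.3.1 as latest only when it was actually registered.
function latestApiVersion(version: VersionMap): string {
  return '1.3.1' in version ? '1.3.1' : '1.3.0';
}
```

An alternative would be to always register 1.3.1 and reject the call at request time when the flag is off; which of the two fits the repository's feature-flag convention is not stated in this thread.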



2. Non-resumable partial erasure 🐞 Bug ☼ Reliability
Description
AuthorManager.anonymizeAuthor() persists erased: true before the chat-scrub loop, so any error
during chat scrubbing can leave chat messages unchanged while subsequent calls short-circuit on
existing.erased and never finish the scrub. This contradicts the documented behavior that chat
message authorId is nulled, and makes failures non-recoverable without manual DB intervention.
Code

src/node/db/AuthorManager.ts[R336-395]

+  const existing = await db.get(`globalAuthor:${authorID}`);
+  if (existing == null || existing.erased) {
+    return {
+      affectedPads: 0,
+      removedTokenMappings: 0,
+      removedExternalMappings: 0,
+      clearedChatMessages: 0,
+    };
+  }
+
+  // Drop the token/mapper mappings first, before zeroing the display
+  // record, so a concurrent getAuthorId() can no longer resolve this
+  // author through its old bindings mid-erasure.
+  let removedTokenMappings = 0;
+  const tokenKeys: string[] = await db.findKeys('token2author:*', null);
+  for (const key of tokenKeys) {
+    if (await db.get(key) === authorID) {
+      await db.remove(key);
+      removedTokenMappings++;
+    }
+  }
+  let removedExternalMappings = 0;
+  const mapperKeys: string[] = await db.findKeys('mapper2author:*', null);
+  for (const key of mapperKeys) {
+    if (await db.get(key) === authorID) {
+      await db.remove(key);
+      removedExternalMappings++;
+    }
+  }
+
+  // Zero the display identity. Keep `padIDs` so future maintenance (or a
+  // pad-delete batch) can still find the set of pads this authorID touched.
+  await db.set(`globalAuthor:${authorID}`, {
+    colorId: 0,
+    name: null,
+    timestamp: Date.now(),
+    padIDs: existing.padIDs || {},
+    erased: true,
+    erasedAt: new Date().toISOString(),
+  });
+
+  // Null authorship on chat messages the author posted.
+  const padIDs = Object.keys(existing.padIDs || {});
+  let clearedChatMessages = 0;
+  for (const padID of padIDs) {
+    if (!await padManager.doesPadExist(padID)) continue;
+    const pad = await padManager.getPad(padID);
+    const chatHead = pad.chatHead;
+    if (typeof chatHead !== 'number' || chatHead < 0) continue;
+    for (let i = 0; i <= chatHead; i++) {
+      const chatKey = `pad:${padID}:chat:${i}`;
+      const msg = await db.get(chatKey);
+      if (msg != null && msg.authorId === authorID) {
+        msg.authorId = null;
+        await db.set(chatKey, msg);
+        clearedChatMessages++;
+      }
+    }
+  }
+
Evidence
The function returns immediately if existing.erased is set, but it sets erased: true on the
global author record *before* iterating pads and rewriting chat messages. If an exception occurs
anywhere after the author record update (DB error, pad load error, etc.), retries will short-circuit
and skip the chat scrub step, leaving chat messages with the original authorId despite docs
stating they are nulled.

src/node/db/AuthorManager.ts[336-395]
doc/privacy.md[18-27]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`anonymizeAuthor()` marks the author record as `erased: true` before finishing the chat scrub. If any error occurs during the chat loop, retries will short-circuit on `existing.erased` and never finish nulling `authorId` on chat messages.
### Issue Context
- Current behavior uses `existing.erased` as the idempotency guard.
- Docs state chat message `authorId` is nulled.
- The implementation should either (a) only mark `erased: true` once all steps have completed, or (b) track per-step completion so retries can resume unfinished work.
### Fix Focus Areas
- src/node/db/AuthorManager.ts[336-395]
### Suggested implementation direction
- Introduce an intermediate state (e.g., `erasureInProgress: true`) and set it before starting work.
- Perform token/mapper cleanup + chat scrub.
- Only after successful completion, update the author record to `{erased: true, erasureInProgress: false}`.
- Alternatively: keep `erased: true` but add a separate flag (e.g., `chatScrubbed: true`) and only short-circuit when both are complete; otherwise resume the missing steps.
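The "mark `erased` only after every step has finished" direction can be illustrated with a small in-memory model. This is a sketch, not the PR's code: the store is a plain object and synchronous (the real database calls are asynchronous), the `pad:<id>` record holding `chatHead` is a simplification of the pad object, the token/mapper cleanup is left out, and all function and type names are hypothetical.

```typescript
type Store = Record<string, unknown>;

interface AuthorRecord {
  name: string | null;
  colorId: number | string;
  padIDs: Record<string, number>;
  erased?: boolean;
}

interface ChatMessage { authorId: string | null; text: string; }

// Nulls authorId on this author's chat messages in one pad; returns the count.
function scrubChat(db: Store, padID: string, authorID: string): number {
  const pad = db[`pad:${padID}`] as {chatHead: number} | undefined;
  if (pad == null || pad.chatHead < 0) return 0;
  let cleared = 0;
  for (let i = 0; i <= pad.chatHead; i++) {
    const chatKey = `pad:${padID}:chat:${i}`;
    const msg = db[chatKey] as ChatMessage | undefined;
    if (msg != null && msg.authorId === authorID) {
      db[chatKey] = {...msg, authorId: null};
      cleared++;
    }
  }
  return cleared;
}

// `scrub` is injectable so a failure can be simulated.
function anonymizeAuthorResumable(
    db: Store, authorID: string,
    scrub: (db: Store, padID: string, authorID: string) => number): number {
  const key = `globalAuthor:${authorID}`;
  const existing = db[key] as AuthorRecord | undefined;
  if (existing == null || existing.erased) return 0;

  // Step 1: hide the display identity, but do not stamp `erased` yet.
  db[key] = {...existing, name: null, colorId: 0};

  // Step 2: scrub chat. If this throws, `erased` is still unset, so a
  // retry passes the guard above and runs the sweep again.
  let cleared = 0;
  for (const padID of Object.keys(existing.padIDs)) {
    cleared += scrub(db, padID, authorID);
  }

  // Step 3: only now record completion.
  db[key] = {...existing, name: null, colorId: 0, erased: true};
  return cleared;
}
```

Because the scrub only rewrites messages whose `authorId` still matches, re-running it after a partial pass is harmless; messages already nulled are skipped.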




Remediation recommended

3. O(N) mapping key scans 🐞 Bug ➹ Performance
Description
anonymizeAuthor() deletes mappings by scanning all token2author:* and mapper2author:* keys and
issuing a db.get() for each, which is O(N) over the entire keyspace and can be extremely slow on
large instances. This can cause long-running requests/timeouts during GDPR erasure operations.
Code

src/node/db/AuthorManager.ts[R349-364]

+  let removedTokenMappings = 0;
+  const tokenKeys: string[] = await db.findKeys('token2author:*', null);
+  for (const key of tokenKeys) {
+    if (await db.get(key) === authorID) {
+      await db.remove(key);
+      removedTokenMappings++;
+    }
+  }
+  let removedExternalMappings = 0;
+  const mapperKeys: string[] = await db.findKeys('mapper2author:*', null);
+  for (const key of mapperKeys) {
+    if (await db.get(key) === authorID) {
+      await db.remove(key);
+      removedExternalMappings++;
+    }
+  }
Evidence
Mappings are created via mapAuthorWithDBKey() as individual token2author: / mapper2author:
records, with no reverse index from authorID to tokens/mappers. As a result, anonymization must
enumerate all keys (findKeys('token2author:*'), findKeys('mapper2author:*')) and check each
value to find those that point to the target authorID.

src/node/db/AuthorManager.ts[117-137]
src/node/db/AuthorManager.ts[349-364]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`anonymizeAuthor()` currently performs full keyspace scans over `token2author:*` and `mapper2author:*` and then does per-key `db.get()` checks. On large databases this is slow and can lead to request timeouts.
### Issue Context
Mappings are created one-way (`token2author:<token> -> authorID`, `mapper2author:<mapper> -> authorID`), so there is no efficient way to enumerate mappings for a single author.
### Fix Focus Areas
- src/node/db/AuthorManager.ts[117-137]
- src/node/db/AuthorManager.ts[349-364]
### Suggested implementation direction
- Maintain reverse indexes when creating mappings (e.g., `author2tokens:<authorID>` and `author2mappers:<authorID>` as sets/lists).
- On anonymization, read those reverse-index keys and delete only the relevant `token2author:*` / `mapper2author:*` entries.
- Optionally: keep the scan as a fallback for existing data (migration-free), but prefer the reverse index when present.
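The reverse-index idea could be sketched as follows for the token side (the mapper side would be symmetric). The `author2tokens:<authorID>` key name is the one suggested above; the plain-object store, the array representation of the index, and both function names are assumptions for illustration. The scan fallback for mappings created before the index existed is not shown.

```typescript
type Store = Record<string, unknown>;

// Hypothetical write path: records the forward mapping and appends the
// token to the author's reverse index.
function mapTokenToAuthor(db: Store, token: string, authorID: string): void {
  db[`token2author:${token}`] = authorID;
  const indexKey = `author2tokens:${authorID}`;
  const tokens = (db[indexKey] as string[] | undefined) ?? [];
  if (tokens.indexOf(token) === -1) db[indexKey] = [...tokens, token];
}

// Hypothetical erase path: touches only the keys listed in the reverse
// index instead of scanning every token2author:* key.
function removeTokenMappings(db: Store, authorID: string): number {
  const indexKey = `author2tokens:${authorID}`;
  const tokens = (db[indexKey] as string[] | undefined) ?? [];
  let removed = 0;
  for (const token of tokens) {
    const key = `token2author:${token}`;
    // Re-check the value so a token since re-bound to another author is kept.
    if (db[key] === authorID) {
      delete db[key];
      removed++;
    }
  }
  delete db[indexKey];
  return removed;
}
```

With this layout the cost of erasure scales with the number of tokens bound to the one author rather than with the total number of mappings on the instance.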





Comment thread src/node/db/AuthorManager.ts
Qodo review: the `erased: true` sentinel was written before the chat
scrub loop, so a throw during scrub left chat messages untouched
while subsequent calls short-circuited on `existing.erased` and never
finished. Split the write: zero the display identity first (still
hides the name), run the chat scrub, and only then stamp
`erased: true` so a retry resumes the sweep. Regression test
covers the partial-run → retry path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>