feat(reactotron-mcp): expand redaction defaults and add form-urlencoded body support #1608
Merged
…ed body support

Bring the MCP redactor closer to the industry consensus denylist used by Sentry, Bugsnag, Postman, Cloudflare, and google/har-sanitizer. Research comparing Charles, Wireshark, Postman, mitmproxy, Proxyman, Chrome DevTools, Sentry, and Datadog is summarized in the PR description.

Default rules:
- Add CSRF/XSRF and IP-PII header names (x-csrf-token, x-xsrf-token, csrf-token, x-forwarded-for, x-real-ip).
- Add common auth/session key variants (token, bearer, jwt, id_token, session, sessionid, csrf, xsrf, passwd, pwd, client_secret).
- Add value patterns for Anthropic keys, AWS access key IDs, Google API keys, Stripe live/test/restricted keys, and PEM private key blocks.
- Broaden GitHub PAT regex from ghp_ only to gh[pousr]_ (classic, server, OAuth, user-to-server, refresh).

Form-urlencoded bodies:
- Strings shaped like `k=v&k=v` (no URL prefix) now get the same per-field redaction as URL query parameters. A strict full-match regex prevents false positives on prose containing `=`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
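The form-urlencoded handling described in the commit message could look roughly like the sketch below. It is a minimal illustration, not the code merged in this PR: `redactFormBody`, `FORM_BODY`, the short `SENSITIVE_KEYS` subset, and the `[REDACTED]` marker are all hypothetical names chosen for the example.

```typescript
// Hypothetical subset of sensitive keys; the PR's real default list is longer.
const SENSITIVE_KEYS = new Set(["password", "token", "client_secret", "session"]);
// Assumed marker text; the actual redaction marker may differ.
const REDACTED = "[REDACTED]";

// Full-match pattern: the whole string must be key=value pairs joined by "&".
// Keys allow only a narrow character set, so URLs ("https://...") and prose
// containing spaces around "=" do not match.
const FORM_BODY = /^[A-Za-z0-9_.\-\[\]]+=[^&\s]*(?:&[A-Za-z0-9_.\-\[\]]+=[^&\s]*)*$/;

function redactFormBody(input: string): string {
  // Anything that is not shaped like a form body is returned untouched.
  if (!FORM_BODY.test(input)) return input;
  return input
    .split("&")
    .map((pair) => {
      const eq = pair.indexOf("=");
      const key = pair.slice(0, eq);
      // Exact, case-insensitive key match; only the value is replaced.
      return SENSITIVE_KEYS.has(key.toLowerCase()) ? `${key}=${REDACTED}` : pair;
    })
    .join("&");
}
```

Anchoring the regex at both ends is what keeps ordinary sentences and full URLs out of this path; a URL would instead go through the existing query-parameter redaction.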
joshuayoes added a commit that referenced this pull request on May 4, 2026
…ed body support (#1608)

## Summary

Stacks on top of #1607. Expands the MCP redactor's default denylists to match the cross-tool industry consensus and adds per-field redaction for `application/x-www-form-urlencoded` request bodies.

Research comparing how other developer tools handle this is below. The short version: the closest analogs (Proxyman MCP, Sentry MCP, GitHub MCP, Postman) all redact at the server boundary by default, and their built-in denylists are broader than what #1607 currently ships.

## Changes

### Default rules: additions

**Header names**
- CSRF / XSRF variants: `x-csrf-token`, `x-xsrf-token`, `csrf-token`
- IP-forwarding PII headers: `x-forwarded-for`, `x-real-ip`

**Sensitive keys**
- Password aliases: `passwd`, `pwd`
- Generic auth-token names: `token`, `bearer`, `jwt`, `id_token`, `idtoken`
- Session & CSRF: `session`, `sessionid`, `session_id`, `csrf`, `xsrf`, `csrf_token`, `xsrf_token`
- OAuth: `client_secret`, `clientsecret`, `x-api-key`

**Value patterns**
- Anthropic API keys (`sk-ant-…`)
- AWS access key IDs (`AKIA…`)
- Google API keys (`AIza…` + 35 chars)
- Stripe secret/publishable/restricted keys, live + test (`(?:sk|pk|rk)_(?:test|live)_…`)
- PEM-encoded private key blocks (RSA, EC, DSA, OPENSSH, PGP, generic)
- GitHub PAT regex broadened from `ghp_` only to `gh[pousr]_`, covering classic, server-to-server, OAuth, user-to-server, and refresh tokens

### Form-urlencoded body redaction

A new code path catches strings shaped like `k=v&k=v` with no URL prefix (typical `application/x-www-form-urlencoded` POST bodies). If any key matches `sensitiveKeys`, just that value is redacted, the same semantics already used for URL query params. A strict full-match regex prevents false positives on prose that happens to contain `=`.

### Tests

105 tests passing. New coverage:
- Each category of new default rule
- Each new value pattern, with test literals constructed at runtime so GitHub secret-scanning doesn't flag the test file
- Form-urlencoded body redaction, including negative tests for casual strings and URL-containing strings

### Docs

`docs/mcp.md` updated to reflect the expanded default list and call out form-body handling.

---

## Research: how other tools handle this

We spawned parallel research on how similar developer tools handle sensitive-data redaction. Full notes are kept in the PR discussion; the convergent findings:

### 1. Redact at the server/MCP boundary (unanimous)

Every closest analog does it at the MCP serialization layer, not in the UI and not in the model:
- **Proxyman MCP**: *"Sensitive data (auth tokens, passwords, API keys) is automatically redacted in responses"* ([docs](https://docs.proxyman.com/mcp))
- **Sentry MCP**: inherits Sentry's server-side scrubber
- **GitHub MCP**: scans inputs for secrets and blocks by default ([changelog](https://github.blog/changelog/2025-08-13-github-mcp-server-secret-scanning-push-protection-and-more/))
- **Postman Repro**: case-insensitive default-key redaction
- **mitmproxy `FilteredDumper` pattern**: redact at display/egress, not on the wire

**OWASP MCP Top 10, MCP01:2025** explicitly mandates: *"redact or sanitize inputs and outputs before logging… redact or mask secrets before writing to logs or telemetry."* ([link](https://owasp.org/www-project-mcp-top-10/2025/MCP01-2025-Token-Mismanagement-and-Secret-Exposure))

### 2. No `sensitive` / `secretHint` annotation exists in the MCP spec today

The 2025-03-26 spec adds `readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`, but the maintainers are explicit: *"clients MUST NOT rely solely on these for security decisions."* ([MCP blog](https://blog.modelcontextprotocol.io/posts/2026-03-16-tool-annotations/)) Treat server-side redaction as the hard boundary; don't wait for an annotation.

### 3. The de-facto default denylist

Union across **Sentry**, **Bugsnag**, **google/har-sanitizer**, **Postman**, **Chrome DevTools sanitized HAR**, **Presidio**:
- Headers: `Authorization`, `Cookie`, `Set-Cookie`, `Proxy-Authorization`, `X-Api-Key`, `X-CSRF-Token`, `X-XSRF-Token`, `X-Forwarded-For`
- Keys: `password`/`passwd`/`pwd`, `secret`, `token`, `bearer`, `jwt`, `auth`, `authorization`, `api_key`/`apikey`, `credentials`, `session`/`sessionid`, `csrf`/`xsrf`, `access_token`, `refresh_token`, `id_token`, `client_secret`, `private_key`
- Value patterns: AWS (`AKIA…`), Google (`AIza…`), JWT (`eyJ…`), Stripe, GitHub PATs (all prefixes), PEM private key blocks, Anthropic (`sk-ant-…`)

This PR brings our defaults in line with that union.

### 4. Tool-by-tool highlights

| Tool | Redaction approach | What we took / avoided |
|---|---|---|
| **Charles Proxy** | None built-in; user-written Rewrite rules only | Avoid its "bring your own regex" UX; ship opinionated defaults |
| **Wireshark** | `editcap` + third-party TraceWrangler; fail-closed pattern | Noted `strictMode` allowlist as future work |
| **Postman** | "Secret" variable type masks UI only; still exfiltrated in analytics URLs (a cautionary tale) | Redact the fully-rendered payload at MCP boundary, not at display |
| **mitmproxy / Proxyman** | `modify_headers`, Python addons; Proxyman MCP auto-redacts but rules are opaque/non-tunable | Keep user-tunable config; don't ship an opaque rule set |
| **Chrome DevTools** | `Export HAR (sanitized)` strips `Authorization`, `Cookie`, `Set-Cookie` only (Chrome 130, Oct 2024) | That's the floor. We already go beyond. |
| **google/har-sanitizer** | Public [wordlist](https://github.com/google/har-sanitizer/blob/master/harsanitizer/static/wordlist.json): `state`, `token`, `access_token`, `client_secret`, `SAMLRequest`, etc. | Directly informed our expanded default key list |
| **Cloudflare HAR sanitizer** | Conditional, not denylist; strips JWT signature but keeps claims for debugging | Filed as a future enhancement (partial/format-preserving redaction) |
| **Sentry / Bugsnag / Datadog / LogRocket** | Opinionated server-side defaults + user-extendable via `beforeSend`-style hook; Datadog offers partial redaction & Luhn-validated card detection | Union of their default lists → our new defaults. Partial redaction & Luhn are follow-ups. |

### Key canonical incident

**Okta support breach (Oct 2023)**: an attacker accessed support-case files belonging to 134 customers; HAR files among them contained live session tokens that were used to hijack sessions at BeyondTrust, Cloudflare, and 1Password. The PR's default-on posture is the right response to this class of leak.

---

## What is intentionally NOT in this PR

Tracked as follow-ups so the review stays focused:

- **Substring matching on keys.** Sentry JS and Bugsnag match substrings; that catches `sessionToken`/`userPassword` automatically but false-positives on `author`/`authored_by` when `auth` is in the list. Would need a separate denylist/pattern split.
- **Typed redaction markers** (`[REDACTED:jwt]`) and a `_redacted` summary sibling field. Useful for LLM reasoning and defensive-sandwich logging but changes the public output shape.
- **Luhn-validated credit-card detection.** A bare 13–19 digit regex produces too many false positives on random IDs and unix timestamps; needs Luhn to be safe.
- **Cookie-value parsing within the `Cookie` header.** Currently the whole header is blunt-redacted. Cloudflare's per-cookie approach (keep names, redact values) would preserve more debug info.
- **Partial / format-preserving masking** (keep last 4 of card, keep JWT claims but strip signature). The strongest idea from Cloudflare/Datadog, worth a dedicated PR.
- **`strictMode` allowlist** (à la TraceWrangler's "drop unknown layers" / mitmproxy's `FilteredDumper`): only forward known-safe headers, redact the rest.

## Test plan

- [x] `yarn test` in `lib/reactotron-mcp`: 105 tests pass
- [x] `yarn typecheck` clean
- [x] `yarn build` succeeds
- [ ] Reviewer sanity-check: no new default key is an obvious false-positive trigger for any team's app-specific field names
- [ ] Reviewer sanity-check: form-encoded regex doesn't false-positive on real-world payloads in your apps

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
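
For reviewers who want a concrete picture of how the two default rule types in this description interact (exact key matches plus value patterns), here is a minimal sketch. It is not the redactor shipped in this PR: `redact`, the short `SENSITIVE_KEYS` subset, the `[REDACTED]` marker, and the token-tail lengths in `VALUE_PATTERNS` are assumptions made for illustration, with tails based on publicly documented token shapes rather than the PR's own regexes.

```typescript
// Illustrative subset only; the PR's default key list is longer.
const SENSITIVE_KEYS = new Set(["password", "pwd", "token", "jwt", "client_secret", "session", "auth"]);
// Assumed marker text; the actual redaction marker may differ.
const REDACTED = "[REDACTED]";

// ASSUMPTION: the lengths after each prefix are guesses from public token
// formats, not the patterns used by the PR.
const VALUE_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/g,           // AWS access key ID
  /AIza[0-9A-Za-z_-]{35}/g,      // Google API key ("AIza" + 35 chars)
  /gh[pousr]_[A-Za-z0-9]{36,}/g, // GitHub token prefixes ghp_/gho_/ghu_/ghs_/ghr_
];

function redact(value: unknown, key?: string): unknown {
  // Exact, case-insensitive key match: "auth" is redacted, "author" is not.
  if (key !== undefined && SENSITIVE_KEYS.has(key.toLowerCase())) return REDACTED;
  if (typeof value === "string") {
    // Replace any embedded secret-shaped substring, keep the surrounding text.
    return VALUE_PATTERNS.reduce((text, re) => text.replace(re, REDACTED), value);
  }
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = redact(v, k);
    }
    return out;
  }
  return value;
}
```

Because keys are compared exactly, a field such as `author` passes through even with `auth` in the list, which is the trade-off the "Substring matching on keys" follow-up describes. Test literals for secret-shaped values can be assembled at runtime (for example `"AKIA" + "A".repeat(16)`), in the same spirit as the PR's own tests.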