
jlexer: reject raw control characters in string literals #441

Open
omkhar wants to merge 1 commit into mailru:master from omkhar:omkhar/reject-raw-control-chars
omkhar commented Jun 27, 2026

Problem

jlexer accepts raw, unescaped control characters (bytes 0x00-0x1F, including raw TAB, newline, and NUL) inside JSON string literals, in both string values and object keys. encoding/json rejects these per RFC 8259 §7 ("invalid character … in string literal").

This is a parser-differential / JSON-interoperability issue (same class as #375). A strict JSON validator placed in front of an easyjson consumer rejects a payload carrying an embedded raw newline or control byte, while easyjson accepts it and decodes the control byte into the Go string. That disagreement is a smuggling primitive (e.g. log/record injection, content-filter evasion).

Reproduce (current master): {"str":"<0x09>"} (raw tab) decodes with a nil error into "\t"; encoding/json rejects the same bytes.
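For reference, the encoding/json half of the comparison can be checked with the standard library alone. This is a minimal sketch: the `payload` type and the `decodeStd` helper are names made up for illustration, and the easyjson side is not shown because it needs generated code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical struct matching the {"str": ...} reproducer.
type payload struct {
	Str string `json:"str"`
}

// decodeStd unmarshals data with encoding/json and returns the decoded
// string together with any error.
func decodeStd(data []byte) (string, error) {
	var p payload
	err := json.Unmarshal(data, &p)
	return p.Str, err
}

func main() {
	// Interpreted Go string: "\t" here becomes a raw 0x09 byte inside the JSON literal.
	raw := []byte("{\"str\":\"\t\"}")
	// Raw Go string: the JSON text contains the two-character escape sequence \t.
	escaped := []byte(`{"str":"\t"}`)

	_, err := decodeStd(raw)
	fmt.Println("raw tab error:", err)

	s, err := decodeStd(escaped)
	fmt.Printf("escaped tab: %q, err=%v\n", s, err)
}
```

The raw-tab input returns a non-nil error from encoding/json, while the escaped form decodes to a Go string holding a single tab byte.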

Fix

Reject raw bytes < 0x20 that appear outside an escape sequence in the string scanner (findStringLen/fetchString). Escaped sequences (such as \t and \n) remain accepted and decode normally, matching encoding/json, which also accepts escaped control characters but rejects raw ones.

  • jlexer/lexer.go: +22/-5 (one helper + a third named return on the existing scanner; single pass, no extra allocation).
  • go test ./... passes; no existing fixture relied on raw-control-char acceptance.
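The shape of the check can be sketched in isolation. The code below is not the jlexer implementation or the diff in this PR: `scanStringLen` and both error values are hypothetical names, and the function only assumes a single-pass scanner that starts just after the opening quote and reports its result through three named returns.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values used only by this sketch.
var (
	errRawControl   = errors.New("raw control character in string literal")
	errUnterminated = errors.New("unterminated string literal")
)

// scanStringLen scans data, which must begin immediately after the opening
// quote, and returns the length of the string body up to the closing quote.
// It makes a single pass and allocates nothing.
func scanStringLen(data []byte) (n int, hasEscapes bool, err error) {
	for i := 0; i < len(data); i++ {
		c := data[i]
		switch {
		case c == '"':
			// Unescaped closing quote: the body is data[:i].
			return i, hasEscapes, nil
		case c == '\\':
			// Skip the byte after the backslash; validating the escape
			// itself is left to a later unescaping step.
			hasEscapes = true
			i++
		case c < 0x20:
			// Raw control byte outside an escape: reject.
			return i, hasEscapes, errRawControl
		}
	}
	return len(data), hasEscapes, errUnterminated
}

func main() {
	inputs := [][]byte{
		[]byte("plain\""),
		[]byte(`esc\taped"`), // escaped tab, two characters
		[]byte("raw\ttab\""), // raw 0x09 byte
	}
	for _, in := range inputs {
		n, esc, err := scanStringLen(in)
		fmt.Printf("%q -> n=%d hasEscapes=%v err=%v\n", in, n, esc, err)
	}
}
```

Because the control-byte test sits in the same loop that already looks for the closing quote and backslashes, the stricter behavior costs one extra comparison per byte and no second pass.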

Open question on rollout: this tightens default parsing. If preserving maximum leniency by default is preferred, the check could instead be gated behind an opt-in generator flag in the spirit of -disallow_unknown_fields and the proposed -disallow_duplicate_fields (#375). Happy to rework it that way — flagging the trade-off rather than assuming. See also #72, #309 for prior validation-strictness work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>