[rust-compiler] Store unsupported AST nodes typed instead of via serde_json#37

Merged: Boshen merged 3 commits into main from rust-compiler/typed-original-node on Jun 17, 2026
@Boshen Boshen commented Jun 17, 2026

Problem

cargo llvm-lines shows the Rust compiler's generated code is dominated by serde's serde::private::de::content::* machinery — the buffering (de)serializer for the #[serde(tag = "type")] + #[serde(flatten)] Babel-AST enums. In the oxc use case (no Babel/JSON boundary), that serde is reached entirely through one internal round-trip: UnsupportedNode.original_node, stored as serde_json::Value, is serialized in lowering (to_value) and re-parsed in codegen (from_value) to re-emit nodes the compiler does not model.

Change

Replace original_node: Option<serde_json::Value> with a typed enum:

pub enum OriginalNode {
    Expression(Box<Expression>),
    Statement(Box<Statement>),
    Pattern(Box<PatternLike>),
}
  • Lowering stores the typed node directly (no to_value).
  • Codegen matches the variant (no from_value). PatternLike::as_expression reproduces the exact from_value::<Expression> coercion for the 7 variants PatternLike shares with Expression.
  • Adds a hir → ast dependency. The field was opaque JSON specifically to keep HIR decoupled from the AST crate; the coupling is already semantic (the field holds a Babel node), and a generic InstructionValue<N> would be far more churn.
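The shape of the change can be sketched with toy stand-ins for the AST types. Everything below other than the `OriginalNode` enum itself is an assumption made for illustration: the tiny `Expression`, `Statement`, and `PatternLike` enums, the `as_expression` signature, and the helper names `lower_unsupported_expression` and `restore_expression` are hypothetical and do not mirror the real crates.

```rust
// Toy stand-ins for the Babel-AST types; the real definitions are far larger.
#[derive(Debug, Clone, PartialEq)]
pub enum Expression {
    Identifier(String),
    NumericLiteral(f64),
    AssignmentPattern { left: Box<PatternLike>, right: Box<Expression> },
}

#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    Debugger,
    Expression(Expression),
}

// Only variants that also exist on Expression are modeled in this toy.
#[derive(Debug, Clone, PartialEq)]
pub enum PatternLike {
    Identifier(String),
    AssignmentPattern { left: Box<PatternLike>, right: Box<Expression> },
}

impl PatternLike {
    // Hypothetical signature: maps a shared variant onto its Expression twin.
    pub fn as_expression(&self) -> Expression {
        match self {
            PatternLike::Identifier(name) => Expression::Identifier(name.clone()),
            PatternLike::AssignmentPattern { left, right } => Expression::AssignmentPattern {
                left: left.clone(),
                right: right.clone(),
            },
        }
    }
}

// The enum from the PR description.
#[derive(Debug, Clone, PartialEq)]
pub enum OriginalNode {
    Expression(Box<Expression>),
    Statement(Box<Statement>),
    Pattern(Box<PatternLike>),
}

pub struct UnsupportedNode {
    pub original_node: Option<OriginalNode>,
}

// Lowering side: stash the typed node directly, with no serialization step.
pub fn lower_unsupported_expression(expr: Expression) -> UnsupportedNode {
    UnsupportedNode { original_node: Some(OriginalNode::Expression(Box::new(expr))) }
}

// Codegen side: dispatch on the variant instead of re-parsing a JSON value.
pub fn restore_expression(node: &UnsupportedNode) -> Option<Expression> {
    match node.original_node.as_ref()? {
        OriginalNode::Expression(expr) => Some((**expr).clone()),
        OriginalNode::Pattern(pattern) => Some(pattern.as_expression()),
        OriginalNode::Statement(_) => None,
    }
}
```

In this sketch the variant carries the information that a `type` tag would otherwise carry, so no deserializer needs to be instantiated for the AST enums on the codegen path.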

Impact (cargo llvm-lines, dev profile)

| crate | total (before → after) | serde (before → after) |
| --- | --- | --- |
| react_compiler_reactive_scopes | 470,255 → 181,070 | 248,888 (53%) → 22 (0.0%) |
| react_compiler_lowering | 182,110 → 164,142 | 38,936 (21%) → 38,900 (24%) |

The entire deserialize-side bloat is removed. Lowering's serialize side is unchanged — that is the identifier_loc_index whole-body serialization, a separate follow-up (it carries behavior risk, so it is not bundled here).

Behavior preservation

No behavior change. The old code round-tripped typed → Value → typed, an identity for well-formed nodes, and every original_node is built from a typed node — so the variant already encodes what the type-tag dispatch resolved to. Covered by the rewritten codegen unit tests, the new PatternLike::as_expression tests, and the unknown_statement_lowering integration test.
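The identity argument can be illustrated with a toy model. This sketch does not use serde: `Tagged` is a hypothetical stand-in for a JSON value that carries a `type` tag, and `Node`, `to_tagged`, and `from_tagged` are invented names that only mimic the shape of the old round-trip.

```rust
// Toy typed node; stands in for the real AST enums.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Identifier { name: String },
    DebuggerStatement,
}

// Hypothetical stand-in for an untyped value with a "type" tag and string fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Tagged {
    pub type_tag: &'static str,
    pub fields: Vec<(&'static str, String)>,
}

// Typed -> untyped: the tag is derived from the variant.
pub fn to_tagged(node: &Node) -> Tagged {
    match node {
        Node::Identifier { name } => Tagged {
            type_tag: "Identifier",
            fields: vec![("name", name.clone())],
        },
        Node::DebuggerStatement => Tagged {
            type_tag: "DebuggerStatement",
            fields: Vec::new(),
        },
    }
}

// Untyped -> typed: dispatch on the tag, then read the fields back.
pub fn from_tagged(value: &Tagged) -> Option<Node> {
    match value.type_tag {
        "Identifier" => {
            let name = value.fields.iter().find(|(key, _)| *key == "name")?.1.clone();
            Some(Node::Identifier { name })
        }
        "DebuggerStatement" => Some(Node::DebuggerStatement),
        _ => None,
    }
}
```

Because every tagged value in this model is produced from a typed node, `from_tagged(&to_tagged(&n))` returns `Some(n)` again. The tag dispatch can only recover the variant that was there to begin with, which is the property the typed field relies on.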

Verification

  • cargo build --workspace, cargo clippy, and all Rust tests for the touched crates pass.
  • The corpus-wide e2e fixture comparison (yarn snap --rust) was not run locally: the vendored harness assumes the upstream compiler/ layout and react_compiler_* package names, and the snap runner hits a tsWatch TDZ under Node 24. Please let CI run the full gate.

🤖 Generated with Claude Code

…e_json

Replace UnsupportedNode.original_node (Option<serde_json::Value>) with a typed OriginalNode enum so lowering stashes and codegen restores the node without a serde round-trip. This removes the Babel-AST deserializer monomorphization from the binary: react_compiler_reactive_scopes serde codegen drops from ~249K LLVM-IR lines (53% of the crate) to ~0, with no behavior change.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 966ee9c9d3


Comment thread react-compiler/crates/react_compiler_ast/src/patterns.rs Outdated
Boshen commented Jun 17, 2026


Binary size (shipped release index.node — fat LTO + 1 CGU + strip)

Measured the actual stripped release artifact on main vs this branch:

| build | bytes | size |
| --- | --- | --- |
| main | 6,090,416 | 5.81 MiB |
| this PR | 5,677,632 | 5.41 MiB |
| Δ | −412,784 | −6.78% |

Baseline reproduced byte-for-byte across a clean rebuild, so the delta is solid.
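For anyone re-checking the figures, the arithmetic is a byte difference, a percentage relative to the baseline, and a bytes-to-MiB conversion. A small sketch follows; the helper names `size_delta` and `mib` are made up for this example.

```rust
// Returns (after - before) in bytes and the same delta as a percentage of `before`.
pub fn size_delta(before: u64, after: u64) -> (i64, f64) {
    let delta = after as i64 - before as i64;
    let percent = delta as f64 / before as f64 * 100.0;
    (delta, percent)
}

// Converts a byte count to MiB (1 MiB = 1024 * 1024 bytes).
pub fn mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}
```

Calling `size_delta(6_090_416, 5_677_632)` gives a delta of −412,784 bytes, roughly −6.78% of the baseline, and `mib` rounds the two artifact sizes to 5.81 and 5.41.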

This is the deserialize-side reduction alone. The serialize side (identifier_loc_index's whole-body to_value, ~38.9K IR lines) is still in the binary, so the follow-up that removes it would cut further.

Boshen added 2 commits June 17, 2026 22:38
AssignmentPattern is shared between PatternLike and Expression (the latter carries it for error-recovery positions), so as_expression must convert it rather than fall back to a placeholder — restoring the prior from_value::<Expression> behavior for that tag. Caught in review.
…s sync

Captures the typed OriginalNode change (Rust source under react-compiler/) as a git-apply patch in patches/, which `just patch` re-applies over the freshly-synced upstream tree. Verified it applies cleanly onto main (upstream + codemod + 0001).
@Boshen Boshen merged commit 528a01d into main Jun 17, 2026
@Boshen Boshen deleted the rust-compiler/typed-original-node branch June 17, 2026 14:51