Skip to content

feat(stark): field-generic constraint IR + single-body ConstraintBuilder framework#763

Closed
MauroToscano wants to merge 6 commits into
mainfrom
feat/single-source-constraints-framework
Closed

feat(stark): field-generic constraint IR + single-body ConstraintBuilder framework#763
MauroToscano wants to merge 6 commits into
mainfrom
feat/single-source-constraints-framework

Conversation

@MauroToscano

Copy link
Copy Markdown
Contributor

What

The framework half of the single-source constraints redesign: every transition constraint gets one definition that serves three consumers — compiled prover evaluation, compiled verifier evaluation (identical code path for the recursion guest), and a captured flat IR for the upcoming GPU constraint interpreter.

This PR is inert: nothing here is wired into production paths, and prover/verifier behavior is byte-identical to main. It adds:

  • crypto/stark/src/constraint_ir/ — a field-generic constraint IR (ConstraintProgram<F, E>: flat op-DAG with side-table constants), its builder (hash-consing CSE), and a CPU interpreter. This is the artifact the GPU kernel will consume (data residency: evaluate constraints on-device instead of shipping the LDE back to the CPU).
  • crypto/stark/src/constraints/builder.rs — the ConstraintBuilder trait (single-body authoring surface: leaves + operator-overloaded expressions + indexed emit), ConstraintSet, ConstraintMeta, and its three implementations: ProverEvalFolder (compiled per-row eval), VerifierEvalFolder (compiled OOD eval), CaptureBuilder (expression tree → flattened IR, with per-constraint measured degrees).
  • crypto/stark/src/constraints/zerofier.rs — the existing zerofier evaluation logic as free functions of ConstraintMeta (bodies relocated verbatim; equivalence-tested against the still-live trait defaults).
  • Design docs: thoughts/gpu-constraint-eval/impl-plan-single-source-constraints.md (the plan this implements, including the decision record) and survey-constraint-frontends.md (how Plonky3 / OpenVM / SP1 / risc0 / zisk / airbender solve the same problem — the design follows the production-validated pattern).

Why

Constraint definitions were about to be duplicated (a hand-written evaluate plus a hand-written GPU capture per constraint). One body with multiple interpretations removes that permanently. The CPU paths stay compiled (measured: an interpreted CPU path costs ~9% total prove time on ethrex — recorded in the plan), and the interpreter/IR exists for the GPU and as a differential-testing reference.

Testing

  • 31 new tests, 168/168 total in cargo test -p stark: hand-built IR programs vs direct field arithmetic (every op/leaf kind), three-way differentials on random rows (prover folder vs verifier folder vs captured-and-interpreted program), degree measurement, emission-completeness and meta-invariant assertions, zerofier free functions vs the untouched trait defaults across all six branch shapes, and a non-Goldilocks field tower to keep the genericity honest.
  • cargo fmt / cargo clippy clean; no unsafe anywhere in the new code.

Next

The follow-up PR converts all tables + the LogUp framework constraints to single bodies, rewires AirWithBuses/AIR/zerofier grouping onto this framework, and deletes the old per-constraint machinery — gated on byte-identical golden proofs for a fixed ELF set.

New module crypto/stark/src/constraint_ir/ providing a flat, topologically
ordered IR for transition constraints, generic over a field tower
<F: IsSubFieldOf<E>, E> (defaulting to Goldilocks and its degree-3
extension):

- ir.rs: Op/Dim/ConstraintProgram<F, E>. Constants live in base_consts/
  ext_consts side tables referenced by index (Op::ConstBase(u32)/
  Op::ConstExt(u32)), keeping Op a plain Copy+Eq+Hash payload with zero
  bounds on the fields.
- builder.rs: IrBuilder<F, E> with (Op, Dim) hash-consing, by-value
  constant dedup via linear scan (FieldElement's canonicalizing
  PartialEq), and the id-0 = base-zero convention.
- interp.rs: forward-pass interpreter; eval_program /
  eval_program_verifier match the AIR compute_transition_prover /
  compute_transition contracts, eval_program_base is the minimal
  single-root entry point. Const ops read the side tables directly.

Not wired into the prover or verifier; no behavior change.
Hand-built IrBuilder programs checked against direct FieldElement
arithmetic: every Op variant and leaf kind, mixed base/ext arithmetic
with auto-embed, constant dedup (base, signed, ext), the id-0 zero
convention, CSE sharing, out-of-order emit() root indexing, the
complete-flag plumbing, and 1000-row randomized differential checks.

The prover and verifier entry points are exercised against
hand-constructed TransitionEvaluationContext values (both variants),
including the base-root promotion on the verifier side and next-row
(offset 1) frame reads.

A reflexive-tower test (E = F over the u32 test field, which has a
different modulus and BaseType than Goldilocks) proves the module is
genuinely field-generic; math's test-utils feature is enabled as a
dev-dependency for it.
New module crypto/stark/src/constraints/builder.rs: one constraint body,
written once against the ConstraintBuilder trait, is interpreted three
ways depending on the implementation it runs over:

- ProverEvalFolder<F, E> (Expr = FieldElement<F>): compiled per-row
  prover evaluation, constructed from the Prover
  TransitionEvaluationContext variant plus output slices; emit_ext
  writes ext_evals at the absolute constraint index.
- VerifierEvalFolder<F, E> (Expr = FieldElement<E>): the same body at
  the OOD point over the all-extension frame; const_base embeds via
  FieldElement::<F>::from(v).to_extension(). Monomorphized into the
  guest binary this is the recursion path — no capture, no hashing,
  no interpretation.
- CaptureBuilder<F, E> (Expr = owned Rc expression tree with eager
  dim + degree): one setup-time run that flattens into the constraint
  IR's IrBuilder (hash-consing there = structural CSE) and returns the
  program plus per-root tree-measured degrees.

Also: ExprOps/ExtExprOps operator-bound aliases (mixed base/ext ops
keep the base operand on the left, matching the field tower),
ConstraintMeta + RootKind plain-data constraint metadata with the
dense/idx-ordered/Base-prefix invariants (num_base_from_meta), the
ConstraintSet per-table trait, and debug-build emit tracking asserting
every constraint index is emitted exactly once.

Tests: a sample ConstraintSet (EqXor-, IsBit- and Add-carry-pair-shaped
bodies plus a LogUp-shaped extension constraint) checked on 1000 random
rows three ways — prover folder vs direct arithmetic, prover folder vs
interpreted capture, verifier folder vs interpreted capture — plus
measured-vs-declared degrees, meta invariants, and the completeness
asserts.

Not wired into any production path; no behavior change.
New module crypto/stark/src/constraints/zerofier.rs: the bodies of the
TransitionConstraintEvaluator default methods (end_exemptions_roots,
end_exemptions_lde_evaluations, zerofier_evaluations_on_extended_domain,
evaluate_zerofier) relocated verbatim to free functions consuming plain
ConstraintMeta — they only ever read the metadata getters. The trait
defaults remain untouched and stay the production path until tables
convert to ConstraintSet.

Tests assert the free functions match the trait defaults bit-for-bit on
a configurable boxed constraint across every branch: the default shape,
end exemptions, period/offset combinations, and periodic exemptions with
and without end exemptions, over both the LDE-domain and OOD entry
points.
@MauroToscano

Copy link
Copy Markdown
Contributor Author

/bench-verify

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown

Verifier benchmark started on the bench server (~4 min). The bench server is occupied until it finishes.

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown

Verifier benchmark — a4038a4939 vs main (20 pairs)

=== Verify ABBA result (improvement: + = PR faster) ===

Metric main PR Δ (paired)
Verify time 4.288s 4.297s -0.21% ⚪
  pairs: 20   mean A (PR): 4.297s   mean B (main): 4.288s
  [parametric] paired-t   mean -0.21%   sd 1.11%   se 0.25%
               95% CI: [-0.73%, +0.30%]   (t df=19 = 2.093)
  [robust]     median +0.04%   Wilcoxon W+=93 W-=97  p(exact)=0.9530  (z=-0.06)

  run-to-run jitter:    A CV 1.11%   B CV 0.50%        (lower = steadier)
  within-session drift: -0.90% over the run, 1st->2nd half -0.56%

INCONCLUSIVE — effect not separable from 0 at n=20 (point estimate ~+0.04%). Add pairs to resolve.

Drift-free interleaved A/B/B/A measurement. + = PR faster. Trust the verdict when paired-t and Wilcoxon agree.

@MauroToscano

Copy link
Copy Markdown
Contributor Author

We will just merge the version with everything: #764

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant