Skip to content

Add Core-to-Core datatype partial evaluation transform with correctness proof#985

Open
tautschnig wants to merge 3 commits intomainfrom
tautschnig/core-transforms-goto-pipeline
Open

Add Core-to-Core datatype partial evaluation transform with correctness proof#985
tautschnig wants to merge 3 commits intomainfrom
tautschnig/core-transforms-goto-pipeline

Conversation

@tautschnig
Copy link
Copy Markdown
Contributor

Add a general Core-to-Core transform that simplifies datatype tester and selector applications on known constructor terms:

  • tester(C(args)) → true/false depending on whether tester matches C
  • selector_i(C(args)) → args[i] when selector matches C

Components:

  • DatatypeInfo + collectDatatypeInfo: collect constructor, tester, and selector metadata from Core datatype declarations
  • matchConstrApp: decompose an expression into constructor + arguments
  • partialEvalDatatypesCore: recursive expression simplifier
  • partialEvalDatatypesInProgram: apply across all procedure bodies, axioms, and specifications

Correctness proof (zero sorry):

  • Define DtEquiv inductive relation capturing datatype axiom equivalence (reflexivity, symmetry, transitivity, congruence, tester/selector axioms)
  • Prove partialEvalDatatypesCore_correct by structural induction on LExpr: all rewrite cases justified by DtEquiv axioms applied to simplified subexpressions, composed with congruence and transitivity

Wired into the GOTO pipeline after procedure inlining. Includes unit tests for collectDatatypeInfo, tester simplification (positive and negative), and selector projection.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…ss proof

Add a general Core-to-Core transform that simplifies datatype tester and
selector applications on known constructor terms:
- tester(C(args)) → true/false depending on whether tester matches C
- selector_i(C(args)) → args[i] when selector matches C

Components:
- DatatypeInfo + collectDatatypeInfo: collect constructor, tester, and
  selector metadata from Core datatype declarations
- matchConstrApp: decompose an expression into constructor + arguments
- partialEvalDatatypesCore: recursive expression simplifier
- partialEvalDatatypesInProgram: apply across all procedure bodies,
  axioms, and specifications

Correctness proof (zero sorry):
- Define DtEquiv inductive relation capturing datatype axiom equivalence
  (reflexivity, symmetry, transitivity, congruence, tester/selector axioms)
- Prove partialEvalDatatypesCore_correct by structural induction on LExpr:
  all rewrite cases justified by DtEquiv axioms applied to simplified
  subexpressions, composed with congruence and transitivity

Wired into the GOTO pipeline after procedure inlining.
Includes unit tests for collectDatatypeInfo, tester simplification
(positive and negative), and selector projection.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig self-assigned this Apr 20, 2026
@tautschnig tautschnig requested review from a team and Copilot April 20, 2026 20:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Core-to-Core transformation that partially evaluates datatype tester/selector applications when applied to known constructor terms, proves the rewrite relation correct, and integrates the pass into the CBMC GOTO pipeline.

Changes:

  • Implement DatatypeInfo collection plus a recursive Core expression simplifier for datatype tester/selector applications.
  • Add a correctness development (DtEquiv + partialEvalDatatypesCore_correct) with no sorry.
  • Wire the pass into inlineCoreToGotoFiles and add unit tests covering metadata collection and key simplifications.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
Strata/Transform/DatatypePartialEval.lean New datatype partial-evaluation transform and program-wide traversal.
Strata/Transform/DatatypePartialEvalCorrect.lean Correctness proof for the Core expression simplifier via DtEquiv.
Strata/Backends/CBMC/GOTO/CoreToGOTOPipeline.lean Integrates the transform into the Core→GOTO pipeline after inlining.
StrataTest/Transform/DatatypePartialEval.lean Unit tests for datatype info collection and tester/selector simplification.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Strata/Transform/DatatypePartialEval.lean Outdated
Comment thread Strata/Transform/DatatypePartialEval.lean Outdated
Comment thread Strata/Transform/DatatypePartialEval.lean Outdated
Comment thread Strata/Transform/DatatypePartialEval.lean Outdated
Comment thread Strata/Transform/DatatypePartialEval.lean Outdated
Comment thread Strata/Backends/CBMC/GOTO/CoreToGOTOPipeline.lean Outdated
@joscoh
Copy link
Copy Markdown
Contributor

joscoh commented Apr 20, 2026

I am a bit confused about this transformation's purpose. These simplifications are already part of the Lambda (and hence Core) partial evaluator.

tautschnig and others added 2 commits April 22, 2026 15:33
Implementation fixes:
- Fix 1: Remove stale eq(C(args1),C(args2)) bullet from module docstring
- Fix 2: O(n) matchConstrApp.collect using accumulator instead of O(n²) append
- Fix 3: Update matchConstrApp docstring (not 'fully-applied', note no arity check)
- Fix 4: Recurse into op in binary application case (was missing)
- Fix 5: Narrow partialEvalDatatypesInProgram docstring to actual scope
- Fix 6: Rename shadowed variable in GOTO pipeline

Refactoring:
- Extract trySimplifyUnaryApp helper from partialEvalDatatypesCore for
  modularity and proof tractability

Proof rewrite (DatatypePartialEvalCorrect.lean):
- Add missing abs/eq cases for new LExpr constructors from main merge
- Use import all + unfold instead of simp [partialEvalDatatypesCore]
  (definition is opaque across module boundaries with the module keyword)
- Restructure around trySimplifyUnaryApp helper for modularity
- Two sorry's remain: trySimplifyUnaryApp_correct and the app case
  of partialEvalDatatypesCore_correct. The proof structure is validated
  but filling these requires careful case-splitting on the tester/selector
  lookups inside trySimplifyUnaryApp.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Move DatatypeInfo structure and ofDatatypes builder from the Core-specific
DatatypePartialEval transform to Lambda.TypeFactory, making it available
to both the standalone Core-to-Core transform and the Lambda evaluator.

- DatatypeInfo now lives in Strata.DL.Lambda.TypeFactory
- DatatypeInfo.ofDatatypes builds from List (LDatatype IDMeta)
- matchConstrApp now delegates to Lambda.getLFuncCall instead of a
  custom collect function, eliminating code duplication
- collectDatatypeInfo is a thin wrapper extracting datatypes from
  Core program declarations and calling ofDatatypes
- Removed constrToTester and constrSiblingTesters fields (unused by
  the transform; testerToConstr suffices)

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig
Copy link
Copy Markdown
Contributor Author

tautschnig commented Apr 22, 2026

I am a bit confused about this transformation's purpose. These simplifications are already part of the Lambda (and hence Core) partial evaluator.

My primary intended use for this transform is the GOTO pipeline: The Lambda evaluator handles these simplifications during symbolic evaluation via inlineIfConstr and concreteEval. The GOTO pipeline, however, translates Core to CProver GOTO without running the Lambda evaluator — after procedure inlining, tester/selector applications on known constructors remain unsimplified.

To address the duplication concern, I've factored out the shared piece in 6ca07fc: DatatypeInfo (the constructor/tester/selector name mappings) now lives in Lambda.TypeFactory with a DatatypeInfo.ofDatatypes builder that works on List (LDatatype IDMeta). The standalone transform uses this shared structure, and matchConstrApp now delegates to the existing getLFuncCall from Lambda.Factory instead of reimplementing application decomposition. The Lambda evaluator could adopt DatatypeInfo in the future to replace its ad-hoc tester/selector dispatch.

@joscoh
Copy link
Copy Markdown
Contributor

joscoh commented Apr 22, 2026

Two general comments (I haven't looked at this in close detail yet).

  1. I would prefer to see something that pulls the datatype simplification out of the partial evaluator into a separate function so that it can be called both in the Lambda partial evaluator and in the Core -> GOTO transformation. Right now, this PR results in two separate implementations of the same thing, and this means we need to keep them in sync.
  2. The proofs of correctness should not be given with respect to a new equivalence relation (and it is better to not include a proof than to include a sorry one). The datatype simplification is already proved correct wrt the Lambda denotational semantics in LExprSemanticsConsistent.lean. I would prefer to modify this proof and prove this transformation correct wrt an already existing semantics. I am happy to help with this once the datatype simplification has been fully abstracted.

@aqjune-aws
Copy link
Copy Markdown
Contributor

For the top-level statement of correctness, github.com/strata-org/Strata/blob/main/Strata/Transform/Specification.lean has a few styles and I think Overapproximates is probably the closest definition. https://github.com/strata-org/Strata/blob/main/Strata/Transform/DetToKleeneCorrect.lean#L637-L640 is an example that uses the definition. However, since this is supposed to be equivalent, I can add the notion in a separate pull request (or even directly push to this PR, if you want).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants