Skip to content

feat(#533): optimize FlatExpression ExprNode representation#541

Merged
dadhi merged 7 commits intomasterfrom
copilot/533-work-on-pr
Apr 29, 2026
Merged

feat(#533): optimize FlatExpression ExprNode representation#541
dadhi merged 7 commits intomasterfrom
copilot/533-work-on-pr

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 29, 2026

Implements the key optimization items from issue #533 "FlatExpression: optimize representation".

Changes

ExprNode struct layout (the main change)

_data (ulong, 8 bytes) is split into two uint fields that still fit in the same 24-byte struct:

Field Offset Size Content
Type 0 8B runtime type
Obj 8 8B payload / marker
_meta 16 4B NodeType(8b) | Tag(8b) | NextIdx(16b)
_data 20 4B ChildCount(16b)|ChildIdx(16b) or inline constant value

Inline primitive and enum constants without boxing

Constant classification mirrors IsClosureBoundConstant from TryCollectInfo:

  • Inline in _data (no allocation): bool, byte, sbyte, char, short, ushort, int, uint, float (≤32-bit primitives) and enum types with ≤32-bit underlying type (byte/sbyte/char/short/ushort/int/uint — virtually all C# enums) are stored as raw bits directly in _data with a sentinel ExprNode.InlineValueMarker in Obj.
  • Boxed in Obj (no ClosureConstants): wide primitives (long, ulong, double), decimal, null, string, Type, and long/ulong-backed enums are stored boxed directly in Obj.
  • ClosureConstants: Delegate, array types, and user-defined types.

Node layout annotations

In-code comments document the child-list layouts for Lambda and Block nodes in all three locations (factory methods, Builder, and Reader):

  • Lambda: children[0] = body, children[1..n] = parameter decl nodes. Body is stored first, enabling the out-of-order decl pattern where parameter refs in the body are encountered before their decl nodes; identity is preserved via the shared _parametersById id-map.
  • Block: with variables — children[0] = ChildList(var₀, var₁, …), children[1] = ChildList(expr₀, expr₁, …); without variables — children[0] = ChildList(expr₀, expr₁, …). children.Count == 2 is the canonical sentinel for variable presence.

Performance helpers

  • FloatBits union struct for float↔uint reinterpretation without unsafe/BitConverter, compatible with all target frameworks (net472, netstandard2.0+, net6–10).
  • FlatExpressionThrow static class with [MethodImpl(NoInlining)] throw methods so callers in hot paths can be JIT-inlined.
  • GetSurePresentRef is now used in all AddNode overloads (when setting NextIdx on known-valid child indices) and in Reader.GetChildren inner loop, avoiding redundant bounds checks.

Testing

6 new tests covering decl vs ref nodes, out-of-order decl, parameter/lambda/block identity, and enum inline storage:

  • Flat_lambda_parameter_ref_before_decl_preserves_identity — body (ref) stored before parameter (decl) in flat encoding; both resolve to same SysParameterExpression.
  • Flat_lambda_multiple_parameter_refs_all_yield_same_identity — three independent cloned ref nodes for the same parameter all produce the same object.
  • Flat_block_variables_and_refs_yield_same_identity — block variables declared first, then referenced in Assign and result expressions; all decl+ref pairs are the same SysParameterExpression.
  • Flat_nested_lambda_captures_outer_parameter_identity — inner lambda closing over an outer parameter: the ref in the inner body is the same object as the outer lambda's parameter decl.
  • Flat_out_of_order_decl_block_in_lambda_compiles_correctly — end-to-end compile-and-run test verifying correctness with a block containing two variables.
  • Flat_enum_constant_stored_inline_roundtrip — covers all six small underlying integer types (byte/sbyte/short/ushort/int/uint) with edge-case values; asserts ClosureConstants.Count == 0 (no boxing) and correct value round-trip.

All 1679 tests pass.

Copilot AI and others added 2 commits April 29, 2026 06:08
…ants

- Split _data (ulong, 8B) into _meta (uint, 4B: NodeType|Tag|NextIdx) + _data (uint, 4B)
- Inline bool/byte/sbyte/char/short/ushort/int/uint/float directly in _data without boxing (using InlineValueMarker sentinel in Obj); large values (long/ulong/double/decimal/DateTime/objects) go to ClosureConstants
- Add FloatBits union struct for float<->uint reinterpretation without unsafe code
- Add FlatExpressionThrow helper class (NoInlining throw helper to keep callers inlinable)
- Use GetSurePresentRef in AddNode overloads and Reader.GetChildren for bounds-check-free hot paths
- Update Builder.AddConstant and Reader.ReadExpression to handle new inline constant representation

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/4c535b26-3504-4773-987e-ffbd75351aa4

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
- ReadInlineValue: use data != 0 for bool (clearer, matches write pattern)
- ToInlineValue: throw on unsupported TypeCode instead of returning 0u
- FloatBits: rename _f/_u to _floatValue/_uintValue for clarity
- Add FlatExpressionThrow overload for ToInlineValue error path

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/4c535b26-3504-4773-987e-ffbd75351aa4

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
return AddRawExpressionNode(type, value, ExpressionType.Constant);

var tc = Type.GetTypeCode(type);
if (IsSmallPrimitive(tc))
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use same check as in FastExlressionCompiler in TryCollectInfo for constants on whether the constant goes ClosureConstants or no, e.h. long should not go to vlosure constant and should for now be stored Obj boxed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3ba217f. The classification now mirrors IsClosureBoundConstant:

  • Inline in _data (no allocation): bool, byte, sbyte, char, short, ushort, int, uint, float (≤32-bit primitives)
  • Boxed in Obj (no ClosureConstants): long, ulong, double, decimal, null, string, Type, enums
  • ClosureConstants: Delegate, arrays, and user-defined types

- Primitives that fit in 32 bits (bool/byte/sbyte/char/short/ushort/int/uint/float) still use inline _data
- Wide primitives (long/ulong/double) are stored boxed in Obj — NOT in ClosureConstants
- decimal, null, string, Type, enum also stay in Obj
- Only Delegate, arrays, and user-defined types go to ClosureConstants
  (matches TryCollectInfo's IsClosureBoundConstant: !IsPrimitive && !IsEnum && !string && !Type && !decimal)

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/5c0eee4e-2b40-466f-af3a-40eb078c0a9d

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Copilot AI requested a review from dadhi April 29, 2026 06:39
Copy link
Copy Markdown
Owner

@dadhi dadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to add more tests covering decl vs ref nodes, out of order decl, specifically in regards to prarameters, lambdas, blocks

…or params/lambdas/blocks

5 new tests covering:
- Flat_lambda_parameter_ref_before_decl_preserves_identity
- Flat_lambda_multiple_parameter_refs_all_yield_same_identity
- Flat_block_variables_and_refs_yield_same_identity
- Flat_nested_lambda_captures_outer_parameter_identity
- Flat_out_of_order_decl_block_in_lambda_compiles_correctly

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/7f7fb7fe-7940-4cfb-ba92-1454b7953749

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 29, 2026

I would like to add more tests covering decl vs ref nodes, out of order decl, specifically in regards to prarameters, lambdas, blocks

Added 5 new tests in e1f6662 covering decl vs ref nodes, out-of-order decl, and parameter/lambda/block identity:

  • Flat_lambda_parameter_ref_before_decl_preserves_identity — body (ref) stored before parameter (decl) in flat encoding; both resolve to same SysParameterExpression
  • Flat_lambda_multiple_parameter_refs_all_yield_same_identity — three independent refs to same param (p*p+p) all yield same object
  • Flat_block_variables_and_refs_yield_same_identity — two block variables used in assign and result expressions; decl and all refs are same object
  • Flat_nested_lambda_captures_outer_parameter_identity — inner lambda body captures outer lambda param; ref in inner body is same object as outer param decl
  • Flat_out_of_order_decl_block_in_lambda_compiles_correctly — compile-and-run verification: (p) => { v1 = p*2; v2 = v1+p; v2 }, p=3 → 9

Copilot AI requested a review from dadhi April 29, 2026 09:59
@dadhi
Copy link
Copy Markdown
Owner

dadhi commented Apr 29, 2026

@copilot Please annotate in code comments the layout of Lambda and Block declarations nodes

…Expression.cs

Add layout comments in all three locations (factory methods, Builder cases,
Reader cases) explaining:
- Lambda: children[0]=body, children[1..n]=param decls; out-of-order decl pattern
- Block: children[0]=ChildList(vars) [optional], children[last]=ChildList(exprs)
  with the children.Count==2 sentinel for variable presence

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/4646dbb8-d85d-44da-9029-87386284f968

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 29, 2026

@copilot Please annotate in code comments the layout of Lambda and Block declarations nodes

Done in 170ad90. Layout comments are now added in all three places — the factory methods (Block/Lambda), the Builder cases, and the Reader cases:

Lambda (children[0] = body, children[1..n] = parameter decl nodes) — the comment explains the out-of-order decl pattern: body is stored/read first so parameter refs in the body are encountered before the decl nodes at children[1..n], with identity preserved via the shared _parametersById id-map.

Block — the comment documents both layouts:

  • With variables: children[0] = ChildList(var₀, var₁, …), children[1] = ChildList(expr₀, expr₁, …)
  • Without variables: children[0] = ChildList(expr₀, expr₁, …)

with the children.Count == 2 sentinel as the canonical way to detect the presence of variables.

public int Constant(object value, Type type)
{
if (ShouldInlineConstant(value, type))
if (value == null || value is string || value is Type || type.IsEnum || value is decimal)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid boxing for enum here and put it inplace for enums less than 4 bytes which is I believe will be 99.9999 or all enums?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 02e3372. Enum constants whose underlying type is ≤ 32 bits (byte/sbyte/char/short/ushort/int/uint — essentially all C# enums) are now stored inline in _data via (uint)System.Convert.ToInt64(value), with no boxing and no ClosureConstants entry. The Reader reconstructs the typed enum value via Enum.ToObject using the underlying TypeCode. The extremely rare long/ulong-backed enums fall through to the existing boxed-in-Obj path.

A new test Flat_enum_constant_stored_inline_roundtrip covers all six underlying types, asserting ClosureConstants.Count == 0 and correct value round-trip for both positive and negative/edge-case enum members.

Extend the inline-constant path to cover enum types whose underlying type
is ≤ 32 bits (byte/sbyte/char/short/ushort/int/uint — virtually all C# enums).
Raw bits are stored in _data via System.Convert.ToInt64 + uint cast; the
Reader reconstructs the typed enum via Enum.ToObject using the underlying
TypeCode. Long/ulong-backed enums (extremely rare) continue to be boxed
in Obj.

Add FlatExpressionThrow.UnsupportedInlineConstantType(Type, TypeCode)
overload for the error path in ReadInlineValue.

Add Flat_enum_constant_stored_inline_roundtrip test covering all six
underlying types (byte/sbyte/short/ushort/int/uint), verifying no
ClosureConstants allocation and correct value round-trip. (1679 tests)

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/0ffda673-c511-49f5-ad08-e070318b3781

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Copilot AI requested a review from dadhi April 29, 2026 10:28
if (value == null || value is string || value is Type || value is decimal)
return AddRawExpressionNode(type, value, ExpressionType.Constant);

if (type.IsEnum)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a repeated code here and in the AddConstant
Can we move it to helper.
Actually go pass through the whole FE and do the semantic compression (https://caseymuratori.com/blog_0015) as well and while doing so you may reestablish the new method boundaries
Ensure perf is not degraded, apply aggressive inlining where it fits

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d8eb0b8.

  • Builder.AddConstant() now delegates to _tree.Constant(constant.Value, constant.Type) — the 25-line duplicate body is gone.
  • AddRawExpressionNodeWithChildIndex removed; its single call site in Constant() directly calls AddRawLeafExpressionNode.
  • [AggressiveInlining] added to all thin single-expression forwarding methods: public API shortcuts (ParameterOf<T>, Variable, Default, Constant(object), ConstantNull, ConstantInt, ConstantOf<T>, Lambda<TDelegate>, Field, Property, ArrayIndex, Convert, TypeAs, Negate, Not, Assign, Add, Equal, Block(), Goto, Return, TypeIs, TypeEqual) and all private helpers (AddFactoryExpressionNode fixed-arity overloads, AddFactoryAuxNode overloads, AddRawExpressionNode overloads, AddRawLeafExpressionNode, AddRawLeafAuxNode, AddRawAuxNode, AddObjectReferenceNode, AddChildListNode, AddUInt16PairNode, AddLeafNode, AddInlineConstantNode, CloneChild).

@dadhi
Copy link
Copy Markdown
Owner

dadhi commented Apr 29, 2026

@copilot try again

…nNodeWithChildIndex, add AggressiveInlining to thin wrappers

Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/f33cabbf-6105-4280-a4c3-5f0c238e98d6

Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 29, 2026

@copilot try again

Addressed in d8eb0b8 — constant classification deduplication, semantic compression, and [AggressiveInlining] on all thin wrappers. All 1678 tests pass.

@dadhi dadhi marked this pull request as ready for review April 29, 2026 13:41
@dadhi dadhi merged commit 1d9c840 into master Apr 29, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants