Skip to content

interactive: scope/region builders#751

Closed
frankmcsherry wants to merge 9 commits into
pr-explain-correctnessfrom
pr-scope-builders
Closed

interactive: scope/region builders#751
frankmcsherry wants to merge 9 commits into
pr-explain-correctnessfrom
pr-scope-builders

Conversation

@frankmcsherry

Copy link
Copy Markdown
Member

Adds scope structure + a lifetime to builder to avoid timestamp index errors.
Specifically, the manual-depth-arithmetic bug class behind the level errors in the previous PR.

What

  • Program::regions() — explicit per-node) from
    the Scope/EndScope markers, plus an iterative-vs-structural tag per
    region. Structured companion to
  • region_builder — lifetime-bounded builder whose generative-invariant brand
    makes a scope leak or a missing proven by
    compile_fail doctests); tracks Shape{k,v}; ports reach as a worked
    example.
  • Region-IR kickoff doc.

Why

The two level errors were manual dn. This makes
scope discipline structural (the type enforces well-formedness) and is the
foundation for a scope-oriented IR

Notes

  • Additive — new modules + one lib.rs line; no existing behavior changes.

Add `import "name"` and `export "name" = expr` to the IR and both surface
syntaxes, and remove the `result` statement: `export` is now the sole way to
declare a program output, and only at the root scope. `Program.export` is a
named list (was a single result id); `Node::Import` resolves against a
registry at install time.

`Node::Import` is a stub outside the server (the example renderers panic on
it, no example uses it); the intended end-state is one named-source substrate
that subsumes `Input(usize)` — there should not be two ways to bring in a
source. `survey_sources` (was `count_inputs`) returns both kinds until that
cutover.

Threaded through lower, both parsers, explain, and the ddir_vec/ddir_col/
dump_explain examples; example programs now use `export "result" = …`. The
string lexer is read-until-quote (no escapes — names don't use them).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@frankmcsherry frankmcsherry force-pushed the pr-explain-correctness branch from 487ce5a to 97822f4 Compare June 8, 2026 14:31

@frankmcsherry frankmcsherry left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes for Claude.

Comment thread interactive/src/ir.rs Outdated
Comment on lines +95 to +97
/// `iterative[r]` is true when region `r` carries an iteration feedback (a
/// `Variable`); a region with none is purely *structural* — a nesting
/// boundary that does not advance the timestamp. The region-typed builder

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds error-prone, in that the presence/absence of a variable determines timestamp offsetting, it sounds like. Removal of a variable could invalidate timestamp offset math. If that's correct, I think we should aim for something more explicit, as in TD/DD where region explicitly doesn't adjust the timestamp. Similarly, I think maybe Regions is a poor name for scopes that are not the same as regions in TD/DD (they will be implemented with regions, because the timestamp does not change in our implementation, but .. is "scopes" a bad name for some reason?).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — renamed throughout: RegionId->ScopeId, Regions->ScopeTree, regions()->scopes(), the region_builder module/file->scope_builder, and the brand token Region->Scope. ('region' would also collide with timely's narrower meaning, so 'scope' is the consistent umbrella.) Done in f54b958; #750 also picked up two comment refs that pointed at the builder.

frankmcsherry and others added 8 commits June 8, 2026 19:59
…nsound

Two depth/index bugs in the reverse-tracing rewrite, both surfacing only
at user-scope depth >= 2 (reach and single-scope programs were spared):

1. Projection arity counted field-exprs, not the sum of field widths.
   `Pos(r)` is a whole-row reference of width = input row r's arity, so a
   join projection like `($0 ; $1, $2)` over a 2-column value row has width
   3, not 2. The undercount shifted the user-iteration coord one slot in
   the lossy/keyed lookups, so `filter_time_and_strip` compared a data
   field against the output time and rejected every contribution -- SCC's
   demand-set came back empty. `proj_arity` now expands each field.

2. `filter_time_and_strip` aligned the `user_in <= user_out` time filter
   from index 0, but user_chain is innermost-first; crossing a `Leave`
   drops the innermost coord, so the shared scopes are the outer ones.
   Compare at the outer ends instead. This repaired depth-2 min-label
   explanation (a nested connected-components reproducer).

With both fixes, querying an SCC-internal edge returns a demand-set that
regenerates it (the whole cycle, not just the edge). reach and nested CC
remain sound; interactive lib tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The reverse rewrite folds iteration time into a demand row's value (the
user_chain). Hand-indexing those columns produced SCC explain's off-by-one:
comparing a just-left inner scope's coord against an enclosing scope's,
instead of aligning at the shared outer ends.

New `folded::Joined` owns that arithmetic — `time_le` (outer-end aligned)
and `strip` — in one place, with tests pinning the alignment (the depth-2
cross-Leave case yields user_in[1] vs user_out[0], not [0] vs [0]).
`explain::filter_time_and_strip` now delegates to it, so the real rewrite
uses the single correct implementation and future scope-builder ports can't
re-introduce the miscount. SCC explanation verified unchanged (same 7-edge
demand for the 4->1 query).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A Linear[Project] reverse via emit_lookup_lossy joined the demand against the
witness keyed on K_out *only*. When a re-key drops a field from the key
(e.g. `by_b = tc | key($0[1]; $0[0])`, keying tc by its destination), that
key-only match recovered every input row sharing the coarsened key — all
same-destination pairs regardless of source. Those siblings then resolved to
their own input edges, so explaining a multi-hop `tc(0,1)` pulled spurious
dead-end edges (e.g. `5→3`, `2→1`, `3→1`) that aren't on any path 0⇝1.

Fix: extend the existing pure-map shortcut from depth-0 projects to same-scope
projects of any depth. A Linear[Project] doesn't cross a scope boundary, so the
input's user_chain equals the output's (in_user_len == out_user_len ==
keep_in_len when the input isn't a Leave); when the projection is also total
(every input field recoverable from the output) the whole reverse is a direct
map from dep_y that narrows by *all* demanded fields — key and value — so the
source is pinned and the siblings vanish.

Verified: plain TC `0→1` → {0→3,3→1} (was {0→3,3→1,5→3}); shortest-path TC
`0→2`@2 → {0→1,1→2} (was +{2→1,3→1}); both sound (reproduce the query). SCC is
structurally unaffected (its `filter|key` projects are non-total, `trans` is
depth-0) and still explains internal edges soundly; reach unaffected (no
total depth>0 lossy project). Lib tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
EnterAt is a data→time lift (sets a new innermost user_chain coord
t_in = delay($field)), so its output is one scope deeper than its input and
the correct reverse would drop that coord 1:1. Today the reverse is a
pass-through: tenable only because depths() is positional and treats EnterAt
as depth-neutral, so the entry coord is stripped UNCONSTRAINED by the
neighboring Project that crosses the scope boundary (no time filter, since
in_len=0). Net: the input demand is a superset — sound (kept so by the
semijoin(actual_input) at seeding) but over-broad (inflates e.g. SCC label
cones); it never drops a needed edge.

Not fixing now (the proper fix is structural — make depths() let EnterAt own
its level and the reverse a coord-dropping projection). Instead: split EnterAt
out of the Negate pass-through arm with a TRIAGE note, and cross-reference it
from Program::depths(), so a later pass can find it. Behavior unchanged; lib
tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `Program::scopes()` — a computed view (like `depths()`, no constructor
change) that surfaces the scope structure the IR only encodes implicitly via
`Scope`/`EndScope` markers: a scope tree (`parent`/`children`), a per-node
scope assignment (`of_node`, same scope convention as `depths()`), and an
`iterative` tag per scope (true iff it carries a `Variable` feedback; false
for a purely structural boundary).

This is the substrate for the scope-typed builder (compile-time scoping
discipline) and for drawing scope structure in the viz. Non-breaking;
includes a unit test on a single nested iterative scope.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A builder facade that enforces scoping discipline at compile time. Each
`scope(...)` introduces a fresh, invariant brand `'r`; collections built
inside are `Coll<'r>` and can leave only through `leave`, whose `Left<'r>`
result is the only value a scope body may return. This makes the two level
errors behind the explanation bugs unrepresentable:

- forgetting to leave a scope is a type error (body must return Left<'r>);
- an inner Coll<'r> cannot escape to an outer scope ('r is generative).

Both are proven by `compile_fail` doctests (they pass = the bad programs are
rejected). A unit test builds a nested iterative scope and checks the
resulting IR via `scopes()`.

Maps to the existing iterative-scope IR for now; the structural (no-coord)
variant and outer->inner `enter` (which needs a borrow-based rather than
generative relation) are the next refinements.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Unblock porting real programs onto the lifetime-bounded builder:

- `scope(parent, imports, body)` now takes outer collections to use inside;
  each is re-branded to the body's fresh `'c`. Sound — `Coll` has no public
  constructor from a raw id, so the re-brand happens only here, on handles
  the caller already holds at the outer scope.
- Added `arrange`/`join`/`reduce` so iterative programs are expressible.
- `ports_reach` test builds the full reach program through the facade (outer
  edges/roots imported into the loop, var bound, result leaved) and checks
  the scope structure via `scopes()`.

The two compile_fail discipline proofs still hold under the new signature:
inner collections can't escape, and a scope body must `leave`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Carry Shape {k,v} on every Coll, computed at each op via the same
width-summing logic the IR uses (explain::arities, now pub(crate)). A
projection field is a whole-row reference of width = that row's arity, not
one column — the miscount that made SCC's explanation unsound. Building
through the facade can't reintroduce it: `join_arity_sums_field_widths`
asserts a whole-row-Pos join projection yields v=3, not 2.

Also adds shape-consistency asserts (concat shapes equal, join key arities
match, bind variable/value shapes match) and `arrange`/`join`/`reduce` ops;
the reach port now threads shapes through the loop. Lifetime discipline
(leave-required, inner confinement) unchanged — both compile_fail proofs
still hold.

Next: fold-depth typing for the time-as-data user_chain (the other unsound
miscount), then porting reverse rules onto the builder.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@frankmcsherry frankmcsherry force-pushed the pr-explain-correctness branch from 97822f4 to 8ed45ba Compare June 9, 2026 00:01
@frankmcsherry frankmcsherry force-pushed the pr-explain-correctness branch from 8ed45ba to f19e847 Compare June 9, 2026 00:49
@frankmcsherry

Copy link
Copy Markdown
Member Author

Closing as not quite ready yet. Good idea, but it isn't wired up beyond tests. Will return with a new PR once this has been wired up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant