interactive/explain: correctness fixes#750
Open
frankmcsherry wants to merge 4 commits into
Open
Conversation
365aca6 to
f267717
Compare
2da5536 to
487ce5a
Compare
…nsound Two depth/index bugs in the reverse-tracing rewrite, both surfacing only at user-scope depth >= 2 (reach and single-scope programs were spared): 1. Projection arity counted field-exprs, not the sum of field widths. `Pos(r)` is a whole-row reference of width = input row r's arity, so a join projection like `($0 ; $1, $2)` over a 2-column value row has width 3, not 2. The undercount shifted the user-iteration coord one slot in the lossy/keyed lookups, so `filter_time_and_strip` compared a data field against the output time and rejected every contribution -- SCC's demand-set came back empty. `proj_arity` now expands each field. 2. `filter_time_and_strip` aligned the `user_in <= user_out` time filter from index 0, but user_chain is innermost-first; crossing a `Leave` drops the innermost coord, so the shared scopes are the outer ones. Compare at the outer ends instead. This repaired depth-2 min-label explanation (a nested connected-components reproducer). With both fixes, querying an SCC-internal edge returns a demand-set that regenerates it (the whole cycle, not just the edge). reach and nested CC remain sound; interactive lib tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The reverse rewrite folds iteration time into a demand row's value (the user_chain). Hand-indexing those columns produced SCC explain's off-by-one: comparing a just-left inner scope's coord against an enclosing scope's, instead of aligning at the shared outer ends. New `folded::Joined` owns that arithmetic — `time_le` (outer-end aligned) and `strip` — in one place, with tests pinning the alignment (the depth-2 cross-Leave case yields user_in[1] vs user_out[0], not [0] vs [0]). `explain::filter_time_and_strip` now delegates to it, so the real rewrite uses the single correct implementation and future region-builder ports can't re-introduce the miscount. SCC explanation verified unchanged (same 7-edge demand for the 4->1 query). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A Linear[Project] reverse via emit_lookup_lossy joined the demand against the
witness keyed on K_out *only*. When a re-key drops a field from the key
(e.g. `by_b = tc | key($0[1]; $0[0])`, keying tc by its destination), that
key-only match recovered every input row sharing the coarsened key — all
same-destination pairs regardless of source. Those siblings then resolved to
their own input edges, so explaining a multi-hop `tc(0,1)` pulled spurious
dead-end edges (e.g. `5→3`, `2→1`, `3→1`) that aren't on any path 0⇝1.
Fix: extend the existing pure-map shortcut from depth-0 projects to same-scope
projects of any depth. A Linear[Project] doesn't cross a scope boundary, so the
input's user_chain equals the output's (in_user_len == out_user_len ==
keep_in_len when the input isn't a Leave); when the projection is also total
(every input field recoverable from the output) the whole reverse is a direct
map from dep_y that narrows by *all* demanded fields — key and value — so the
source is pinned and the siblings vanish.
Verified: plain TC `0→1` → {0→3,3→1} (was {0→3,3→1,5→3}); shortest-path TC
`0→2`@2 → {0→1,1→2} (was +{2→1,3→1}); both sound (reproduce the query). SCC is
structurally unaffected (its `filter|key` projects are non-total, `trans` is
depth-0) and still explains internal edges soundly; reach unaffected (no
total depth>0 lossy project). Lib tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
EnterAt is a data→time lift (sets a new innermost user_chain coord t_in = delay($field)), so its output is one scope deeper than its input and the correct reverse would drop that coord 1:1. Today the reverse is a pass-through: tenable only because depths() is positional and treats EnterAt as depth-neutral, so the entry coord is stripped UNCONSTRAINED by the neighboring Project that crosses the scope boundary (no time filter, since in_len=0). Net: the input demand is a superset — sound (kept so by the semijoin(actual_input) at seeding) but over-broad (inflates e.g. SCC label cones); it never drops a needed edge. Not fixing now (the proper fix is structural — make depths() let EnterAt own its level and the reverse a coord-dropping projection). Instead: split EnterAt out of the Negate pass-through arm with a TRIAGE note, and cross-reference it from Program::depths(), so a later pass can find it. Behavior unchanged; lib tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
f267717 to
e5d28c3
Compare
487ce5a to
97822f4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Correctness fixes for the explanation rewrite (
explain.rs), whichreverse-traces a program to a per-input demand-set.
Fixes
errors, both invisible at scope depth ≤1 (so reach/CC were spared):
projection arity counted field-expressions instead of summing field
widths; and
filter_time_and_ user_outfilter from index 0, but
user_chainis innermost-first so the sharedscopes are the outer ends.
user_chaintime algebra in a newfoldedmodule(
Joined::time_le/strip) — t tested place,locking out the bug class.
drops a field from the key recovered every input row sharing the
coarsened key, leaking spuriousws by key
and value.
Not a fix — documented for triage
EnterAtreverse is a knownr-approximation(drops a data-derived time coord unconstrained). Documented inline
(
OVER-APPROXIMATION/TRIAGEing so it isn'tread as a claimed fix.
Soundness check throughout: run forward on the demand-set, confirm it
regenerates the queried output.