Skip to content

spike(rows-first): two-message state_change protocol behind env gate#787

Closed
paddymul wants to merge 3 commits into
feat/scoped-sd-merged-prefix-v2from
spike/state-change-rows-first
Closed

spike(rows-first): two-message state_change protocol behind env gate#787
paddymul wants to merge 3 commits into
feat/scoped-sd-merged-prefix-v2from
spike/state-change-rows-first

Conversation

@paddymul
Copy link
Copy Markdown
Collaborator

@paddymul paddymul commented May 20, 2026

Draft / spike — not for merge. Validates the wire shape for "show rows first, compute stats after" without going async. Synchronous server, just message reordering + an IOLoop.call_later yield.

Sits on top of #785 (uses _defer_summary_sd + recompute_summary_sd hooks that this PR adds to the dataflow).

What it proves

  • A single buckaroo_state_change produces two initial_state frames on the wire.
  • An infinite_request fired by the client between the two frames is serviced and its parquet rows arrive before phase 2. The call_later(0.01) between phase 1 and phase 2 gives inbound socket reads a real time window to win the race.

Test: tests/unit/server/test_rows_first_spike.py (2 cases).

How to try it

Set the env flag and exercise normally:

BUCKAROO_ROWS_FIRST_SPIKE=1 uv run ...

Default (flag off) keeps today's single-frame rebroadcast. Existing test_load_expr tests pass unchanged.

Open questions

  • xorq path coverage. XorqServerDataflow may not route through the _defer_summary_sd short-circuit on the base DataFlow. First spike test had to drop a "stats actually differ between phases" assertion because it was failing on xorq. This is the most important item to resolve before un-gating — without that assertion the spike test would still pass if recompute_summary_sd() were a no-op, so we can't detect regression of the spike's whole point.
  • Inter-phase state_change race. Between phase 1 broadcast and the call_later firing, a second state_change can land. Phase 2 then computes stats against a processed_df that already reflects the second state, but the broadcast payload mixes phase-2 stats with what the user thinks is the first interaction. The spike doesn't try to address this; needs a serialise / cancel-previous strategy before un-gating.
  • Delay tuning. 10ms is a localhost-only number — loopback WS RTT is well under 1ms. For remote deployments (RTT 30-100ms+) 10ms is shorter than the round-trip and the spike's premise breaks. Real prod tuning must measure end-to-end and pick a value that reliably loses the race to a network-bound infinite_request.
  • Cache-hit fast path. When filt_sd_key for the new state is already in summary_stats_cache (search-box backspace, oscillation), there's no compute to defer — phase 2 would be near-instant. The spike doesn't short-circuit and emits two frames anyway, which is wasteful.
  • Phase-2 message type. Both phases currently send type: "initial_state". JS will re-apply full state twice. Worth deciding whether phase 2 should be a distinct type (e.g. stats_update) so the front-end can target a cheaper re-render of just the stats-driven UI, rather than re-running the full initial_state handler.
  • No before/after timing. PR currently asserts the protocol works but doesn't measure whether it produces a perceptible win on a realistic dataset. One A/B measurement (rows-rendered latency, with/without flag, on a search that triggers a real compute) would justify or kill the protocol.

Stack

🤖 Generated with Claude Code

paddymul and others added 3 commits May 20, 2026 18:38
Proves the wire shape for "show rows first, compute stats after"
without going async. Synchronous server, just message reordering +
an ``IOLoop.call_later`` yield between the two ``initial_state``
frames so any ``infinite_request`` from the client gets serviced in
between.

Gated on ``BUCKAROO_ROWS_FIRST_SPIKE=1``. Default off so existing
``test_load_expr`` tests (which expect a single rebroadcast frame
per state change) keep working unchanged.

Mechanism:

1. State-change handler sets ``dataflow._defer_summary_sd = True``,
   then extracts state. The trait cascade still runs (``cleaned`` →
   ``processed_result`` → ``populate_df_meta``), but ``_summary_sd``
   short-circuits — the analysis-pipeline pass is skipped.
2. Phase 1 ``initial_state`` is sent immediately.
3. ``IOLoop.call_later(0.01, _send_stats_update)`` schedules the
   deferred compute. The 10ms delay gives any ``infinite_request``
   the client fires in response to phase 1 a real time window to
   arrive and get serviced before the stats compute starts.
4. Phase 2: ``recompute_summary_sd()`` triggers the analysis run
   (which fires ``_populate_sd_cache`` + ``_merged_sd`` downstream),
   then a second ``initial_state`` is sent.

Spike tests in ``tests/unit/server/test_rows_first_spike.py``:

- ``test_state_change_emits_two_initial_state_messages``: pins the
  two-message wire shape.
- ``test_infinite_request_between_phases_returns_rows``: pins the
  key win — ``infinite_request`` between the two phases is serviced
  and its parquet frame arrives before phase 2.

Not for merge as-is — this is exploration on top of #785 to validate
the protocol before doing a real (un-gated) implementation. Open
questions:

- xorq path: ``XorqServerDataflow`` may not route through this
  ``_defer_summary_sd`` short-circuit; the first test had to drop a
  "stats actually differ between phases" assertion that was failing
  on the xorq backend.
- 10ms delay is arbitrary. Real prod tuning would measure
  end-to-end and pick a value that reliably loses the race to a
  network-bound ``infinite_request`` without feeling sluggish.
- Cache hit fast path: when ``filt_sd_key`` is already in
  ``summary_stats_cache`` (search-box backspace etc.), there's no
  reason to two-message. Spike doesn't short-circuit that.

Sits on top of PR #785.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review-driven cleanup on the spike, no behaviour change:

- Hoist ``from tornado.ioloop import IOLoop`` out of
  ``_handle_buckaroo_state_change`` to the module top (CLAUDE.md
  bans in-function imports).
- Extract ``_apply_state_and_broadcast`` so phase 1 and phase 2
  share one copy of session-update + component_config merge +
  per-client send. Phase 1 still owns the
  ``session.buckaroo_state = new_state`` assignment.
- Rewrite the ``_ROWS_FIRST_SPIKE_PHASE2_DELAY_S`` comment so it
  doesn't claim 10ms beats "typical browser-to-server WS round-trip"
  (only true for localhost). Flag the remote-deployment caveat.
- Drop the dead ``monkeypatch.setenv`` in the spike test fixture —
  the module-level constant is resolved at import, so only the
  ``setattr`` actually flips the gate.

Spike tests still pass; the flag-off ``test_load_expr`` suite is
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@paddymul
Copy link
Copy Markdown
Collaborator Author

Closing per triage: superseded by the JS-driven progressive-stats design (#808/#809/#810). plans/0787-xorq-rows-first-coverage.md (in #790) commits to that path.

@paddymul paddymul closed this May 21, 2026
paddymul added a commit that referenced this pull request May 21, 2026
…kips histogram on filt+clean

Adds a ``DataFlow.skip_stats_by_scope`` class attribute that lets a
backend declare "don't run stat X when computing scope Y's SD". The
motivating case (and only override shipping here): ``XorqDataflow``
sets ``{'filt': {'histogram'}, 'clean': {'histogram'}}`` — on xorq
the per-column histogram queries dominate state_change latency on
remote tables, and the user only renders the bare (raw) histogram in
``merged_sd``. Running the histogram on filt + clean too was wasted
work that didn't show up in the UI. Pandas / polars dataflows keep
the default empty dict, so their filt/clean histograms continue to
appear.

## Plumbing

``skip_stat_names`` threads from the dataflow into the underlying
pipelines:

- ``_summary_sd`` (filt scope) — applies filt skip when
  ``quick_command_args`` is truthy (the cascade hasn't updated
  ``operations`` yet when this fires, so we can't use the chains).
- ``_populate_sd_cache`` (raw + clean scopes) — uses
  ``_effective_skip(scope, chains)`` which returns ``None`` when the
  scope is degenerate (filt == clean → no filter; clean == raw → no
  cleaning). Keeps the no-filter / no-cleaning case collapsing raw +
  clean + filt under one cache key so the lazy-postprocessor
  ``add_processing(big_a)`` invariant on ``XorqBuckarooInfiniteWidget``
  still holds (postprocessor runs once per state, not three times).
- ``_scope_cache_key`` incorporates the effective skip so the cache
  doesn't collide raw and filt entries when their skips differ.
- ``StatPipeline.process_column`` / ``process_df`` /
  ``process_df_v1_compat``: skip at iteration (stat-func name match)
  plus a final output-level strip (covers v1 ``ColAnalysis`` wrappers
  where one ``StatFunc`` provides many keys — ``DefaultSummaryStats__
  series`` provides ``mean``, ``max``, ``std``, ...; we can't skip
  the whole wrapper for one key).
- ``XorqStatPipeline.process_table`` / ``process_table_v1_compat``:
  skip in both the batch-aggregate phase and the per-column phase.
- ``DfStatsV2``, ``PlDfStatsV2``, ``XorqDfStatsV2``: accept and forward.
- ``DfStats`` (v1) accepts the kwarg for API parity but strips at the
  output level since v1 ``AnalysisPipeline`` doesn't have per-stat
  filtering.

## Tests (failing-then-fix split per repo TDD policy)

Five new tests in the previous commit, all green now:

- ``test_skip_stats_by_scope_excludes_named_stat_from_filt`` — skip
  ``mean`` on filt; ``filtered_mean`` absent, bare ``mean`` present,
  other ``filtered_*`` keys present.
- ``test_skip_stats_by_scope_excludes_from_raw_and_clean`` — skip on
  raw/clean; bare ``mean`` absent, ``filtered_mean`` present.
- ``test_skip_stats_by_scope_default_empty_runs_all_stats`` — no
  override means behaviour unchanged.
- ``test_xorq_dataflow_skips_histogram_on_filt_and_clean`` — pins
  XorqDataflow's default.
- ``test_polars_dataflow_keeps_histogram`` — Polars side still has
  empty skiplist.

## Regression

Full ``tests/unit/`` suite green (1049 passed, 6 skipped) — including
``TestLazyPostprocessor`` on ``XorqBuckarooInfiniteWidget`` which
caught the early version of the cache-key change that recomputed
raw+clean scopes whenever skip differed from filt.

Adapted from the working draft on smorg/post-785-playground
(``4949e56c``), stripped of references to the closed #787 spike
(``recompute_summary_sd`` / ``_defer_summary_sd``) and the
not-yet-merged #809 (``cost_classes``).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

triaged Reviewed and triaged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant