tune: lower coalesce/settle step 40 → 10 ms#674
Merged
therealaleph merged 1 commit intotherealaleph:mainfrom May 3, 2026
Merged
tune: lower coalesce/settle step 40 → 10 ms#674therealaleph merged 1 commit intotherealaleph:mainfrom
therealaleph merged 1 commit intotherealaleph:mainfrom
Conversation
3d89577 to
72ff23f
Compare
…ettle max to 1 s
The batch coalesce step controls how long the client (and the
tunnel-node's straggler settle) waits between checking for more ops
to pack into the same batch. At 40 ms the wait was conservative —
good for packing uploads but needlessly slow on the download path
where the tunnel-node round-trip, not coalescing, is the bottleneck.
Lowering the step to 10 ms means we fire batches almost immediately
when there's nothing else queued, cutting ~30 ms of dead air on
every download-dominated round-trip. When both sides DO have data
in flight (uploads, bursty page loads), the adaptive reset still
works: each arriving op resets the 10 ms step timer, so a rapid
burst naturally coalesces up to the 1 s hard cap without wasting
quota on many small batches.
In short: don't wait when there's nothing to wait for; batch
aggressively when there is.
Client side:
- DEFAULT_COALESCE_STEP_MS 40 → 10 ms
- DEFAULT_COALESCE_MAX_MS unchanged at 1000 ms
Tunnel-node side:
- STRAGGLER_SETTLE_STEP 40 → 10 ms (matches client step)
- STRAGGLER_SETTLE_MAX 500 → 1000 ms (more room to pack
straggler responses when upstream targets reply at different
speeds — saves Apps Script quota on the return leg)
Users who prefer the old behaviour can set "coalesce_step_ms": 40
in config.json.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
72ff23f to
71d774c
Compare
Owner
|
@yyoyoian-pixel — reasoning is sound and I tested locally (179 lib + 33 tunnel-node tests green). The asymmetric design (small step, generous max) is the right framing — fast-fire when nothing else is queued, but adaptive coalesce on bursts. Merging. Will ship in v1.9.8 with a note that timing-sensitive deployments can override via [reply via Anthropic Claude | reviewed by @therealaleph] |
therealaleph
added a commit
that referenced
this pull request
May 3, 2026
…apps_script modes Android (#666 from @ilok67 with full root cause): - MainActivity.onStop was sending ACTION_STOP via startService() AND immediately calling stopService() on the same service. ACTION_STOP runs teardown() on a background thread that stopSelf()s at the end; the redundant stopService() triggered onDestroy() in parallel, racing the lifecycle and crashing on every Disconnect tap. Removed the stopService() — ACTION_STOP alone is sufficient for both the live-service and the zombie-after-process-death cases. The tornDown AtomicBoolean already guards against double-teardown of native state but couldn't protect against OS-level stopSelf vs stopService race. UI (#665 from @cmptrnb): - Test Relay button was showing red "test result: fail" status when used in full or direct mode. The underlying test_cmd::run deliberately refuses in those modes because probing Apps Script directly while the data plane goes via tunnel-node would give a misleading result, but the refuse path was getting translated to generic "test failed". UI now checks mode before running and shows a mode-specific explainer for full/direct (point users at https://whatismyipaddress.com in the browser via the proxy as the right way to verify). Includes already-merged PR #674 from @yyoyoian-pixel: drop client coalesce_step + tunnel-node straggler settle_step from 40 ms → 10 ms, raise tunnel-node settle max from 500 ms → 1000 ms. Asymmetric tuning: fast-fire when nothing else is queued, but adaptive coalesce on bursts. Backwards compatible — existing configs with explicit `coalesce_step_ms: 40` keep old behavior. Tests: 179 lib + 33 tunnel-node green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Lower the adaptive coalesce step (client) and straggler settle step (tunnel-node) from 40 ms to 10 ms. Raise the tunnel-node settle max from 500 ms to 1000 ms.
Rationale
The batch pipeline has two phases where data can accumulate:
The 40 ms step was conservative: it gave each phase a wide window to pack more data per batch, saving Apps Script round-trips. But this window is wasted on downloads — when the client is just waiting for data and has nothing new to send, each 40 ms step is pure dead air before the batch fires.
The fix is asymmetric by design:
Step: 40 → 10 ms — When there's nothing else to pack, fire almost immediately. The 10 ms still catches ops that land in the same event-loop tick (e.g. a browser opening 6 parallel connections on page load), so we don't degenerate into single-op batches on a burst.
Max (client): stays at 1000 ms — When both sides do have data (uploads, bursty page loads), the adaptive reset keeps packing: each arriving op resets the 10 ms step timer, so a rapid burst naturally coalesces up to the 1 s cap. This saves quota by packing many ops into fewer round-trips.
Settle max (tunnel-node): 500 → 1000 ms — More room to pack straggler upstream replies when targets respond at different speeds. One slow CDN shouldn't force a premature flush that wastes a whole Apps Script call for the late reply.
In short: don't wait when there's nothing to wait for; batch aggressively when there is.
Changes
coalesce_step_mscoalesce_max_msSTRAGGLER_SETTLE_STEPSTRAGGLER_SETTLE_MAXBackwards compatibility
Users with
"coalesce_step_ms": 40in config.json keep the old behaviour — the compiled default only applies when the field is absent or 0.Test plan
🤖 Generated with Claude Code