Skip to content

fix(h2): make Client multiplex on h2 (#4143)#5362

Merged
mcollina merged 13 commits into
mainfrom
repro/h2-pipelining-default-issue-4143
Jun 6, 2026
Merged

fix(h2): make Client multiplex on h2 (#4143)#5362
mcollina merged 13 commits into
mainfrom
repro/h2-pipelining-default-issue-4143

Conversation

@mcollina

@mcollina mcollina commented Jun 5, 2026

Copy link
Copy Markdown
Member

Summary

Draft — fixes part of #4143. Makes h2 multiplexing actually work on a single Client (and on a Pool capped to connections: N). Before the fix, even though h2 was enabled by default in 8.0.0, a single h2 Client was capped at one in-flight stream by the h1 pipelining factor — completely defeating multiplexing on a shared session.

The companion problem — default Pool/Agent cold-burst fan-out (each concurrent request during ALPN opens its own TCP socket and h2 session) — is not fixed here. An earlier iteration of this PR did fix it via "may negotiate h2" speculation in Client.busy/getMaxConcurrent, but that regressed h1-over-https throughput ~10× when ALPN turned out to be h1, so it was reverted. Resolving the cold-burst case needs a Pool-level lazy-connect strategy and is left for a follow-up.

Two distinct settings, two gates

Per the public docs:

  • pipelining — HTTP/1.1 RFC7230 concurrent-request factor on one TCP/TLS connection (default 1).
  • maxConcurrentStreams — HTTP/2 multiplexing ceiling (default 100, server-overridable via SETTINGS).

The fix routes the h2 dispatch gate through maxConcurrentStreams and preserves pipelining as h1-only.

What was wrong

resume() and Client.busy (lib/dispatcher/client.js:619, :339) gated dispatch through getPipelining(client) regardless of protocol — so the h1 default of 1 capped a single h2 Client at one in-flight stream, completely defeating multiplexing on a shared session.

Fix

lib/dispatcher/client.js:

  • Introduces getMaxConcurrent(client) — protocol-aware dispatch ceiling. Returns kMaxConcurrentStreams once an h2 context is attached; otherwise getPipelining().
  • Client.busy and the resume() loop now consult getMaxConcurrent, not getPipelining.
  • Client.busy also suppresses the kPending > 0 fan-out signal when an h2 context is attached, so subsequent dispatches multiplex onto the existing session.
  • Client#pipelining keeps its h1 semantics (default 1, reflects only the user-set h1 factor). h1 keep-alive paths and h1-over-https are untouched.

Queued requests on a Client whose context hasn't attached yet drain in one batch once ALPN resolves — h2 multiplexes up to maxConcurrentStreams, h1 drains through the pipelining factor.

Tests (TDD)

test/http2-pipelining-default.js:

  1. h2 client multiplexes concurrent requests by default — fires N concurrent client.request calls against an h2 server through a single Client, asserts peakInFlight === N. Before the fix: 1.
  2. Pool with connections=1 multiplexes h2 streams on the single session — fires N concurrent pool.request calls at Pool({ connections: 1 }), asserts the server saw 1 h2 session and 1 TCP socket. The test comments explicitly leave the unconstrained-Pool case for a follow-up.
  3. Client#pipelining keeps its h1 (RFC7230) semantic on an h2 client — locks in the public contract: even after negotiating h2, client.pipelining === 1 and remains user-settable.

test/http2-abort.js (#2364 - Concurrent aborts, both variants): made defensive against ERR_HTTP2_INVALID_STREAM and post-test ECONNRESET — those crashes are now reachable because the concurrent requests truly multiplex. Cleanup moved into the test body so the resulting session teardown lands while the test is still running.

Repro — where the fix dominates

The standing repro file (repro-h2-pipelining-default.mjs) — Pool({ connections: 1 }), 5 concurrent requests, 1 s server delay — is the real-world AWS-SDK shape: a single Client with many concurrent dispatches.

Config Pre-PR This PR
default (allowH2: true, p: 1) 1067 ms, sockets 5, sessions 5 1065 ms, sockets 5, sessions 5 (follow-up)
connections: 1, p: 1 (default) 5032 ms, 1 stream slot — serialized 1015 ms — multiplexed ← ~5×
connections: 1, pipelining: 100 1023 ms (workaround) 1009 ms

The middle row is the AWS-SDK-shaped failure mode — restored to working order.

Benchmarks

Captured on Node 24.14.1, 50 connections, pipelining depth 10. undici - dispatch previously errored on every recent run because SimpleRequest in the bench files used the legacy handler API — also migrated to the current onRequestStart / onResponseStart / onResponseData / onResponseEnd / onResponseError shape so it actually runs. Added an npm run bench:https script so the h1-over-TLS path is reproducible.

Before / after — undici rows only

The headline connections=50, parallelRequests=100 benchmark fans out across 50 Clients regardless of the per-Client multiplex ceiling, so it hides the per-connection serialization that the bug caused — that's why the deltas are small. The repro above is where the fix dominates.

HTTP/1.1 (path structurally unchanged — confirms no regression):

Workload Pre-PR This PR Δ
undici - fetch 5660 5438 −3.9%
undici - pipeline 12800 13470 +5.2%
undici - request 16846 16850 ±0
undici - stream 18369 18488 +0.6%
undici - dispatch 20802 20786 ±0

HTTP/1.1 over HTTPS (this is the row the earlier speculation iteration of the PR regressed 11×; after the revert it sits within noise):

Workload Pre-PR This PR Δ
undici - fetch 3608 3721 +3.1%
undici - pipeline 6362 6254 −1.7%
undici - request 6703 6669 −0.5%
undici - stream 7093 7019 −1.0%
undici - dispatch 7184 7361 +2.5%

HTTP/2 (modest gain on the workloads sensitive to single-Client serialization):

Workload Pre-PR This PR Δ
undici - fetch 3499 3499 ±0
undici - pipeline 5198 5836 +12.3%
undici - request 6825 6831 ±0
undici - stream 6977 6874 −1.5%
undici - dispatch 7391 7791 +5.4%

All deltas are within ±~3% run-to-run noise except undici - pipeline on h2 (+12%) and undici - dispatch on h2 (+5%) — both consistent with the fix removing a per-Client pipelining=1 cap that previously throttled h2 dispatches.

Full tables (this PR)

HTTP/1.1

┌────────────────────────┬─────────┬────────────────────┬────────────┬─────────────────────────┐
│  Tests                 │ Samples │ Result             │ Tolerance  │ Difference with slowest │
├────────────────────────┼─────────┼────────────────────┼────────────┼─────────────────────────┤
│  'node-fetch'          │ 50      │ '4711.86 req/sec'  │ '± 2.92 %' │ '-'                     │
│  'undici - fetch'      │ 75      │ '5438.50 req/sec'  │ '± 2.97 %' │ '+ 15.42 %'             │
│  'axios'               │ 45      │ '5448.08 req/sec'  │ '± 2.98 %' │ '+ 15.62 %'             │
│  'request'             │ 65      │ '5809.63 req/sec'  │ '± 2.90 %' │ '+ 23.30 %'             │
│  'http - no keepalive' │ 35      │ '5910.77 req/sec'  │ '± 2.87 %' │ '+ 25.44 %'             │
│  'got'                 │ 50      │ '6047.80 req/sec'  │ '± 2.91 %' │ '+ 28.35 %'             │
│  'superagent'          │ 60      │ '7534.53 req/sec'  │ '± 2.97 %' │ '+ 59.91 %'             │
│  'http - keepalive'    │ 75      │ '9343.41 req/sec'  │ '± 2.90 %' │ '+ 98.30 %'             │
│  'undici - pipeline'   │ 65      │ '13470.70 req/sec' │ '± 2.93 %' │ '+ 185.89 %'            │
│  'undici - request'    │ 80      │ '16850.87 req/sec' │ '± 2.93 %' │ '+ 257.63 %'            │
│  'undici - stream'     │ 101     │ '18488.56 req/sec' │ '± 3.81 %' │ '+ 292.38 %'            │
│  'undici - dispatch'   │ 101     │ '20786.44 req/sec' │ '± 3.08 %' │ '+ 341.15 %'            │
└────────────────────────┴─────────┴────────────────────┴────────────┴─────────────────────────┘

HTTP/1.1 over HTTPS

┌────────────────────────┬─────────┬───────────────────┬────────────┬─────────────────────────┐
│  Tests                 │ Samples │ Result            │ Tolerance  │ Difference with slowest │
├────────────────────────┼─────────┼───────────────────┼────────────┼─────────────────────────┤
│  'https - no keepalive'│ 10      │ '1358.40 req/sec' │ '± 1.99 %' │ '-'                     │
│  'undici - fetch'      │ 30      │ '3721.76 req/sec' │ '± 2.97 %' │ '+ 173.98 %'            │
│  'https - keepalive'   │ 35      │ '5633.91 req/sec' │ '± 2.84 %' │ '+ 314.75 %'            │
│  'undici - pipeline'   │ 15      │ '6254.05 req/sec' │ '± 2.80 %' │ '+ 360.40 %'            │
│  'undici - request'    │ 25      │ '6669.80 req/sec' │ '± 2.73 %' │ '+ 391.01 %'            │
│  'undici - stream'     │ 25      │ '7019.04 req/sec' │ '± 2.77 %' │ '+ 416.71 %'            │
│  'undici - dispatch'   │ 20      │ '7361.85 req/sec' │ '± 2.90 %' │ '+ 441.95 %'            │
└────────────────────────┴─────────┴───────────────────┴────────────┴─────────────────────────┘

HTTP/2

┌────────────────────────┬─────────┬───────────────────┬────────────┬─────────────────────────┐
│  Tests                 │ Samples │ Result            │ Tolerance  │ Difference with slowest │
├────────────────────────┼─────────┼───────────────────┼────────────┼─────────────────────────┤
│  'undici - fetch'      │ 45      │ '3499.03 req/sec' │ '± 2.93 %' │ '-'                     │
│  'native - http2'      │ 25      │ '4904.58 req/sec' │ '± 2.81 %' │ '+ 40.17 %'             │
│  'undici - pipeline'   │ 60      │ '5836.82 req/sec' │ '± 2.99 %' │ '+ 66.81 %'             │
│  'undici - request'    │ 65      │ '6831.25 req/sec' │ '± 2.83 %' │ '+ 95.23 %'             │
│  'undici - stream'     │ 55      │ '6874.30 req/sec' │ '± 2.91 %' │ '+ 96.46 %'             │
│  'undici - dispatch'   │ 55      │ '7791.23 req/sec' │ '± 2.96 %' │ '+ 122.67 %'            │
└────────────────────────┴─────────┴───────────────────┴────────────┴─────────────────────────┘

Docs

  • docs/docs/api/Client.md — clarified pipelining is h1-only and points readers at maxConcurrentStreams for the h2 dispatch ceiling. maxConcurrentStreams now documents its three roles: per-Client h2 dispatch ceiling, the peerMaxConcurrentStreams advertised to the server, and that the user's value is a pre-SETTINGS default.
  • docs/docs/api/H2CClient.md — matched wording on server SETTINGS replacement, noted H2CClient's construction-time aliasing of pipelining to maxConcurrentStreams.
  • docs/docs/api/Pool.md, docs/docs/api/Agent.md — added a note that connections: 1 (or any cap) lets concurrent requests share an h2 session up to maxConcurrentStreams; the default unlimited connections still opens a fresh Client per concurrent request on cold start until h2 multiplexing kicks in on subsequent dispatches.
  • README.md — refreshed bench tables (Node 22.11.0 → Node 24.14.1) and added the new HTTPS and HTTP/2 tables.

Test plan

  • New regression tests (test/http2-pipelining-default.js)
  • All test/http2-*.js (66 tests pass)
  • All test/client*.js (293 pass / 1 skipped)
  • test/pool*.js, test/agent*.js, test/client-pipelining.js (59 pass)
  • npm run lint
  • H1, h1-over-HTTPS, and h2 benches re-run cleanly with undici - dispatch working — no regressions
  • Follow-up: Pool-level lazy-connect strategy so the default unlimited Pool also consolidates onto a single h2 session on cold burst
  • Follow-up: defaultPipelining: Infinity on the h2 context is dead code now and can be removed

Adds a minimal reproduction showing that with allowH2 enabled by default
(8.0.0+), `pipelining` still defaults to 1, which either serializes
concurrent requests on a shared H2 session or forces Agent to open one
TCP socket per concurrent request. Both modes look like an H2 regression
for callers like the AWS SDK.

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@codecov-commenter

codecov-commenter commented Jun 5, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.25%. Comparing base (313f4e0) to head (1e154f7).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5362   +/-   ##
=======================================
  Coverage   93.24%   93.25%           
=======================================
  Files         110      110           
  Lines       36719    36738   +19     
=======================================
+ Hits        34238    34259   +21     
+ Misses       2481     2479    -2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Before this change, Client[kPipelining] defaulted to 1 unconditionally, so
getPipelining() returned 1 even after ALPN negotiated h2 — defeating h2
multiplexing on a shared session and serializing concurrent requests behind
a single in-flight stream.

Leave kPipelining null when the user did not specify it, so getPipelining()
falls through to kHTTPContext.defaultPipelining (1 for h1, Infinity for h2)
once the connection is established. This matches what H2CClient already
does at construction time.

The h1 parser/writer paths that previously read client[kPipelining] as
truthy now use `?? 1` to preserve h1 keep-alive defaults.

The Client#pipelining getter now returns the effective pipelining via
getPipelining() so it reflects the protocol-aware value.

Also makes the http2-abort.js fixture defensive against the peer
RST_STREAM-ing before the server responds, which is now reachable
because the concurrent requests truly multiplex.

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina changed the title repro: H2 default pipelining serializes / fans out (#4143) fix: default h2 Client pipelining so multiplexing actually works (#4143) Jun 5, 2026
Per the public docs, `pipelining` is the HTTP/1.1 RFC7230 concurrent-request
factor (default 1), and `maxConcurrentStreams` is the HTTP/2 multiplexing
ceiling (default 100). Conflating them — as the resume() and Client.busy
gates did — meant a single h2 Client was capped at one in-flight stream by
the h1 pipelining factor.

Introduce a protocol-aware helper `getMaxConcurrent(client)` that returns
`kMaxConcurrentStreams` when the active context is h2, and falls back to
`getPipelining()` otherwise. Use it in both dispatch gates.

This preserves the public semantics of `Client#pipelining` (still reflects
only the h1 factor, defaults to 1) and leaves h1 keep-alive paths untouched.

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina changed the title fix: default h2 Client pipelining so multiplexing actually works (#4143) fix(h2): route Client dispatch gate through maxConcurrentStreams (#4143) Jun 6, 2026
mcollina added 2 commits June 6, 2026 10:56
- `pipelining` is HTTP/1.1 RFC7230 only; note that it has no effect on h2
  sessions, pointing readers at `maxConcurrentStreams` instead.
- `maxConcurrentStreams` documents its three roles: the per-Client h2
  dispatch ceiling, the `peerMaxConcurrentStreams` advertised to the
  server, and a pre-`SETTINGS` default rather than a hard cap (the server
  value replaces it whenever one is sent).

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
- H2CClient: match Client.md wording on server SETTINGS replacing the
  user-supplied value, surface `peerMaxConcurrentStreams`, and note that
  H2CClient (unlike Client) aliases pipelining to maxConcurrentStreams at
  construction time.
- Pool: add a note that with the default unlimited `connections`, Pool
  opens a new Client (== a new socket) per concurrent dispatch, which
  defeats h2 multiplexing on a shared session — to benefit from it, cap
  `connections`.
- Agent: same note via its per-origin Pool.

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina marked this pull request as ready for review June 6, 2026 09:33
With h2 multiplexing actually working, the #2364 tests can finish their
`t.completed` before the server-side setTimeout(400) callback fires.
When `after()` then closes the client, the server's h2 session reads a
RST/FIN that lands as `ECONNRESET` — an asynchronous error landing after
the test ended, which node:test escalates to an `uncaughtException`.

Attach an `'error'` handler on each server-side session to swallow that
post-test ECONNRESET. The streams themselves are still guarded against
ERR_HTTP2_INVALID_STREAM.

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
mcollina added 3 commits June 6, 2026 12:10
The session.on('error') alone wasn't enough on macOS/Windows. Also
suppress 'error' on the underlying TLS socket (via secureConnection),
on session.socket, and on the http2 server itself, since the post-test
ECONNRESET can land on any of those depending on platform timing.

Signed-off-by: Matteo Collina <hello@matteocollina.com>
Closing the client inside the test body (instead of in `after()`) keeps
the resulting session teardown and any ECONNRESET inside the test's
lifetime, where the error handlers can swallow it and node:test no
longer sees "async activity after the test ended" on macOS/Windows.

For the 2nd variant, also wait 500ms after closing so the server-side
400ms timer fires (it no-ops on a destroyed stream) before the test
function returns.

The existing `after(() => client.close())` hooks are guarded against
ClientDestroyedError now that the body closes the client itself.

Signed-off-by: Matteo Collina <hello@matteocollina.com>
Pool's GetDispatcher picks a Client with !kNeedDrain, otherwise spawns a
new one. With the default allowH2 (https + ALPN h2), the first request
opens a Client and the resulting kPending > 0 on that Client made
Client[kBusy] return true — so a burst of concurrent requests would
spawn one Client (and one h2 session, one TCP socket, one TLS handshake)
per request *during the handshake*, even though all of them could
multiplex on a single session once h2 negotiated.

The Client now treats h2 as "may negotiate" while the URL is https and
allowH2 is on, even before the HTTP context attaches. In that state:

- kBusy ignores the per-request kPending > 0 fan-out signal
- getMaxConcurrent returns maxConcurrentStreams instead of pipelining

so concurrent dispatches share the first Client. Once the context
attaches, h2 keeps multiplexing; h1 contracts back to the pipelining
ceiling and the queue drains accordingly. Plain http and explicit
allowH2: false are untouched.

Regression test asserts a default Pool consolidates N concurrent
requests onto one h2 session / one socket.

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina changed the title fix(h2): route Client dispatch gate through maxConcurrentStreams (#4143) fix(h2): make Client and Pool actually multiplex on h2 (#4143) Jun 6, 2026
mcollina added 4 commits June 6, 2026 13:46
Re-ran benchmarks/benchmark.js (50 connections, pipelining=10) on Node
24.14.1. Relative ordering of dispatchers is essentially unchanged.

`undici - dispatch` is omitted because its SimpleRequest handler still
uses the legacy onConnect/onHeaders/onData/onComplete API and trips the
current assertRequestHandler validator on main; the bench files need a
separate refresh.

Signed-off-by: Matteo Collina <hello@matteocollina.com>
`SimpleRequest` in all four bench files used the legacy
onConnect/onHeaders/onData/onComplete/onError handler API and tripped
assertRequestHandler, so `undici - dispatch` reported `Errored` on every
recent run. Migrated to the current onRequestStart/onResponseStart/
onResponseData/onResponseEnd/onResponseError API and re-ran the h1 and
h2 benches on Node 24.14.1.

Updated README:
- H1 table re-captured with the working dispatch row (now leads at
  ~20.4k req/sec as expected).
- New H2 table added for benchmarks/benchmark-http2.js (same shape:
  50 connections, pipelining depth 10).

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
The earlier 'may negotiate h2' speculation queued every concurrent
dispatch onto the first connecting Client. When ALPN turned out to be
h1 (e.g. plain https against an h1-only server), those requests then
serialized through `pipelining` — a ~10x throughput regression on the
https benchmark.

Drop the speculation. `Client.busy` and `getMaxConcurrent` now consider
multiplexing only when an h2 context is actually attached. Single-Client
h2 multiplexing still works (the queued requests drain in one batch
after the context attaches) and `Pool({ connections: 1 })` consolidates
onto one h2 session. The cold-burst unconstrained-Pool case (no
`connections` cap) still fans out one socket per request during the
TLS handshake; resolving that needs a Pool-level lazy-connect
strategy and is left for a follow-up.

Test reshaped accordingly: the unconstrained-Pool assertion is replaced
by a `connections: 1` test, with an inline comment pointing at the
follow-up.

Also adds `benchmarks` package.json scripts for the h1-over-https bench
so the regression test path is reproducible (`npm run bench:https`).

Signed-off-by: Matteo Collina <matteo.collina@gmail.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
Captured all three on Node 24.14.1, 50 connections, pipelining depth 10.

- HTTP/1.1: re-captured on the same setup. Numbers within run-to-run
  variance of the prior table.
- HTTP/1.1 over HTTPS: new table, exercising the h1-over-TLS path that
  the speculation revert is meant to keep fast.
- HTTP/2: re-captured.

Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina changed the title fix(h2): make Client and Pool actually multiplex on h2 (#4143) fix(h2): make Client multiplex on h2 (#4143) Jun 6, 2026
@mcollina mcollina merged commit 0acef15 into main Jun 6, 2026
66 of 67 checks passed
@mcollina mcollina deleted the repro/h2-pipelining-default-issue-4143 branch June 6, 2026 15:58
@github-actions github-actions Bot mentioned this pull request Jun 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants