fix: add TCP keepalive to default httpx transport to prevent NAT hangs #3368

Open
Oxygen56 wants to merge 1 commit into openai:main from Oxygen56:fix/tcp-keepalive-nat-hang-3269

Oxygen56 commented Jun 5, 2026

Fixes #3269: non-streaming calls silently hang forever behind NAT.

httpx's default transport does not enable TCP keepalive, causing non-streaming requests to hang silently behind NAT when idle connections are dropped by the NAT gateway.

This adds socket-level TCP keepalive options to the default HTTP client transport:

  • SO_KEEPALIVE — enable keepalive probes
  • TCP_KEEPIDLE (Linux) / TCP_KEEPALIVE (macOS) — 60s idle before first probe
  • TCP_KEEPINTVL — 60s between subsequent probes
  • TCP_KEEPCNT — 5 unacknowledged probes before declaring dead
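
For illustration, the four options above could be assembled with the standard library roughly as follows. This is a minimal sketch, not the PR's actual helper: the function name `build_keepalive_socket_options` and the constant names are assumptions, and only the values (60s idle, 60s interval, 5 probes) come from the list above.

```python
import socket
import sys
from typing import List, Tuple

# (level, option name, value), the shape expected by socket.setsockopt().
SocketOption = Tuple[int, int, int]

# Values taken from the PR description; the constant names are illustrative.
KEEPALIVE_IDLE_SECONDS = 60
KEEPALIVE_INTERVAL_SECONDS = 60
KEEPALIVE_PROBE_COUNT = 5


def build_keepalive_socket_options() -> List[SocketOption]:
    """Return keepalive options, skipping any the platform does not define."""
    options: List[SocketOption] = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]

    # Linux exposes TCP_KEEPIDLE; macOS names the equivalent option TCP_KEEPALIVE.
    if hasattr(socket, "TCP_KEEPIDLE"):
        options.append(
            (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, KEEPALIVE_IDLE_SECONDS)
        )
    elif sys.platform == "darwin" and hasattr(socket, "TCP_KEEPALIVE"):
        options.append(
            (socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, KEEPALIVE_IDLE_SECONDS)
        )

    # These constants are not defined everywhere, so probe for each one.
    if hasattr(socket, "TCP_KEEPINTVL"):
        options.append(
            (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, KEEPALIVE_INTERVAL_SECONDS)
        )
    if hasattr(socket, "TCP_KEEPCNT"):
        options.append(
            (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, KEEPALIVE_PROBE_COUNT)
        )
    return options
```

The returned list has the shape that the `socket_options` parameter of `httpx.HTTPTransport` accepts in httpx versions that support it.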

The default transport is installed with kwargs.setdefault, so any caller-supplied custom transport is left untouched.

This is the same approach used by the Anthropic Python SDK.
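
One detail of the kwargs.setdefault pattern may be worth noting: Python evaluates the second argument eagerly, so the fallback transport object is constructed even when the caller supplies their own. A lazy variant could look like the sketch below; `set_default_transport` and `factory` are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict


def set_default_transport(kwargs: Dict[str, Any], factory: Callable[[], Any]) -> None:
    """Install a default transport only if the caller did not pass one.

    Unlike kwargs.setdefault("transport", factory()), the factory is not
    called at all when "transport" is already present.
    """
    if "transport" not in kwargs:
        kwargs["transport"] = factory()


# Demonstration with a stand-in factory that records each call.
calls = []


def fake_factory() -> str:
    calls.append("built")
    return "default-transport"


caller_kwargs = {"transport": "custom-transport"}
set_default_transport(caller_kwargs, fake_factory)  # factory is not invoked

empty_kwargs: Dict[str, Any] = {}
set_default_transport(empty_kwargs, fake_factory)  # factory is invoked once
```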

httpx's default transport does not enable TCP keepalive, causing
non-streaming requests to hang silently behind NAT when idle
connections are dropped by the NAT gateway.

This adds socket-level TCP keepalive options (SO_KEEPALIVE, TCP_KEEPIDLE,
TCP_KEEPINTVL, TCP_KEEPCNT) to the default HTTP client transport.

Fixes openai#3269

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Oxygen56 requested a review from a team as a code owner on June 5, 2026, 15:52

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca02bf25f7

kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
kwargs.setdefault("limits", DEFAULT_CONNECTION_LIMITS)
kwargs.setdefault("follow_redirects", True)
kwargs.setdefault("transport", httpx.HTTPTransport(socket_options=_build_keepalive_socket_options()))

P1: Raise the httpx floor before using socket_options

This package still declares httpx>=0.23.0, <1 in pyproject.toml, but socket_options was only added to HTTPTransport/AsyncHTTPTransport in httpx 0.25.0. In environments that satisfy the current dependency with httpx 0.23.x or 0.24.x, constructing the default OpenAI client will raise TypeError: ... unexpected keyword argument 'socket_options' before any request is made. Please either guard this argument for older httpx versions or bump the minimum dependency.
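
One way to guard the argument, sketched under assumptions: check whether the transport class accepts a `socket_options` keyword before passing it. The helper names `supports_keyword` and `optional_socket_options` are hypothetical, and the two stand-in classes below only mimic older and newer transport signatures so the example runs without httpx installed.

```python
import inspect
from typing import Any, Callable, Dict, List, Tuple


def supports_keyword(factory: Callable[..., Any], name: str) -> bool:
    """Return True if `factory` declares a parameter called `name`."""
    try:
        parameters = inspect.signature(factory).parameters
    except (TypeError, ValueError):
        # Signature not introspectable; be conservative and omit the keyword.
        return False
    return name in parameters


def optional_socket_options(
    factory: Callable[..., Any], socket_options: List[Tuple[int, int, int]]
) -> Dict[str, Any]:
    """Build extra kwargs that are safe to pass to `factory`."""
    if supports_keyword(factory, "socket_options"):
        return {"socket_options": socket_options}
    return {}


# Stand-ins for a transport class before and after the keyword was added.
class OldTransport:
    def __init__(self, retries: int = 0) -> None:
        self.retries = retries


class NewTransport:
    def __init__(self, retries: int = 0, socket_options: Any = None) -> None:
        self.retries = retries
        self.socket_options = socket_options


old = OldTransport(**optional_socket_options(OldTransport, [(1, 9, 1)]))
new = NewTransport(**optional_socket_options(NewTransport, [(1, 9, 1)]))
```

Bumping the minimum httpx version in pyproject.toml, as the comment also suggests, would avoid the runtime check entirely at the cost of dropping support for older releases.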

kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
kwargs.setdefault("limits", DEFAULT_CONNECTION_LIMITS)
kwargs.setdefault("follow_redirects", True)
kwargs.setdefault("transport", httpx.HTTPTransport(socket_options=_build_keepalive_socket_options()))

P2: Preserve httpx client options when adding keepalive

Creating and passing a concrete transport here causes httpx.Client to use that transport as-is, so the limits, verify, cert, trust_env, http1, and http2 kwargs set on DefaultHttpxClient(...) are no longer applied to the default non-proxy transport. This regresses both the SDK's DEFAULT_CONNECTION_LIMITS and documented customization such as DefaultHttpxClient(http2=True) unless callers manually build their own transport. Please build the transport with the effective kwargs or use an approach that does not bypass httpx's transport initialization.
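
A possible shape for "build the transport with the effective kwargs", offered as a sketch rather than the SDK's implementation: collect the client options named in this comment and forward them to the transport constructor together with the socket options. The helper name `transport_kwargs_from_client_kwargs` is hypothetical, and whether the forwarded keys should also be removed from the client kwargs is left open here.

```python
from typing import Any, Dict, Iterable, Mapping, Tuple

# Client options the review comment lists as no longer reaching the transport.
TRANSPORT_OPTION_NAMES = ("verify", "cert", "trust_env", "http1", "http2", "limits")


def transport_kwargs_from_client_kwargs(
    client_kwargs: Mapping[str, Any],
    socket_options: Iterable[Tuple[int, int, int]],
) -> Dict[str, Any]:
    """Pick out transport-level options and add the keepalive socket options.

    Only keys present in `client_kwargs` are forwarded, so the transport's own
    defaults still apply to anything the caller did not set.
    """
    transport_kwargs = {
        name: client_kwargs[name]
        for name in TRANSPORT_OPTION_NAMES
        if name in client_kwargs
    }
    transport_kwargs["socket_options"] = list(socket_options)
    return transport_kwargs


# Example: options a caller might pass to DefaultHttpxClient(...).
example_client_kwargs = {"timeout": 30.0, "http2": True, "limits": "custom-limits"}
example = transport_kwargs_from_client_kwargs(example_client_kwargs, [(1, 9, 1)])
# The result could then be passed on, e.g. httpx.HTTPTransport(**example),
# assuming an httpx version whose transport accepts socket_options.
```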

kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
kwargs.setdefault("limits", DEFAULT_CONNECTION_LIMITS)
kwargs.setdefault("follow_redirects", True)
kwargs.setdefault("transport", httpx.AsyncHTTPTransport(socket_options=_build_keepalive_socket_options()))

P2: Preserve async httpx client options when adding keepalive

The async default client has the same issue: supplying a prebuilt AsyncHTTPTransport means httpx.AsyncClient does not apply its limits, TLS, env, or HTTP/2 options to the default transport. Users relying on DefaultAsyncHttpxClient(http2=True) or the SDK's default connection limits silently get httpx transport defaults instead. Please pass the effective options into the transport or avoid replacing httpx's normal transport construction.

Development

Successfully merging this pull request may close these issues.

[Bug] Non-streaming calls silently hang forever behind NAT — default httpx transport has no TCP keepalive

1 participant