fix: add request timeouts and mask API key in repr #1262

Open
planetf1 wants to merge 1 commit into generative-computing:main from planetf1:fix/security-hardening

planetf1 (Contributor) commented Jun 12, 2026

Add explicit timeouts to two previously unbounded `requests` calls, and
mask the API key in `OpenAIBackend.__repr__` / `__str__`.

Timeouts

`RESTHandler.emit()` uses `timeout=2`: this handler fires synchronously
on every log event, so a slow or unreachable webhook endpoint would block
the calling thread on every log write. 2 s is enough to confirm liveness;
beyond that the record should be dropped.
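
As a rough illustration of the bounded-emit pattern described above, here is a minimal sketch. It is not the project's `RESTHandler`: the class name `WebhookHandler`, its constructor parameters, and the JSON payload shape are hypothetical, and it swaps in the standard library's `urllib.request` for `requests` so that it stays self-contained. With `requests`, the equivalent change is passing `timeout=2` to the call.

```python
import json
import logging
import urllib.request


class WebhookHandler(logging.Handler):
    """Hypothetical stand-in for a REST log handler: POSTs each record to a URL."""

    def __init__(self, url, timeout=2.0, opener=urllib.request.urlopen):
        super().__init__()
        self.url = url
        self.timeout = timeout
        # Injectable so the network call can be replaced with a fake in tests.
        self._open = opener

    def emit(self, record):
        body = json.dumps({"message": self.format(record)}).encode("utf-8")
        request = urllib.request.Request(
            self.url, data=body, headers={"Content-Type": "application/json"}
        )
        try:
            # Bounded wait: without timeout= a hung endpoint would block
            # the thread that issued the log call.
            with self._open(request, timeout=self.timeout):
                pass
        except Exception:
            # Timeouts, refused connections and HTTP errors all end up here.
            # The record is not retried; logging's error hook reports the failure.
            self.handleError(record)
```

Because `emit()` never re-raises, a dead endpoint costs the caller at most roughly the timeout per log event instead of blocking indefinitely.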

`is_vllm_server_with_structured_output()` uses `timeout=10`: this probe
runs once at `OpenAIBackend.__init__()` time, not on every request, so a
longer timeout costs nothing in throughput. vLLM's `/version` endpoint is
trivial (no GPU path), but cloud or VPN round-trips can be slow; 10 s covers
realistic slow-network cases while still failing fast on a dead server. A
false timeout here silently degrades structured output behaviour, so erring
toward patience is correct.
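
A one-shot probe with a generous timeout might look like the sketch below. It is an illustration only: `probe_server_version` is a hypothetical helper, not the real `is_vllm_server_with_structured_output()`, it uses `urllib.request` rather than `requests` to stay self-contained, and it assumes the `/version` endpoint returns a JSON object with a `version` field.

```python
import json
import urllib.request


def probe_server_version(base_url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the version string reported by the server, or None if the probe fails."""
    url = base_url.rstrip("/") + "/version"
    try:
        # Runs once at start-up, so a 10 s ceiling does not affect request throughput.
        with opener(url, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # OSError covers URLError, socket timeouts and refused connections;
        # ValueError covers a response body that is not valid JSON.
        return None
    version = payload.get("version") if isinstance(payload, dict) else None
    return version if isinstance(version, str) else None
```

Returning `None` on any failure makes the fallback explicit at the call site, which is where the silent degradation described above would otherwise happen.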

API key masking

`__repr__` and `__str__` previously inherited the default object repr,
which could expose the API key in logs, exception messages, or debug output.
Both now return `***` in place of the key.
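
The masking pattern can be sketched as follows. `ExampleBackend` and its attributes are hypothetical and do not reflect the actual fields of `OpenAIBackend`; only the idea of printing `***` in place of the key comes from the description above.

```python
class ExampleBackend:
    """Hypothetical backend used only to illustrate key masking."""

    def __init__(self, model_id, base_url, api_key):
        self.model_id = model_id
        self.base_url = base_url
        self._api_key = api_key

    def __repr__(self):
        # The key is never interpolated; a fixed placeholder is shown instead.
        return (
            f"{type(self).__name__}(model_id={self.model_id!r}, "
            f"base_url={self.base_url!r}, api_key='***')"
        )

    # str() and f-strings use the same masked form.
    __str__ = __repr__
```

Defining both methods explicitly means log lines, tracebacks, and interactive sessions all show the same masked form.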

Identified in the 2026-05-12 security audit.

Closes #1246

github-actions bot added the bug label Jun 12, 2026
planetf1 marked this pull request as ready for review June 12, 2026 11:39
planetf1 requested review from a team, jakelorocco and nrfulton as code owners June 12, 2026 11:39
planetf1 requested a review from AngeloDanducci June 12, 2026 11:39
Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
planetf1 force-pushed the fix/security-hardening branch from e8ed118 to 465f56f June 12, 2026 11:57
Development

Successfully merging this pull request may close these issues.

Security hardening: webhook validation, adapter path traversal, API key hygiene, and input validation
