Skip to content

feat(openai-temporal): render hosted/server-side tool calls in TemporalStreamingModel#442

Merged
declan-scale merged 2 commits into
nextfrom
declan-scale/oai-temporal-hosted-tools
Jun 24, 2026
Merged

feat(openai-temporal): render hosted/server-side tool calls in TemporalStreamingModel#442
declan-scale merged 2 commits into
nextfrom
declan-scale/oai-temporal-hosted-tools

Conversation

@declan-scale

@declan-scale declan-scale commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Stacked on #436. Renders Responses-API hosted/server-side tool calls (web_search, file_search, code_interpreter, image generation, server-side mcp, computer_call, local_shell_call) as ToolRequestContent + ToolResponseContent pairs in TemporalStreamingModel.

These tools execute inside the Responses API, so they never become function_call items and the SDK RunHooks (on_tool_start/on_tool_end) never fire for them. Previously they produced no UI event at all. Each completed hosted-tool output item is now surfaced as a request/response pair, mirroring how function_call tools already render.

This upstreams the hosted-tool half of golden_agent's vendored RichToolStreamingModel so that agent can stop vendoring (its "Phase 4"). The reasoning double-render fix the vendored model also carried already landed upstream (a5335983, fixed at the streaming-layer root cause), so only hosted-tool handling is ported here.

Changes

  • New module helpers _HOSTED_TOOL_TYPES, _coerce_args, _hosted_tool_request, _hosted_tool_result.
  • New _post_tool_message method (one-shot full message, no deltas).
  • New elif item.type in _HOSTED_TOOL_TYPES branch in the ResponseOutputItemDoneEvent handler.
  • Purely additive: existing function_call, reasoning, and text handling are unchanged.

Test plan

  • 8 new unit tests (tests/test_hosted_tools.py) covering the extraction helpers.
  • Existing openai_agents plugin suite (45 tests) stays green; ruff clean.
  • Live model-agnostic turn exercising a hosted tool (e.g. web_search / server-side mcp) renders a request+response pair in the UI.

Note: base is declan-scale/init-templates-codex (#436 head) — review/merge #436 first.

🤖 Generated with Claude Code

Greptile Summary

This PR adds rendering of Responses-API hosted/server-side tool calls (web_search, file_search, code_interpreter, image_generation, mcp, computer_call, local_shell_call) as ToolRequestContent+ToolResponseContent pairs in TemporalStreamingModel. Previously, these tools executed silently inside the API with no UI event.

  • New module-level helpers (_HOSTED_TOOL_TYPES, _coerce_args, _hosted_tool_request, _hosted_tool_result) extract call ID, display name, args, and a capped result string from completed hosted-tool SDK items.
  • A new _post_tool_message one-shot method posts both halves of the request/response pair using the same streaming_task_message_context START→FULL pattern used by the existing function_call handler.
  • An elif branch in the ResponseOutputItemDoneEvent handler triggers the new path without touching any existing function_call, reasoning, or text handling.

Confidence Score: 5/5

Purely additive change with no modifications to existing function_call, reasoning, or text paths; safe to merge.

All new code follows the established START→FULL streaming pattern used by the function_call handler. The elif branch is correctly scoped, helper functions are well-tested with real SDK types, and the hosted-tool items naturally flow through the existing final response processing via ResponseCompletedEvent. No changes to existing streaming paths.

No files require special attention.

Important Files Changed

Filename Overview
src/agentex/lib/core/temporal/plugins/openai_agents/models/temporal_streaming_model.py Adds ~100 lines of purely additive hosted-tool helpers and one new elif branch in the event loop; existing function_call, reasoning, and text paths are unchanged. Logic is consistent with the established streaming pattern (START→FULL via streaming_task_message_context). No correctness issues found.
src/agentex/lib/core/temporal/plugins/openai_agents/tests/test_hosted_tools.py 8 focused unit tests covering all extraction helpers; key SDK types (ResponseFunctionWebSearch, ResponseComputerToolCall, LocalShellCall, ImageGenerationCall) are now used as real types rather than SimpleNamespace stand-ins, closing the gap flagged in a prior review round.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant API as Responses API
    participant TSM as TemporalStreamingModel
    participant PM as _post_tool_message
    participant SC as StreamingTaskMessageContext
    participant UI as UI / Redis Stream

    API->>TSM: ResponseOutputItemDoneEvent (hosted tool)
    TSM->>TSM: _hosted_tool_request(item) → call_id, name, args
    TSM->>TSM: _hosted_tool_result(item) → result string

    TSM->>PM: _post_tool_message(ToolRequestContent)
    PM->>SC: async with streaming_task_message_context(initial_content)
    SC->>UI: START event (ToolRequestContent)
    PM->>SC: stream_update(StreamTaskMessageFull)
    SC->>UI: FULL event (ToolRequestContent)
    SC-->>PM: __aexit__ → close() early-returns (already closed)

    TSM->>PM: _post_tool_message(ToolResponseContent)
    PM->>SC: async with streaming_task_message_context(initial_content)
    SC->>UI: START event (ToolResponseContent)
    PM->>SC: stream_update(StreamTaskMessageFull)
    SC->>UI: FULL event (ToolResponseContent)
    SC-->>PM: __aexit__ → close() early-returns (already closed)
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant API as Responses API
    participant TSM as TemporalStreamingModel
    participant PM as _post_tool_message
    participant SC as StreamingTaskMessageContext
    participant UI as UI / Redis Stream

    API->>TSM: ResponseOutputItemDoneEvent (hosted tool)
    TSM->>TSM: _hosted_tool_request(item) → call_id, name, args
    TSM->>TSM: _hosted_tool_result(item) → result string

    TSM->>PM: _post_tool_message(ToolRequestContent)
    PM->>SC: async with streaming_task_message_context(initial_content)
    SC->>UI: START event (ToolRequestContent)
    PM->>SC: stream_update(StreamTaskMessageFull)
    SC->>UI: FULL event (ToolRequestContent)
    SC-->>PM: __aexit__ → close() early-returns (already closed)

    TSM->>PM: _post_tool_message(ToolResponseContent)
    PM->>SC: async with streaming_task_message_context(initial_content)
    SC->>UI: START event (ToolResponseContent)
    PM->>SC: stream_update(StreamTaskMessageFull)
    SC->>UI: FULL event (ToolResponseContent)
    SC-->>PM: __aexit__ → close() early-returns (already closed)
Loading

Reviews (4): Last reviewed commit: "fix(openai-temporal): render args for co..." | Re-trigger Greptile

@declan-scale declan-scale force-pushed the declan-scale/init-templates-codex branch from 5d5a0e8 to b5cbe37 Compare June 23, 2026 22:29
@declan-scale declan-scale force-pushed the declan-scale/oai-temporal-hosted-tools branch from ef09c56 to 99895bb Compare June 23, 2026 22:40
@declan-scale declan-scale force-pushed the declan-scale/init-templates-codex branch from b5cbe37 to e089205 Compare June 24, 2026 02:09
Base automatically changed from declan-scale/init-templates-codex to next June 24, 2026 02:15
…alStreamingModel

web_search, file_search, code_interpreter, image generation, and server-side mcp
calls run inside the Responses API: they never become function_call items and the
SDK RunHooks never fire, so the streaming model dropped them entirely (no UI
event). Surface each completed hosted-tool output item as a ToolRequestContent +
ToolResponseContent pair, mirroring how function_call tools render.

This upstreams the hosted-tool half of the golden agent's vendored
RichToolStreamingModel so that agent can stop vendoring. The reasoning
double-render fix it also carried is already handled upstream (a533598), so only
the hosted-tool handling is ported here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@declan-scale declan-scale force-pushed the declan-scale/oai-temporal-hosted-tools branch from 99895bb to ec0df6c Compare June 24, 2026 02:17
…ration hosted tools

Add _hosted_tool_request branches for computer_call and local_shell_call
(both carry an action object) and surface a compact image reference for
image_generation_call results, so these hosted tools show real detail in
the trace instead of empty args. Harden the web_search test to use the
real ResponseFunctionWebSearch/ActionSearch SDK types (the action field
exists on the SDK type) and add coverage for the new branches.

Addresses Greptile review feedback on #442.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@declan-scale declan-scale merged commit 5dce9f0 into next Jun 24, 2026
50 checks passed
@declan-scale declan-scale deleted the declan-scale/oai-temporal-hosted-tools branch June 24, 2026 02:49
@stainless-app stainless-app Bot mentioned this pull request Jun 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant