fix(openai-agents): capture response.instructions as system prompt in generation spans#4131

Merged
dvirski merged 2 commits into main from dr/fix(openai-agents)-capture-response.instructions-as-system-prompt-in-generation-spans
May 18, 2026

Conversation

@dvirski
Contributor

@dvirski dvirski commented May 12, 2026

Fixes #3738.

Problem:
When an OpenAI Agent had instructions set, the system prompt was silently dropped from generation spans — only the conversation history was recorded, with no role: system message.

Fix:
Prepend response.instructions as a role: system message to the input messages array in _end_generation_span() before passing it to _extract_prompt_attributes(). This is consistent with how responses_wrappers.py handles it in the plain openai instrumentation.
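
The idea behind the fix can be pictured with a minimal sketch. This is not the code from _hooks.py: `with_system_prompt` is a hypothetical helper, and the dict-shaped messages are an assumption based on the role: system wording above.

```python
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


def with_system_prompt(
    messages: List[Message], instructions: Optional[str]
) -> List[Message]:
    """Return a new list with the instructions as a leading system message.

    Hypothetical helper for illustration only; it is not part of the
    instrumentation package.
    """
    if not instructions:
        # None or "" means there is no system prompt to record.
        return list(messages)
    return [{"role": "system", "content": instructions}] + list(messages)


history = [{"role": "user", "content": "What is the weather in Paris?"}]
traced = with_system_prompt(history, "You are a helpful weather assistant.")

# The system message comes first, followed by the untouched history.
print([m["role"] for m in traced])
```

Returning a new list rather than mutating the caller's list keeps the sketch free of side effects; whether the real hook copies or mutates is not stated in this PR.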

Summary by CodeRabbit

  • Bug Fixes

    • Tracing now includes agent instructions as a leading system message in traced input when content tracing is enabled, ensuring prompt context is captured correctly.
  • Tests

    • Updated assertions to reflect new message ordering and role-based lookup.
    • Added tests covering instruction handling: enabled/disabled tracing, empty input, and empty or missing instruction cases.

@CLAassistant

CLAassistant commented May 12, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0a7da33-0c62-4e57-a61f-ad597ada528b

📥 Commits

Reviewing files that changed from the base of the PR and between 465f302 and f2d0b22.

📒 Files selected for processing (3)
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_tracing_processor.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py

📝 Walkthrough

Prepend agent response.instructions as a leading system message to traced prompt input when trace_content is enabled; tests updated to assert the new message ordering and handle edge cases.

Changes

Capture response.instructions as system message

| Layer / File(s) | Summary |
| --- | --- |
| System message injection in _end_generation_span (packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py) | _end_generation_span now reads response before extracting prompt attributes and, when trace_content is true and response.instructions is truthy, prepends a system message with those instructions to input_data before calling _extract_prompt_attributes. |
| Unit tests for response.instructions handling (packages/opentelemetry-instrumentation-openai-agents/tests/test_tracing_processor.py) | Added tests covering: instructions are prepended as a system message when trace_content=True; instructions are omitted when trace_content=False; empty input yields only the system instruction message; falsy/empty instructions are skipped. |
| Integration test assertions for message ordering (packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py) | Updated test_agent_spans to expect the system message at index 0 and to locate the user message by role == "user", reflecting the injected system message ordering. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • galzilber
  • nina-kollman
  • max-deygin-traceloop
  • netanel-tl
  • doronkopit5

Poem

🐰 A system message hops into place,
First in the trace with gentle grace,
Instructions lead, then questions play,
Traces now capture what agents say,
Hooray—no instruction goes astray!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: capturing response.instructions as a system prompt in generation spans for OpenAI Agents instrumentation. |
| Linked Issues check | ✅ Passed | The PR implementation satisfies all requirements from issue #3738: captures response.instructions as system prompt, prepends it as first message, handles edge cases (empty input, falsy instructions), and includes defensive normalization for input_data type. |
| Out of Scope Changes check | ✅ Passed | All changes are directly within scope of the linked issue: hooks implementation to capture response.instructions, test updates for behavior verification, and edge case tests requested by reviewers. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%. |

@dvirski dvirski changed the title from "fix(openai-agents): capture response.instructions as system prompt in" to "fix(openai-agents): capture response.instructions as system prompt in generation spans" May 12, 2026
@doronkopit5
Member

doronkopit5 commented May 12, 2026

Two small suggestions before merge:

  1. Defensive normalization of input_data — at _hooks.py:946, if input_data is ever a str (the Responses API allows input: str), list + str will TypeError. Today the agents SDK passes a list, but a cheap guard avoids a future footgun:

    existing = input_data if isinstance(input_data, list) else []
    input_data = [{"role": "system", "content": response.instructions}] + existing

    This also keeps the line under the lint length limit.

  2. Two tiny edge-case tests worth adding alongside the existing ones:

    • input_data == [] with only instructions → asserts single system-message output.
    • instructions == "" → asserts the empty/falsy case is skipped (locks in the current getattr(..., None) truthy check).

Otherwise LGTM 👍
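
The failure mode in suggestion 1 and the two edge cases in suggestion 2 can be reproduced in isolation. In the sketch below, `guarded_prepend` is a hypothetical stand-in that copies the suggested guard and adds the truthy check on instructions; it is not the function in _hooks.py. Note that the guard as suggested discards a string input instead of converting it.

```python
def guarded_prepend(input_data, instructions):
    """Hypothetical stand-in that mirrors the guard suggested in the review."""
    if not instructions:
        # Empty or missing instructions: leave the input untouched.
        return input_data
    existing = input_data if isinstance(input_data, list) else []
    return [{"role": "system", "content": instructions}] + existing


# Without the guard, concatenating a list with a str raises TypeError.
try:
    [{"role": "system", "content": "Be brief."}] + "Hello"
    unguarded_error = None
except TypeError as exc:
    unguarded_error = exc

# Edge case 1: empty input with instructions yields a single system message.
only_system = guarded_prepend([], "Be brief.")

# Edge case 2: empty instructions are skipped, so the input comes back as-is.
unchanged = guarded_prepend([{"role": "user", "content": "Hi"}], "")

# String input: the guard avoids the TypeError but drops the string.
from_string = guarded_prepend("Hello", "Be brief.")

print(type(unguarded_error).__name__, only_system, unchanged, from_string)
```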

@dvirski dvirski force-pushed the dr/fix(openai-agents)-capture-response.instructions-as-system-prompt-in-generation-spans branch from c2d4b1d to 465f302 Compare May 18, 2026 07:18
@dvirski
Contributor Author

dvirski commented May 18, 2026

@doronkopit5

Fixed both:

  • Added defensive normalization of input_data at _hooks.py:946.
  • Added the two edge-case tests alongside the existing ones.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/opentelemetry-instrumentation-openai-agents/tests/test_tracing_processor.py`:
- Around line 755-884: Add a new unit test in tests/test_tracing_processor.py
that calls processor._end_generation_span with span_data.input set to a string
(e.g., "Hello") and response.instructions set to a non-empty string, calling
trace_content=True; verify that GenAIAttributes.GEN_AI_INPUT_MESSAGES is present
on the otel_span, parse the JSON, and assert the resulting messages array has
the system message (role "system" with the instruction content) followed by the
normalized user message, thereby locking in the defensive normalization behavior
in _end_generation_span.
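
A test along the lines requested above might look like the following sketch. It does not call the real processor or read a real span attribute: `normalize_input` and `serialize_input_messages` are hypothetical stand-ins, and wrapping a bare string as a single user message is an assumption about what "normalized user message" means here.

```python
import json


def normalize_input(input_data):
    """Assumed normalization: a bare string becomes one user message."""
    if isinstance(input_data, str):
        return [{"role": "user", "content": input_data}]
    if isinstance(input_data, list):
        return input_data
    return []


def serialize_input_messages(input_data, instructions):
    """Hypothetical stand-in for the JSON value recorded on the span."""
    messages = normalize_input(input_data)
    if instructions:
        messages = [{"role": "system", "content": instructions}] + messages
    return json.dumps(messages)


def test_string_input_keeps_user_message_after_system():
    raw = serialize_input_messages("Hello", "You are a helpful assistant.")
    messages = json.loads(raw)
    # The system message leads and the normalized user message follows.
    assert messages == [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ]


test_string_input_keeps_user_message_after_system()
```

In the real test suite the assertion would be made against the attribute set on otel_span rather than against a helper's return value.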

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 07b575a5-edff-46a4-a90c-1517725567ba

📥 Commits

Reviewing files that changed from the base of the PR and between c2d4b1d and 465f302.

📒 Files selected for processing (3)
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_tracing_processor.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
  • packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/_hooks.py

@dvirski dvirski force-pushed the dr/fix(openai-agents)-capture-response.instructions-as-system-prompt-in-generation-spans branch from 465f302 to f2d0b22 Compare May 18, 2026 14:25
@dvirski dvirski merged commit 709c825 into main May 18, 2026
11 checks passed

Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: OpenAI Agents instrumentor does not capture response.instructions (system prompt)

3 participants