fix tracing for operation id to flow in exception table.#47387
fix tracing for operation id to flow in exception table.#47387harsheet-shah wants to merge 16 commits into
Conversation
|
Thank you for your contribution @harsheet-shah! We will review the pull request and get back to you soon. |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Updates ASGI tracing middleware to ensure logs emitted after request handling (notably Hypercorn access logs) still carry the correct OpenTelemetry trace context, preventing zeroed operation_Id in Application Insights.
Changes:
- Creates a
NonRecordingSpanfrom extracted trace context soget_current_span()exposes the correcttrace_id. - Attaches the extracted context for the duration of the task and removes the previous
detach()infinallyto keep context available for post-response logging.
| span = trace.get_current_span(ctx) | ||
| span_ctx = span.get_span_context() | ||
| if span_ctx and span_ctx.trace_id: | ||
| non_recording = trace.NonRecordingSpan(span_ctx) |
There was a problem hiding this comment.
Why do we need to create a non-recording span ?
There was a problem hiding this comment.
We should validate consequent request and how traces show up.
There was a problem hiding this comment.
yes @singankit I've validated pls check this query - https://ms.portal.azure.com#@72f988bf-86f1-41af-91ab-2d7cd011db47/blade/Microsoft_OperationsManagementSuite_Workspace/Logs.ReactView/resourceId/%2Fsubscriptions%2F47f1c914-e299-4953-a99d-3e34644cfe1c%2FresourceGroups%2Fharsheetrg%2Fproviders%2Fmicrosoft.insights%2Fcomponents%2Fharsheet-app-insights-debug/source/LogsBlade.AnalyticsShareLinkToQuery/q/H4sIAAAAAAAAA0utSE4tKMnMzyvmqlEoz0gtSlUoycxNLS5JzC1QsLNVKMlPSSxJBQlpqBsZGJnpGoBQiKGplZGJlYGBnoWZqZGxRZS6JlB7cX5RiUJSJZIBicXJAH4x9HdhAAAA/limit/1000
Defer context detach by one event-loop turn so post-response access logging can still observe request trace context, while still restoring prior context to prevent leakage. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| await asyncio.sleep(0) | ||
| detach_mock.assert_called_once_with("token") | ||
|
|
||
| asyncio.run(run_test()) |
| await asyncio.sleep(0) | ||
| detach_mock.assert_called_once_with("token") | ||
|
|
||
| asyncio.run(run_test()) |
| with mock.patch("azure.ai.agentserver.core._tracing._otel_context.attach", return_value="token") as attach_mock, \ | ||
| mock.patch("azure.ai.agentserver.core._tracing._otel_context.detach") as detach_mock: | ||
| await middleware(scope, receive, send) | ||
| assert events == ["app-called"] | ||
| attach_mock.assert_called_once() | ||
| detach_mock.assert_not_called() | ||
|
|
||
| await asyncio.sleep(0) | ||
| detach_mock.assert_called_once_with("token") |
Validate detach_context swallows ValueError so call_soon(detach_context, token) cannot surface non-current-token detach errors as unhandled callback failures. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This reverts commit 66ae7a2.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use a dedicated safe callback for call_soon detaches and keep ValueError suppression behavior for invalid/non-current tokens without regressing deferred-detach tracing fix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use call_soon(detach_context, token) directly and keep ValueError handling in detach_context to avoid callback noise while preserving deferred trace context behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use an explicit deferred callback helper that detaches the attached token and swallows ValueError to avoid unhandled callback noise while preserving delayed context cleanup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wrap deferred detach scheduling in try/except and fall back to immediate best-effort detach so token cleanup still happens when the event loop is closing/closed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Route both call_soon callback and scheduling-failure fallback through detach_context(token) to keep detach semantics centralized and consistent. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add agent name/version and sessionid dimensions to log records via OTel log processor while retaining baggage propagation and span enrichment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolve agent name/version from AGENT_* env pairs when FOUNDRY_AGENT_* is absent, and remove gen_ai.agent.sessionid to rely on existing azure.ai.agentserver.session_id baggage field. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Please add an informative description that covers that changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines