Updated the sample script to support additional command-line arguments for trace evaluations, including an agent ID and explicit trace IDs. Changed the default lookback-hours value and restructured the script for clarity.

This sample demonstrates how to run Azure AI Evaluations against a hosted agent using the `azure_ai_target_completions` data source, evaluating agents live with built-in quality and safety evaluators.
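To make the data-source shape concrete, here is a hypothetical sketch of the payload a caller might assemble for a live agent-as-target run. Only the `azure_ai_target_completions` type name comes from this PR; every other key (`target`, `agent_name`, `inputs`, `query`) is an illustrative placeholder, not the SDK's actual schema.

```python
# Hypothetical sketch, NOT the real azure-ai-projects schema: only the
# "azure_ai_target_completions" type name is taken from the PR description;
# all other field names below are illustrative placeholders.
def build_target_data_source(agent_name: str, test_queries: list[str]) -> dict:
    """Assemble a data-source dict that asks the service to invoke the
    hosted agent live for each query, then evaluate the completions."""
    return {
        "type": "azure_ai_target_completions",   # data source named in the PR
        "target": {"agent_name": agent_name},    # placeholder field names
        "inputs": [{"query": q} for q in test_queries],
    }

source = build_target_data_source("my-hosted-agent", ["What is my order status?"])
print(source["type"])  # -> azure_ai_target_completions
```

Consult the actual sample file for the real field names accepted by the service.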
Pull request overview
This PR updates the Azure AI Projects evaluation samples to better support “agent traces as evaluation inputs” scenarios and adds a new sample for evaluating a hosted agent live as the evaluation target.
Changes:
- Enhanced `sample_evaluations_builtin_with_traces.py` to support multiple invocation modes (default App Insights query, server-side `--agent-id`, and explicit `--trace-ids`) plus new CLI flags.
- Added `sample_evaluations_agent_as_target.py`, demonstrating live agent evaluation via `azure_ai_target_completions`.
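The three invocation modes above can be sketched with stdlib `argparse` alone. The `--agent-id` and `--trace-ids` flag names come from this PR; the `--lookback-hours` name and its default of 24 are assumptions for illustration, not necessarily what the sample uses.

```python
import argparse

# Minimal sketch of the sample's three invocation modes. --agent-id and
# --trace-ids are named in the PR; --lookback-hours and its default value
# are assumed for illustration.
parser = argparse.ArgumentParser(description="Trace-based evaluation sample")
mode = parser.add_mutually_exclusive_group()
mode.add_argument("--agent-id", help="server-side query of traces by agent ID")
mode.add_argument("--trace-ids", nargs="+", help="evaluate these explicit trace IDs")
parser.add_argument("--lookback-hours", type=int, default=24,
                    help="window for the default client-side App Insights query")

# Example: explicit trace-IDs mode; omitting both flags falls back to the
# default App Insights query over the lookback window.
args = parser.parse_args(["--trace-ids", "abc123", "def456"])
print(args.trace_ids)  # -> ['abc123', 'def456']
```

Making the two flags mutually exclusive mirrors the fact that each run uses exactly one of the three modes.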
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `sdk/ai/azure-ai-projects/samples/evaluations/sample_evaluations_builtin_with_traces.py` | Adds CLI-driven modes for trace-based evaluations and tweaks defaults/metadata/cleanup behavior. |
| `sdk/ai/azure-ai-projects/samples/evaluations/sample_evaluations_agent_as_target.py` | New sample showing how to run evaluations where the target is a hosted agent invoked live. |
Agent-Logs-Url: https://github.com/Azure/azure-sdk-for-python/sessions/f15ae794-b773-4e8c-860b-5aea55873600
Co-authored-by: shrutiyer <9905402+shrutiyer@users.noreply.github.com>

Add two new evaluation samples:
- `sample_evaluations_builtin_with_traces.py`: trace-based evaluation with three modes (client-side App Insights query, server-side agent ID, and explicit trace IDs)
- `sample_evaluations_agent_as_target.py`: live agent evaluation using the `azure_ai_target_completions` data source

Both samples use the `azure_ai_evaluator` config pattern with the builtin `intent_resolution` and `task_adherence` evaluators.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
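The shared evaluator configuration both samples use might look like the following sketch. The `azure_ai_evaluator` type and the `intent_resolution` / `task_adherence` evaluator names are taken from the commit message; the exact dict shape is an assumption for illustration.

```python
# Hypothetical sketch of the shared evaluator config. The "azure_ai_evaluator"
# type and the two builtin evaluator names come from the PR; the dict layout
# itself is assumed, not the SDK's verified schema.
def builtin_evaluators() -> dict:
    """Build a config mapping each builtin evaluator name to its entry."""
    return {
        name: {"type": "azure_ai_evaluator", "name": name}
        for name in ("intent_resolution", "task_adherence")
    }

config = builtin_evaluators()
print(sorted(config))  # -> ['intent_resolution', 'task_adherence']
```

Both samples share this config so the trace-based and live-target runs score the same quality dimensions.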
Description
- Add a new agent-as-target evaluation sample
- Enhance the existing trace-based sample with an additional scenario for querying agents