.NET: Python: Information-flow control based prompt injection defense#5024
.NET: Python: Information-flow control based prompt injection defense#5024shrutitople wants to merge 24 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Introduces FIDES, an information-flow control (IFC) security layer for the agent framework to deterministically mitigate prompt injection and data exfiltration via integrity/confidentiality labels, variable indirection, and policy enforcement.
Changes:
- Add core security primitives (labels, variable store, lineage) plus security middleware for label propagation and policy enforcement.
- Add security tools (
quarantined_llm,inspect_variable) and DevUI support for displaying/handling policy-violation approval requests. - Add new security samples and extensive documentation/ADRs describing FIDES usage and design.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| python/samples/getting_started/security/repo_confidentiality_example.py | Sample demonstrating confidentiality-based exfiltration prevention. |
| python/samples/getting_started/security/github_mcp_labels_example.py | Sample demonstrating parsing GitHub MCP label metadata and enforcing policies. |
| python/samples/getting_started/security/email_security_example.py | Sample demonstrating integrity-based prompt injection defense + quarantine processing. |
| python/samples/getting_started/security/init.py | Marks security samples as a package. |
| python/packages/devui/agent_framework_devui/_mapper.py | Adds policy-violation details to approval request events sent to the UI. |
| python/packages/devui/agent_framework_devui/_executor.py | Propagates policy-violation metadata through approval responses. |
| python/packages/core/agent_framework/_tools.py | Adds policy-approval plumbing and placeholder replacement around tool approval flows. |
| python/packages/core/agent_framework/_security_tools.py | Implements quarantine/inspection tools and tool-use instructions for hidden content. |
| python/packages/core/agent_framework/_security_middleware.py | Implements label tracking, variable hiding, and policy enforcement middleware. |
| python/packages/core/agent_framework/_security.py | Adds label types, label combination, variable store, and lineage/message labeling primitives. |
| python/packages/core/agent_framework/init.py | Exposes security APIs and adds ai_function alias. |
| docs/decisions/0011-prompt-injection-defense.md | ADR describing the FIDES design and rationale. |
| QUICK_START_FIDES.md | Quick-start guide for configuring and using FIDES. |
| FIDES_IMPLEMENTATION_SUMMARY.md | High-level implementation summary of FIDES components and deliverables. |
| FIDES_DEVELOPER_GUIDE.md | Full developer guide for FIDES concepts, APIs, best practices, and examples. |
python/samples/getting_started/security/repo_confidentiality_example.py
Outdated
Show resolved
Hide resolved
python/samples/getting_started/security/github_mcp_labels_example.py
Outdated
Show resolved
Hide resolved
… for ContextProvider rename
bb7f353 to
7ad3872
Compare
…zureOpenAIChatClient
eavanvalkenburg
left a comment
There was a problem hiding this comment.
the code overall is looking good, I would like to see all _security*.py files folded into the _security.py file, that is more in line with the rest of the repo.
| # non-declaration-only functions. | ||
|
|
||
| tool: FunctionTool | None = None | ||
| tool: AIFunction[BaseModel, Any] | None = None |
| ) | ||
|
|
||
|
|
||
| class LabeledMessage: |
There was a problem hiding this comment.
should this inherit from _types.Message?
|
|
||
|
|
||
| @runtime_checkable | ||
| class QuarantineChatClientProtocol(Protocol): |
There was a problem hiding this comment.
let's remove this, we can use SupportsChatGetResponse, looks like this was already replaced, so just removing is enough
| Examples: | ||
| .. code-block:: python | ||
|
|
||
| from agent_framework.azure import AzureOpenAIChatClient |
There was a problem hiding this comment.
this doesn't exist anymore
| """ | ||
|
|
||
|
|
||
| class QuarantinedLLMInput(BaseModel): |
There was a problem hiding this comment.
where is this used? The @tool decorator already auto creates the schema for that tool, so no need to do that manually
| print("Query to try: 'Please fetch my recent emails and give me a brief summary of each one.'") | ||
| print() | ||
|
|
||
| # Launch debug UI |
There was a problem hiding this comment.
| # Launch debug UI | |
| # Launch DevUI |
There was a problem hiding this comment.
let's move this into the samples folder
There was a problem hiding this comment.
the samples structure was also updated, this should be in 02-agent/security
Motivation and Context
LLM agents are vulnerable to prompt injection attacks — malicious instructions in external content (tool results, API responses) that cause data exfiltration or unauthorized actions.
This PR introduces FIDES, a deterministic defense based on information flow control (IFC). Instead of detecting injections, it tracks content provenance via labels and enforces policies — untrusted content can't influence trusted operations, private data can't leak to public channels.
Description
Security Primitives, Middleware & Tools —
_security.py(single consolidated module)IntegrityLabel(trusted/untrusted) ×ConfidentialityLabel(public/private/user_identity)combine_labels()ContentVariableStorereplaces untrusted content with opaqueVariableReferenceContentplaceholders — the LLM never sees raw untrusted dataLabelTrackingFunctionMiddleware— 3-tier automatic label propagation:additional_properties.security_label)source_integritydeclarationPolicyEnforcementFunctionMiddleware— blocks or requests approval when context confidentiality exceeds a tool'smax_allowed_confidentialitySecureAgentConfig— one-line setup wiring middleware, tools, and instructionsquarantined_llm— isolated LLM call (no tools) for safe summarization of untrusted contentinspect_variable— controlled access to hidden variables with label awarenesslist[Content](aligned with upstreamFunctionTool.invoke())Framework Integration —
_tools.py, DevUIFunctionApprovalRequestcontent type for human-in-the-loop policy enforcementTests —
test_security.pySamples —
python/samples/02-agents/security/email_security_example.pyrepo_confidentiality_example.pygithub_mcp_labels_example.pyDocumentation
FIDES_DEVELOPER_GUIDE.md(inpython/samples/02-agents/security/),python/samples/02-agents/security/README.md,docs/features/FIDES_IMPLEMENTATION_SUMMARY.mdContribution Checklist
SecureAgentConfig