Skip to content

.NET: Python: Information-flow control based prompt injection defense#5024

Open
shrutitople wants to merge 24 commits intomicrosoft:mainfrom
shrutitople:ifc-pia-defense
Open

.NET: Python: Information-flow control based prompt injection defense#5024
shrutitople wants to merge 24 commits intomicrosoft:mainfrom
shrutitople:ifc-pia-defense

Conversation

@shrutitople
Copy link
Copy Markdown

@shrutitople shrutitople commented Apr 1, 2026

Motivation and Context

LLM agents are vulnerable to prompt injection attacks — malicious instructions in external content (tool results, API responses) that cause data exfiltration or unauthorized actions.
This PR introduces FIDES, a deterministic defense based on information flow control (IFC). Instead of detecting injections, it tracks content provenance via labels and enforces policies — untrusted content can't influence trusted operations, private data can't leak to public channels.

Description

Security Primitives, Middleware & Tools — _security.py (single consolidated module)

  • Labels: IntegrityLabel (trusted/untrusted) × ConfidentialityLabel (public/private/user_identity)
  • Lattice combination: most-restrictive-wins via combine_labels()
  • Variable indirection: ContentVariableStore replaces untrusted content with opaque VariableReferenceContent placeholders — the LLM never sees raw untrusted data
  • LabelTrackingFunctionMiddleware — 3-tier automatic label propagation:
    1. Per-item embedded labels (additional_properties.security_label)
    2. Tool-level source_integrity declaration
    3. Join of input argument labels (fallback)
  • PolicyEnforcementFunctionMiddleware — blocks or requests approval when context confidentiality exceeds a tool's max_allowed_confidentiality
  • SecureAgentConfig — one-line setup wiring middleware, tools, and instructions
  • quarantined_llm — isolated LLM call (no tools) for safe summarization of untrusted content
  • inspect_variable — controlled access to hidden variables with label awareness
  • All results use list[Content] (aligned with upstream FunctionTool.invoke())

Framework Integration — _tools.py, DevUI

  • FunctionApprovalRequest content type for human-in-the-loop policy enforcement
  • DevUI maps approval requests to interactive approve/reject UI

Tests — test_security.py

  • 115 unit tests covering label propagation, variable indirection, policy enforcement, quarantine, 3-tier labeling, and edge cases

Samples — python/samples/02-agents/security/

Sample Demonstrates
email_security_example.py Integrity-based defense against injection in email content
repo_confidentiality_example.py Confidentiality-based data exfiltration prevention
github_mcp_labels_example.py Integration with GitHub MCP server labels

Documentation

  • FIDES_DEVELOPER_GUIDE.md (in python/samples/02-agents/security/), python/samples/02-agents/security/README.md, docs/features/FIDES_IMPLEMENTATION_SUMMARY.md

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible (115 new tests)
  • Is this a breaking change? No — all changes are additive; security middleware is opt-in via SecureAgentConfig

Copilot AI review requested due to automatic review settings April 1, 2026 10:00
@markwallace-microsoft markwallace-microsoft added documentation Improvements or additions to documentation python labels Apr 1, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces FIDES, an information-flow control (IFC) security layer for the agent framework to deterministically mitigate prompt injection and data exfiltration via integrity/confidentiality labels, variable indirection, and policy enforcement.

Changes:

  • Add core security primitives (labels, variable store, lineage) plus security middleware for label propagation and policy enforcement.
  • Add security tools (quarantined_llm, inspect_variable) and DevUI support for displaying/handling policy-violation approval requests.
  • Add new security samples and extensive documentation/ADRs describing FIDES usage and design.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
python/samples/getting_started/security/repo_confidentiality_example.py Sample demonstrating confidentiality-based exfiltration prevention.
python/samples/getting_started/security/github_mcp_labels_example.py Sample demonstrating parsing GitHub MCP label metadata and enforcing policies.
python/samples/getting_started/security/email_security_example.py Sample demonstrating integrity-based prompt injection defense + quarantine processing.
python/samples/getting_started/security/init.py Marks security samples as a package.
python/packages/devui/agent_framework_devui/_mapper.py Adds policy-violation details to approval request events sent to the UI.
python/packages/devui/agent_framework_devui/_executor.py Propagates policy-violation metadata through approval responses.
python/packages/core/agent_framework/_tools.py Adds policy-approval plumbing and placeholder replacement around tool approval flows.
python/packages/core/agent_framework/_security_tools.py Implements quarantine/inspection tools and tool-use instructions for hidden content.
python/packages/core/agent_framework/_security_middleware.py Implements label tracking, variable hiding, and policy enforcement middleware.
python/packages/core/agent_framework/_security.py Adds label types, label combination, variable store, and lineage/message labeling primitives.
python/packages/core/agent_framework/init.py Exposes security APIs and adds ai_function alias.
docs/decisions/0011-prompt-injection-defense.md ADR describing the FIDES design and rationale.
QUICK_START_FIDES.md Quick-start guide for configuring and using FIDES.
FIDES_IMPLEMENTATION_SUMMARY.md High-level implementation summary of FIDES components and deliverables.
FIDES_DEVELOPER_GUIDE.md Full developer guide for FIDES concepts, APIs, best practices, and examples.

@github-actions github-actions bot changed the title Information-flow control based prompt injection defense Python: Information-flow control based prompt injection defense Apr 1, 2026
Copy link
Copy Markdown
Member

@eavanvalkenburg eavanvalkenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code overall is looking good, I would like to see all _security*.py files folded into the _security.py file, that is more in line with the rest of the repo.

# non-declaration-only functions.

tool: FunctionTool | None = None
tool: AIFunction[BaseModel, Any] | None = None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is old syntax

)


class LabeledMessage:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this inherit from _types.Message?



@runtime_checkable
class QuarantineChatClientProtocol(Protocol):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove this, we can use SupportsChatGetResponse, looks like this was already replaced, so just removing is enough

Examples:
.. code-block:: python

from agent_framework.azure import AzureOpenAIChatClient
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't exist anymore

"""


class QuarantinedLLMInput(BaseModel):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this used? The @tool decorator already auto creates the schema for that tool, so no need to do that manually

print("Query to try: 'Please fetch my recent emails and give me a brief summary of each one.'")
print()

# Launch debug UI
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Launch debug UI
# Launch DevUI

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move this into the samples folder

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the samples structure was also updated, this should be in 02-agent/security

@moonbox3 moonbox3 added .NET workflows Related to Workflows in agent-framework labels Apr 13, 2026
@github-actions github-actions bot changed the title Python: Information-flow control based prompt injection defense .NET: Python: Information-flow control based prompt injection defense Apr 13, 2026
@moonbox3
Copy link
Copy Markdown
Contributor

moonbox3 commented Apr 13, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _security.py69518673%414–415, 433, 436–437, 510–511, 513, 561, 572, 587, 597, 664–665, 668, 672, 679–680, 682, 689–692, 694, 696–703, 705, 708–711, 713–715, 717–718, 721–722, 724–725, 728–729, 731–735, 738–742, 744, 992, 998–999, 1005–1006, 1036–1037, 1063–1071, 1092, 1195–1196, 1227–1228, 1253–1254, 1258–1259, 1293–1296, 1378–1379, 1384–1389, 1394–1396, 1427, 1636, 1641–1642, 1646, 1650–1652, 1691, 1695–1696, 1698, 1702, 1706, 1709, 1712, 1722, 1729, 1739–1740, 1755, 1784, 1788–1790, 1793, 1797, 1800, 1803, 1813, 1818, 1828–1829, 1867, 1908–1909, 1934, 2057–2059, 2094–2096, 2104, 2112, 2358–2359, 2361, 2363–2365, 2368, 2380, 2386–2387, 2389, 2420, 2424–2427, 2429, 2566–2569, 2572–2573, 2575, 2577, 2579, 2582–2586, 2593, 2597, 2608–2609, 2611, 2613–2615, 2676, 2686–2687
   _tools.py9799490%190–191, 364, 366, 379, 404–406, 414, 432, 446, 453, 460, 483, 485, 492, 500, 539, 583, 587, 619–621, 623, 629, 674–676, 678, 701, 727, 731, 769–771, 775, 797, 909–915, 951, 963, 965, 967, 970–973, 994, 998, 1002, 1016–1018, 1362, 1372, 1391, 1456, 1476, 1493–1499, 1628, 1632, 1678, 1739–1740, 1852, 1877–1878, 1899, 1919, 1921, 1977, 2040, 2212–2213, 2233, 2289–2290, 2350, 2428–2429, 2496, 2501, 2508
TOTAL28271339088% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
5678 20 💤 0 ❌ 0 🔥 1m 30s ⏱️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation .NET python workflows Related to Workflows in agent-framework

Projects

Status: Community PR

Development

Successfully merging this pull request may close these issues.

5 participants