feat: add tool requirements policy enforcement system #383
Open
Introduces a PreToolUse hook-based policy system that evaluates tool calls against RFC 2119-style requirements defined in .deepwork/tool_requirements/*.yml. Policies are checked via an HTTP sidecar server (spawned alongside the MCP server) using Haiku for semantic evaluation. Failed checks can be appealed via a new appeal_tool_requirement MCP tool. Approvals are cached with a 1-hour TTL.

Key features:
- Policy files with tools, match (param regex), requirements, extends (inheritance)
- no_exception rules that cannot be appealed
- Fail-closed: hook denies if MCP sidecar is unreachable
- Loop prevention: appeal tool calls skip the hook
- Multi-instance support via PID-keyed + session-keyed port files
- Evaluator encapsulated behind ABC for future swap to direct API calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
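To make the `tools` and `match` fields concrete, here is a minimal sketch of how a policy might be selected for a given tool call. It is not the code from this PR: the `Policy` dataclass, the `policy_applies` helper, and the rule that every `match` regex must hit (AND semantics) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical in-memory form of one policy file (field names assumed)."""

    name: str
    tools: list[str]
    match: dict[str, str] = field(default_factory=dict)  # param name -> regex
    requirements: list[str] = field(default_factory=list)


def policy_applies(policy: Policy, tool_name: str, tool_input: dict) -> bool:
    """Return True when the tool is listed and every match regex hits.

    Assumption: all regexes must match (AND). The real system may differ.
    """
    if tool_name not in policy.tools:
        return False
    for param, pattern in policy.match.items():
        value = tool_input.get(param)
        # A missing or non-string parameter cannot satisfy a regex constraint.
        if not isinstance(value, str) or re.search(pattern, value) is None:
            return False
    return True
```

Only policies for which this check passes would then be handed to the evaluator, which keeps the semantic (Haiku) evaluation off tool calls that no policy targets.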
- engine.py: rename loop variable `f` to `failure` for clarity
- sidecar.py: move `import asyncio` to module level, fix event loop leak with try/finally, fix inaccurate comment, add session_id validation
- evaluator.py: change `continue` to `break` on raw JSON array parse, filter non-dict items in _extract_json_array
- discovery.py: fix double-name warning message, remove dead code
- test_engine.py: add type hints to MockEvaluator.evaluate, remove unused imports
- test_tool_requirements_hook.py: remove redundant test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
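The evaluator change above concerns pulling a JSON array out of model output and discarding non-dict items. A rough sketch of that idea follows. The function names, the fallback to the outermost bracketed span, and the choice to return an empty list on failure are assumptions, not the PR's actual `_extract_json_array`.

```python
import json
from typing import Any


def filter_dicts(items: Any) -> list[dict]:
    """Keep only dict entries; anything that is not a list yields []."""
    if not isinstance(items, list):
        return []
    return [item for item in items if isinstance(item, dict)]


def extract_json_array(text: str) -> list[dict]:
    """Best-effort extraction of a JSON array of objects from free text."""
    # First try the whole response as raw JSON.
    try:
        return filter_dicts(json.loads(text))
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost bracketed span, e.g. prose around the array.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        return filter_dicts(json.loads(text[start : end + 1]))
    except json.JSONDecodeError:
        return []
```

Filtering in one helper means both the raw-parse path and the bracket-scan path reject strings, numbers, and nested lists in the same way.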
- doc/mcp_interface.md: add appeal_tool_requirement as tool #12, bump count
- doc/architecture.md: add tool_requirements/ package and hook to structure
- CLAUDE.md: add tool_requirements/ and hook to project structure appendix
- src/deepwork/hooks/README.md: add tool_requirements.py to files table
- CHANGELOG.md: add tool requirements feature to Unreleased

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- evaluator.py: fix comment accuracy, extract _filter_dicts to reduce duplication (DRY)
- discovery.py: fix diamond inheritance by copying visited set per parent
- test_engine.py: remove redundant @pytest.mark.asyncio decorators, fix dict type annotation, replace internal cache access with call_count
- test_evaluator.py: add tests for HaikuSubprocessEvaluator, deduplication, non-dict filtering, and invalid bracket JSON

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
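The diamond inheritance fix is easier to see in code. With one visited set shared across all parents, a base policy reached through two different parents looks like a cycle on the second visit. Giving each parent its own copy avoids that while still catching true cycles. The sketch below assumes a hypothetical `policies` mapping and a `resolve_requirements` helper; neither is taken from discovery.py.

```python
def resolve_requirements(name, policies, visited=None):
    """Merge requirements along `extends` chains, parents first.

    `policies` is a hypothetical mapping:
        name -> {"extends": [parent names], "requirements": {id: text}}
    """
    visited = set() if visited is None else visited
    if name in visited:
        raise ValueError(f"circular extends involving {name!r}")
    visited = visited | {name}  # new set; the caller's set is left untouched

    merged = {}
    for parent in policies[name].get("extends", []):
        # Each parent gets its own copy, so sibling branches that share an
        # ancestor (a diamond) do not see each other's visited entries.
        merged.update(resolve_requirements(parent, policies, set(visited)))
    # The child's own requirements override inherited ones with the same id.
    merged.update(policies[name].get("requirements", {}))
    return merged
```

Here the visited set tracks only the current ancestor path, which is the property that separates a diamond (legal) from a cycle (illegal).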
- Create DW-REQ-012-tool-requirements.md with 12 sub-requirements covering policy format, discovery, inheritance, matching, evaluation, check flow, appeal, caching, hook, sidecar, multi-instance, and startup
- Add PLUG-REQ-001.15 for the PreToolUse hook registration
- Add requirement ID references to all test module docstrings
- Add THIS TEST VALIDATES traceability comments to critical tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DW-REQ-012.5.3: make SHOULD violation criterion concrete and testable
- PLUG-REQ-001: fix section ordering (001.14 before 001.15)
- test_engine.py: use two-level REQ ID format (DW-REQ-012.6 not 012.6.3)
- test_hook.py: use two-level REQ ID format, fix traceability comment placement
- test_evaluator.py: move tests to correct class, remove redundant decorators

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- test_tool_requirements_hook.py: move import to module level (DRY)
- test_evaluator.py: add missing blank line between classes (E302)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- `.deepwork/tool_requirements/*.yml` files with RFC 2119-style requirements
- `appeal_tool_requirement` MCP tool; approvals are cached with a 1-hour TTL
- `no_exception` rules (cannot be appealed), policy inheritance via `extends`, and parameter-level regex matching

Test plan
`.deepwork/tool_requirements/test.yml`, start MCP server, trigger a tool call that violates a policy

🤖 Generated with Claude Code