fix: add idempotency guard for tool re-execution on task retry#5822
fix: add idempotency guard for tool re-execution on task retry#5822mahasarabesh wants to merge 6 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds idempotent tool pre-claim sentinels and an ChangesIdempotent Tool Execution and Per-Task Retry Reset
Sequence DiagramsequenceDiagram
participant ToolUsage
participant CacheHandler
participant CrewAgentExecutor
participant NativeTool
ToolUsage->>CacheHandler: read(tool_key)
alt cached sentinel present
CacheHandler-->>ToolUsage: IDEMPOTENT_SENTINEL_MESSAGE
else not cached
ToolUsage->>CacheHandler: claim_if_absent(tool_key, IDEMPOTENT_EXECUTION_SENTINEL)
alt claim succeeded
ToolUsage->>CrewAgentExecutor: invoke native tool
CrewAgentExecutor->>NativeTool: run(args)
alt success
NativeTool-->>CrewAgentExecutor: raw_result
CrewAgentExecutor->>CacheHandler: write(tool_key, raw_result)
CrewAgentExecutor-->>ToolUsage: raw_result
else failure
NativeTool-->>CrewAgentExecutor: exception
CrewAgentExecutor-->>ToolUsage: error (sentinel remains)
end
else existing value returned
CacheHandler-->>ToolUsage: existing_value (stringified or IDEMPOTENT_SENTINEL_MESSAGE)
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/crewai/src/crewai/agents/crew_agent_executor.py (1)
959-984:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEnsure idempotent success always overwrites the pre-claim sentinel.
For idempotent tools, a
cache_functionreturningFalsecan leave the sentinel in cache permanently, causing future calls to return sentinel-derived output instead of the real result.Suggested fix
- if self.tools_handler and self.tools_handler.cache: - should_cache = True - if ( - original_tool - and hasattr(original_tool, "cache_function") - and callable(original_tool.cache_function) - ): - should_cache = original_tool.cache_function( - args_dict or {}, raw_result - ) - if should_cache: - self.tools_handler.cache.add( - tool=func_name, input=input_str, output=raw_result - ) + if self.tools_handler and self.tools_handler.cache: + if is_idempotent: + # Always replace pre-claim on success. + self.tools_handler.cache.add( + tool=func_name, input=input_str, output=raw_result + ) + else: + should_cache = True + if ( + original_tool + and hasattr(original_tool, "cache_function") + and callable(original_tool.cache_function) + ): + should_cache = original_tool.cache_function( + args_dict or {}, raw_result + ) + if should_cache: + self.tools_handler.cache.add( + tool=func_name, input=input_str, output=raw_result + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/agents/crew_agent_executor.py` around lines 959 - 984, The idempotent sentinel may be left in cache if an idempotent tool's cache_function returns False; change the flow in crew_agent_executor.py so that after calling the tool (raw_result = available_functions[func_name](...)), if is_idempotent (as computed earlier) we always overwrite the pre-claimed IDEMPOTENT_EXECUTION_SENTINEL with the real raw_result in self.tools_handler.cache (i.e., call self.tools_handler.cache.add(tool=func_name, input=input_str, output=raw_result) unconditionally for idempotent tools), while still preserving the existing cache_function logic for non-idempotent caching decisions.lib/crewai/src/crewai/experimental/agent_executor.py (1)
1738-1748:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd sentinel detection when reading from cache.
When an idempotent tool crashes after writing the sentinel (line 1811-1815) but before completing, the sentinel remains in cache. On retry, this code reads the cached sentinel and returns it as a successful tool result (
result = str(cached_result)), rather than recognizing it as a marker of a previously failed execution.This breaks the idempotency guarantee: the agent receives
IDEMPOTENT_EXECUTION_SENTINELas the tool output and may treat it as success, when it should receive an error or be prevented from re-executing.Recommended fix
if self.tools_handler and self.tools_handler.cache: cached_result = self.tools_handler.cache.read( tool=func_name, input=input_str ) if cached_result is not None: + from crewai.tools.base_tool import IDEMPOTENT_EXECUTION_SENTINEL + if cached_result == IDEMPOTENT_EXECUTION_SENTINEL: + result = f"Tool '{func_name}' execution previously failed or is in progress and cannot be retried (idempotent tool)." + from_cache = True + # Skip to cleanup - do not re-execute + else: - result = ( - str(cached_result) - if not isinstance(cached_result, str) - else cached_result - ) - from_cache = True + result = ( + str(cached_result) + if not isinstance(cached_result, str) + else cached_result + ) + from_cache = True🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/experimental/agent_executor.py` around lines 1738 - 1748, The cache-read block that sets result/from_cache must detect the sentinel marker and not return it as a successful tool output; after calling self.tools_handler.cache.read (the cached_result variable), check whether cached_result equals the sentinel constant (IDEMPOTENT_EXECUTION_SENTINEL) and if so treat it as a failed/in-progress execution (e.g., raise a specific error or treat it like a cache miss) instead of setting result/from_cache; update the logic around tools_handler.cache.read, cached_result, result and from_cache to skip/handle sentinel values appropriately so the agent does not receive the sentinel string as a successful tool result.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/tools/base_tool.py`:
- Around line 85-88: Replace the plain-string sentinel variable
IDEMPOTENT_EXECUTION_SENTINEL with a non-collidable structured payload (e.g., a
dict/object containing a reserved marker key like "__crewai_idempotent_sentinel"
and a version/id) and update all read/write paths that currently compare against
the string to instead detect the sentinel by the presence of that reserved key
(not by user-visible text). Ensure the new sentinel is unique and stable
(include a small version or UUID field if needed) and update any code that
writes the sentinel (in base_tool.py) and any code that reads it to check for
the reserved key shape rather than matching the message text.
In `@lib/crewai/tests/agents/test_agent.py`:
- Around line 1172-1181: Test currently only checks _times_executed > 0 and can
pass if the counter leaks across tasks; modify it to assert per-task retry
counts by mocking or spying on the agent.invoke method and verifying calls are
per-task. Specifically, for task1 call create_agent_executor(task=task1) then
execute_task(task=task1) and assert invoke was called expected_times1 (e.g.,
max_retry_limit+1); then call create_agent_executor(task=task2), run
execute_task(task=task2) inside pytest.raises and assert invoke call count
increased by expected_times2 (e.g., another max_retry_limit+1) rather than
reusing the same attempts; also verify agent._times_executed reflects the
per-task count (reset between tasks) and not a leaked cumulative value. Ensure
references to _times_executed, create_agent_executor, execute_task, invoke, and
max_retry_limit are used to locate and update the assertions.
In `@lib/crewai/tests/agents/test_native_tool_calling.py`:
- Around line 1342-1371: The test introduces missing coverage for retry behavior
when the idempotent sentinel is already cached: add a new test (e.g.,
test_idempotent_tool_retry_reads_sentinel) that calls the same failing tool
twice using the existing helper _make_executor and
executor._execute_single_native_tool_call, assert the first call writes
IDEMPOTENT_EXECUTION_SENTINEL and returns an error, then assert the second call
reads from cache (result["from_cache"] is True), does not return the literal
IDEMPOTENT_EXECUTION_SENTINEL as the result, and returns a clear message
indicating the tool cannot be retried or previously failed; this will surface
the bug in agent_executor._execute_single_native_tool_call handling of cached
sentinels so you can update that function to detect the sentinel and return an
appropriate error instead of the raw sentinel.
---
Outside diff comments:
In `@lib/crewai/src/crewai/agents/crew_agent_executor.py`:
- Around line 959-984: The idempotent sentinel may be left in cache if an
idempotent tool's cache_function returns False; change the flow in
crew_agent_executor.py so that after calling the tool (raw_result =
available_functions[func_name](...)), if is_idempotent (as computed earlier) we
always overwrite the pre-claimed IDEMPOTENT_EXECUTION_SENTINEL with the real
raw_result in self.tools_handler.cache (i.e., call
self.tools_handler.cache.add(tool=func_name, input=input_str, output=raw_result)
unconditionally for idempotent tools), while still preserving the existing
cache_function logic for non-idempotent caching decisions.
In `@lib/crewai/src/crewai/experimental/agent_executor.py`:
- Around line 1738-1748: The cache-read block that sets result/from_cache must
detect the sentinel marker and not return it as a successful tool output; after
calling self.tools_handler.cache.read (the cached_result variable), check
whether cached_result equals the sentinel constant
(IDEMPOTENT_EXECUTION_SENTINEL) and if so treat it as a failed/in-progress
execution (e.g., raise a specific error or treat it like a cache miss) instead
of setting result/from_cache; update the logic around tools_handler.cache.read,
cached_result, result and from_cache to skip/handle sentinel values
appropriately so the agent does not receive the sentinel string as a successful
tool result.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 64fce7e2-c0ac-4576-b88f-dda51bcd207b
📒 Files selected for processing (8)
lib/crewai/src/crewai/agent/core.pylib/crewai/src/crewai/agents/crew_agent_executor.pylib/crewai/src/crewai/experimental/agent_executor.pylib/crewai/src/crewai/tools/base_tool.pylib/crewai/src/crewai/tools/tool_usage.pylib/crewai/tests/agents/test_agent.pylib/crewai/tests/agents/test_native_tool_calling.pylib/crewai/tests/tools/test_base_tool.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/experimental/agent_executor.py`:
- Around line 1812-1820: The idempotent cache is using input_str directly which
can vary by non-canonical JSON; change the cache key to a canonicalized
representation of the tool arguments before any pre-claim/read/write.
Specifically, in the idempotent path around is_idempotent/original_tool handling
(functions and variables: is_idempotent, original_tool, tools_handler.cache.add,
IDEMPOTENT_EXECUTION_SENTINEL, input_str), compute a canonical argument key
(e.g., deterministic JSON serialization of the argument object) and use that
canonical key instead of input_str when calling cache.add and in the
corresponding pre-claim/read/write code paths (also update the analogous section
at the later block covering lines 1826-1840) so equivalent argument objects map
to the same cache entry.
In `@lib/crewai/src/crewai/tools/tool_usage.py`:
- Around line 326-341: The idempotency cache key uses json.dumps on dict
arguments without canonical ordering which causes semantically identical dicts
with different key orders to miss dedupe; update the serialization in the
idempotent-input construction used before calling self.tools_handler.cache.add
(refer to sanitize_tool_name, calling.arguments, and tools_handler.cache.add) to
serialize dicts via json.dumps(calling.arguments, sort_keys=True,
separators=(',', ':')) (or equivalent canonical JSON) so keys are stable; apply
the same change in the other occurrence around the block that spans the code
referenced at lines ~581-597.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 3b6e32e8-be4a-498c-a117-3438caf5475d
📒 Files selected for processing (6)
lib/crewai/src/crewai/agents/crew_agent_executor.pylib/crewai/src/crewai/experimental/agent_executor.pylib/crewai/src/crewai/tools/base_tool.pylib/crewai/src/crewai/tools/tool_usage.pylib/crewai/tests/agents/test_agent.pylib/crewai/tests/agents/test_native_tool_calling.py
🚧 Files skipped from review as they are similar to previous changes (2)
- lib/crewai/src/crewai/agents/crew_agent_executor.py
- lib/crewai/src/crewai/tools/base_tool.py
|
The pre-claim approach is the right instinct and the Process restart survival. Pre-claim window. Writing The fix for both: the claim needs to live in SQLite or Postgres outside the CrewAI execution context. Happy to contribute a durable backend option that slots into this PR's |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/agents/crew_agent_executor.py`:
- Around line 894-909: The current non-atomic read-then-claim flow around
tools_handler.cache allows races; replace the separate read
(tools_handler.cache.read) + later write with a single atomic "claim-if-absent"
/ CAS operation on the cache (e.g., cache.claim_if_absent or
cache.set_if_absent) so that you only proceed to execute the tool when the claim
is successfully acquired, and if the claim fails immediately return the existing
cached value (honoring is_idempotent_sentinel and IDEMPOTENT_SENTINEL_MESSAGE
logic); update both the pre-check block that uses tools_handler.cache.read and
the later claim/write block (the alternate path around the execution/claim at
lines referenced in the comment) to use this atomic API and ensure non-claiming
callers return the cached/sentinel result without running the tool.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: d380eafa-0b6c-4fae-bf12-6d61dea856c8
📒 Files selected for processing (4)
lib/crewai/src/crewai/agents/crew_agent_executor.pylib/crewai/src/crewai/experimental/agent_executor.pylib/crewai/src/crewai/tools/tool_usage.pylib/crewai/tests/agents/test_native_tool_calling.py
🚧 Files skipped from review as they are similar to previous changes (3)
- lib/crewai/src/crewai/experimental/agent_executor.py
- lib/crewai/tests/agents/test_native_tool_calling.py
- lib/crewai/src/crewai/tools/tool_usage.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/agents/cache/cache_handler.py`:
- Around line 60-64: The current check uses self._cache.get(key) so a cached
None is treated as a miss; instead detect presence explicitly and return the
stored value (even if None): inside the w_locked() block, test membership (e.g.,
if key in self._cache) and then retrieve existing = self._cache[key] and return
False, existing; alternatively use a unique sentinel with self._cache.get(key,
SENTINEL) to distinguish missing entries. Update the code around the membership
check in cache_handler (the block using self._lock.w_locked(), key, and
existing) accordingly.
In `@lib/crewai/src/crewai/experimental/agent_executor.py`:
- Around line 1752-1762: The cached idempotent path must short-circuit
immediately when claim_if_absent returns an existing entry: if claimed is False,
set result (using IDEMPOTENT_SENTINEL_MESSAGE if
is_idempotent_sentinel(existing) else the stringified existing), set
from_cache=True, and return that result right away instead of falling through to
the started-event and hook pipeline; update the logic around claim_if_absent
(referencing claim_if_absent, claimed, existing, IDEMPOTENT_EXECUTION_SENTINEL,
IDEMPOTENT_SENTINEL_MESSAGE, is_idempotent_sentinel, and from_cache) so no
events or hooks run when returning a cached idempotent value.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 9a0f804e-5445-4523-8845-9ed2c36d01eb
📒 Files selected for processing (3)
lib/crewai/src/crewai/agents/cache/cache_handler.pylib/crewai/src/crewai/agents/crew_agent_executor.pylib/crewai/src/crewai/experimental/agent_executor.py
🚧 Files skipped from review as they are similar to previous changes (1)
- lib/crewai/src/crewai/agents/crew_agent_executor.py
…, always overwrite for idempotent, stronger tests
Add CacheBackend protocol and two implementations: - InMemoryCacheBackend (default, same behavior as before) - SQLiteCacheBackend (survives worker restarts, shared across processes) CacheHandler now delegates to a pluggable backend. Crew accepts an optional cache_backend parameter — tools using idempotent=True work unchanged, the only difference is where the claim is stored. Addresses feedback on crewAIInc#5802 requesting durable storage for the idempotency guard to cover multi-process/distributed retries.
0e7f18a to
e094bad
Compare
Fixes #5802
When a task retries (e.g. after an LLM parsing failure or timeout), tools that already executed can get called again because:
This adds an opt-in idempotent=True field on BaseTool. When set, the tool's cache entry is written before execution (a "pre-claim"). If the tool then crashes after performing its side effect, the pre-claim stays in cache and prevents re-execution on retry. On success, the real result overwrites the pre-claim as normal.
Also fixes the retry counter leak — _times_executed now resets when a different task starts, without affecting the recursive retry logic for the same task.
Usage:
Non-idempotent tools (the default) are completely unaffected.
Summary by CodeRabbit
Bug Fixes
New Features
Tests