
fix(crewai): make traceai_crewai instrumentor compatible with crewai 1.x #168

Open
SuhaniNagpal7 wants to merge 1 commit into dev from fix/crewai-1.x-compat

Conversation

@SuhaniNagpal7
Contributor

Summary

traceai_crewai was written against crewai 0.4x. With the instrumentor installed on a host application that has upgraded to crewai 1.x, Crew.kickoff() raises before reaching the wrapped call — so the entire crew workflow fails the moment tracing is enabled. This PR fixes the two hard 1.x crashes plus two latent issues that surface on streaming kickoffs (crew.stream=True), with the minimum diff necessary. All four changes preserve behavior on the older crewai versions the package already supported.

Verified end to end against crewai 1.14.4: instrumented Crew.kickoff ran cleanly, spans shipped to Future AGI Observe, agent run + task execute spans rendered with usage tokens (prompt 99 / completion 20 / total 119).

What was breaking on crewai 1.x

1. agent.i18n.prompt_file no longer exists — crashes every Crew.kickoff()

In crewai 1.x, i18n was demoted from an Agent field to a module-level singleton I18N_DEFAULT exposed by crewai.utilities.i18n. The wrapper's crew_agents dict-comp reads agent.i18n.prompt_file for every agent in crew.agents, which raises AttributeError. Because the surrounding span block sets record_exception=False, set_status_on_exception=False, the exception propagates out before wrapped() ever runs. Effect: every Crew.kickoff() call crashes the moment CrewAIInstrumentor().instrument() is called.

2. Task.context defaults to _NotSpecified, not None — crashes every Crew.kickoff()

In crewai 1.x Task.context is typed list[Task] | None | _NotSpecified and defaults to the sentinel. The wrapper's if task.context else None truthiness check passes for the truthy sentinel, then the list-comp [task.description for task in task.context] raises TypeError: '_NotSpecified' object is not iterable. Same crash surface as (1) — happens before wrapped() runs.
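The sentinel failure is easy to reproduce with a stand-in class (illustrative only — the real `_NotSpecified` lives inside crewai):

```python
# Stand-in for crewai's _NotSpecified sentinel. A plain object instance
# is truthy by default, so the old `if task.context` check let it
# through, and the list-comp then failed to iterate it.
class _NotSpecified:
    pass

sentinel = _NotSpecified()

assert bool(sentinel)  # truthy -> passes the old truthiness check

try:
    [task.description for task in sentinel]
except TypeError as exc:
    assert "not iterable" in str(exc)
```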

3. crew.usage_metrics can be None on streaming kickoffs

With crew.stream=True, crewai 1.x returns a CrewStreamingOutput immediately and never reaches self.usage_metrics = self.calculate_usage_metrics() inside Crew.kickoff before returning. The wrapper's existing else branch then reads usage_metrics.prompt_tokens on None and raises AttributeError.

4. CrewStreamingOutput has no to_dict()

The wrapper's if crew_output_dict := crew_output.to_dict(): raises AttributeError when streaming returns the streaming-output wrapper instead of a CrewOutput.

What this patch changes

All scoped to python/frameworks/crewai/traceai_crewai/_wrappers.py:

  • agent.i18n.prompt_file → safe helper. Added a small _agent_i18n_prompt_file(agent) that prefers agent.i18n (older crewai) and falls back to the crewai.utilities.i18n.I18N_DEFAULT singleton (1.x). The I18N_DEFAULT import is wrapped in try/except ImportError so the package still loads on very old crewai where the singleton path doesn't exist.

  • Task.context truthiness → isinstance(list) guard. Now [t.description for t in task.context] if isinstance(task.context, list) else None. Also renames the inner loop variable from task to t so it stops shadowing the outer task in the surrounding comprehension — a latent bug that was just masked by the earlier crash.

  • else: → elif usage_metrics is not None:. Streaming runs skip the object branch instead of crashing on None.prompt_tokens. The earlier isinstance(usage_metrics, dict) branch is unchanged.

  • Walrus on crew_output.to_dict() → getattr + callable guard. to_dict = getattr(crew_output, 'to_dict', None); crew_output_dict = to_dict() if callable(to_dict) else None — falls through to str(crew_output) when absent.

What's NOT changed

I deliberately kept the diff to the four 1.x regressions. The following were considered and dropped as out-of-scope:

  • isinstance(args[0], Agent) in _get_input_value — would degrade for non-Agent BaseAgent subclasses (LiteAgent, agent adapters), but isn't a crash on a normal CrewAI Agent. Future-proofing, not a 1.x regression.
  • span.set_attribute("function_calling_llm", instance.function_calling_llm) — function_calling_llm is now an LLM object in 1.x, but OTel silently drops non-primitive attribute values. No crash; same behavior as 0.x where it was a LangChain ChatModel.
  • A duplicated set_attribute(OUTPUT_VALUE, ...) line in _ToolUseWrapper. Pre-existing dead code, not in scope.
  • LLM model / cost capture — by design the CrewAI instrumentor only emits framework spans (Crew.kickoff, Task._execute_core, ToolUsage._use). Model name and cost come from running OpenAIInstrumentor (or equivalent provider instrumentor) alongside this one. Same pattern as the other framework instrumentors in this repo.

Test plan

  • python -m py_compile python/frameworks/crewai/traceai_crewai/_wrappers.py — syntax clean.
  • Fresh venv (Python 3.12) with crewai==1.14.4, fi-instrumentation-otel==1.0.0, and traceai-crewai installed editable from this branch.
  • Smoke test: single-agent + single-task Crew, crew.kickoff() against OpenAI gpt-4o-mini. Before fix: crashes with _NotSpecified TypeError. After fix: runs to completion, returns model output, no exceptions.
  • Trace lands in Future AGI Observe under the configured project:
    • Crew.kickoff root span: status OK, duration 3.4s, tokens prompt=99 / completion=20 / total=119, gen_ai.span.kind=CHAIN, crew_agents and crew_tasks blobs populated, crew_tasks[].context = null (correctly handled _NotSpecified sentinel), crew_agents[].i18n = null (correctly handled missing field), output.value = the agent's generated text.
    • Task._execute_core child span present and parented correctly.

Files changed

  • python/frameworks/crewai/traceai_crewai/_wrappers.py — 28 insertions, 6 deletions.

The traceai_crewai wrappers were written against crewai ~0.4x and break
when the host application has upgraded to crewai 1.x. With the
instrumentor installed, `Crew.kickoff()` raises before reaching the
wrapped call — so the entire crew workflow fails the moment tracing is
enabled. This patch fixes the two hard 1.x crashes plus two latent
issues that surface when `crew.stream=True`, while preserving behavior
on the older versions the instrumentor already supported.

Verified end to end against crewai 1.14.4: instrumented Crew.kickoff
ran cleanly, spans shipped to Future AGI Observe, agent run + task
execute spans rendered with usage tokens (prompt 99 / completion 20 /
total 119).

What was breaking on crewai 1.x

1. `agent.i18n.prompt_file` no longer exists.
   `i18n` was demoted from an Agent field to a module-level singleton
   `I18N_DEFAULT` exposed by `crewai.utilities.i18n`. The wrapper's
   `crew_agents` dict-comp reads `agent.i18n.prompt_file` for every
   agent, raising AttributeError. Because the span block uses
   `record_exception=False, set_status_on_exception=False`, the
   exception propagates out and crashes every Crew.kickoff() call.

2. `Task.context` defaults to `_NotSpecified`, not `None`.
   In crewai 1.x `Task.context` is typed `list[Task] | None |
   _NotSpecified` and defaults to the sentinel. The wrapper's
   `if task.context else None` truthiness check passes for the sentinel,
   then the list-comprehension tries to iterate it and raises
   `TypeError: '_NotSpecified' object is not iterable`. Same crash
   surface as (1) — happens before `wrapped()` runs.

3. `crew.usage_metrics` can be None on streaming kickoffs.
   With `crew.stream=True`, crewai 1.x returns a `CrewStreamingOutput`
   immediately and never gets to `self.usage_metrics =
   self.calculate_usage_metrics()` before returning. The wrapper's
   `else` branch then reads `usage_metrics.prompt_tokens` and raises
   AttributeError.

4. `CrewStreamingOutput` has no `to_dict()`.
   The wrapper's `if crew_output_dict := crew_output.to_dict():` raises
   AttributeError when streaming returns the streaming-output wrapper
   instead of a CrewOutput.

What this patch changes

- Replace `agent.i18n.prompt_file` with a small helper that prefers
  `agent.i18n` (older crewai) and falls back to the
  `crewai.utilities.i18n.I18N_DEFAULT` singleton (1.x), with both
  imports wrapped in try/except so the package still loads on very old
  crewai where neither shape exists.

- Replace the `Task.context` truthiness check with
  `isinstance(task.context, list)`. While there, rename the inner loop
  variable to `t` so it stops shadowing the outer `task` in the
  surrounding comprehension.

- Change the usage-metrics fallback from `else:` to
  `elif usage_metrics is not None:` so streaming runs skip the object
  branch instead of crashing.

- Replace the walrus on `crew_output.to_dict()` with a `getattr` +
  `callable` guard, falling through to `str(crew_output)` when absent.

All four changes preserve the existing code paths on older crewai
versions — only the failure modes new to 1.x are handled differently.
The four span attributes the wrapper writes (input/output values,
prompt/completion/total tokens, crew_agents/crew_tasks blobs,
gen_ai.span.kind=CHAIN) are unchanged.
@SuhaniNagpal7 SuhaniNagpal7 self-assigned this May 13, 2026