
azure-monitor-opentelemetry-exporter: oversized gen_ai content truncates customDimensions and drops token-usage dimensions #47345

@erik-gustafsson-vargas

Description

  • Package Name: azure-monitor-opentelemetry-exporter
  • Package Version: 1.0.0b52
  • Operating System: macOS 15.5 (arm64)
  • Python Version: 3.12.13

Describe the bug
When a span carries large GenAI semantic-convention content (e.g. gen_ai.input.messages), the queryable gen_ai.usage.input_tokens / gen_ai.usage.output_tokens dimensions are silently lost on that span in Application Insights, so token usage is undercounted/misreported.

The exporter forwards GenAI content attributes up to a 256 KB limit (the GenAI-attribute exemption in _filter_custom_properties, azure/monitor/opentelemetry/exporter/_utils.py). But the ingested row's customDimensions is truncated at ~64 KB (we consistently observe strlen(tostring(customDimensions)) == 65532). Because the truncation cuts the serialized property bag at that boundary, any attribute serialized after the large content value is dropped and becomes non-queryable, including the tiny but high-value gen_ai.usage.* attributes.
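
To make the suspected mechanism concrete, here is a small local simulation. It is only a sketch of the hypothesis above, not the service's actual code: it assumes the property bag is serialized as JSON in insertion order and then cut as a plain string at the observed 65532-character length. The names simulate_clip, is_valid_json and INGESTED_LIMIT are made up for illustration.

```python
import json

# Length we observe on ingested rows; an observation, not a documented constant.
INGESTED_LIMIT = 65_532


def simulate_clip(properties: dict, limit: int = INGESTED_LIMIT) -> str:
    """Serialize the property bag and cut the resulting string at `limit` characters."""
    return json.dumps(properties)[:limit]


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Same attributes as the large span in the repro, in the order they are set.
properties = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.input.messages": "[" + "x" * 71_000 + "]",
    "gen_ai.usage.input_tokens": "71000",
    "gen_ai.usage.output_tokens": "7",
}

clipped = simulate_clip(properties)

# Keys whose quoted name still appears somewhere in the clipped string.
survivors = [key for key in properties if f'"{key}"' in clipped]
```

Under these assumptions, clipped is exactly 65532 characters long, is no longer parseable JSON, and survivors contains only the three keys written before the cut; both gen_ai.usage.* keys are gone, which matches what we see in Logs.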

To Reproduce

  1. Send one large and one small GenAI span through the raw AzureMonitorTraceExporter:

    import os

    from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Wire the raw exporter into a tracer provider (no distro, no instrumentation).
    provider = TracerProvider(resource=Resource.create({"service.name": "genai-repro"}))
    provider.add_span_processor(
        BatchSpanProcessor(
            AzureMonitorTraceExporter.from_connection_string(os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"])
        )
    )
    trace.set_tracer_provider(provider)
    tracer = trace.get_tracer("repro")

    # One span with oversized content, one small control span.
    for name, content, in_tok in (
        ("chat repro-big", "[" + "x" * 71_000 + "]", 71_000),   # ~71 KB content
        ("chat repro-small", '[{"role":"user","content":"hi"}]', 5),
    ):
        with tracer.start_as_current_span(name) as span:
            span.set_attribute("gen_ai.system", "openai")
            span.set_attribute("gen_ai.request.model", "gpt-4o")
            span.set_attribute("gen_ai.input.messages", content)
            # Usage attributes are set after the large content attribute.
            span.set_attribute("gen_ai.usage.input_tokens", in_tok)
            span.set_attribute("gen_ai.usage.output_tokens", 7)

    # Make sure both spans are exported before the process exits.
    provider.force_flush()
    provider.shutdown()

  2. Wait ~2 minutes for ingestion, then run in Logs:

    dependencies
    | where timestamp > ago(30m) and name startswith "chat repro"
    | extend cd = tostring(customDimensions)
    | project name,
              usage_tokens_queryable = tostring(customDimensions["gen_ai.usage.input_tokens"]),
              input_msgs_len         = strlen(tostring(customDimensions["gen_ai.input.messages"])),
              customDimensions_len   = strlen(cd)
    | order by name asc
  3. Observe that the large span is present (confirmable by operation_Id) but its token dimension is gone:

    name              usage_tokens_queryable  input_msgs_len  customDimensions_len
    chat repro-big    (empty)                 ~64000          65532
    chat repro-small  5                       ~30             ~200

Expected behavior
A span exceeding the ingestion limit should not silently lose its small, high-value dimensions. customDimensions should stay valid JSON with all properties queryable: truncate the oversized value (gen_ai.input.messages) instead of clipping the whole blob into a string that drops the other dimensions.
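
For illustration, the behavior asked for here could look roughly like the sketch below. This is not the exporter's or the ingestion service's implementation; fit_properties, its budget default (the 65532 length we observe) and the "...[truncated]" marker are all hypothetical, and it assumes every property value is already a string.

```python
import json


def fit_properties(
    properties: dict[str, str],
    budget: int = 65_532,             # observed length, used here as an assumed budget
    marker: str = "...[truncated]",   # hypothetical suffix marking a shortened value
) -> dict[str, str]:
    """Return a copy whose JSON serialization fits in `budget` characters.

    Only the longest values are shortened; every key is kept.
    """
    fitted = dict(properties)
    while True:
        overflow = len(json.dumps(fitted)) - budget
        if overflow <= 0:
            return fitted
        # Shorten the longest remaining value first.
        key = max(fitted, key=lambda k: len(fitted[k]))
        value = fitted[key]
        if len(value) <= len(marker):
            raise ValueError("property bag cannot fit in budget by shortening values")
        keep = max(0, len(value) - overflow - len(marker))
        fitted[key] = value[:keep] + marker


# Same attributes as the large span in the repro.
properties = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.input.messages": "[" + "x" * 71_000 + "]",
    "gen_ai.usage.input_tokens": "71000",
    "gen_ai.usage.output_tokens": "7",
}

fitted = fit_properties(properties)
serialized = json.dumps(fitted)
```

With this approach the serialized bag stays within the budget and remains parseable, gen_ai.input.messages is the only value that changes, and the gen_ai.usage.* dimensions stay queryable. Whether the right place for such a step is the exporter or the ingestion side is an open question.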

Labels

customer-reported, needs-triage, question
