opentelemetry-instrumentation-botocore: capture Bedrock prompt cache token usage#4615

Open
vinigrazzioli-96 wants to merge 3 commits into
open-telemetry:mainfrom
vinigrazzioli-96:botocore-bedrock-cache-tokens

@vinigrazzioli-96

Description

Amazon Bedrock's Converse / ConverseStream APIs return cacheReadInputTokens
and cacheWriteInputTokens in the response usage when prompt caching is
used. The botocore Bedrock instrumentation currently reads only inputTokens
/ outputTokens, so the cache token counts are dropped.

This PR maps them to the OTel GenAI semantic-convention attributes (already
defined in opentelemetry-semantic-conventions):

  • cacheReadInputTokens → gen_ai.usage.cache_read.input_tokens

Changes:

  • bedrock.py (_converse_on_success): set the cache token span attributes
    for Converse / ConverseStream / InvokeModelWithResponseStream.
  • bedrock_utils.py (ConverseStreamWrapper): accumulate the cache token
    counts from the streaming metadata event into the response usage.
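
The two changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `accumulate_usage` and `usage_to_attributes` are hypothetical helper names, the event shape (`{"metadata": {"usage": {...}}}`) follows the ConverseStream metadata event, and only attribute names that are certain are mapped.

```python
from typing import Any, Dict

# Bedrock usage keys (camelCase) named in the description above.
USAGE_KEYS = (
    "inputTokens",
    "outputTokens",
    "cacheReadInputTokens",
    "cacheWriteInputTokens",
)

# Assumption: only mappings with a known attribute name are listed here.
# The target attribute for cacheWriteInputTokens is deliberately left out.
ATTRIBUTE_FOR_KEY = {
    "inputTokens": "gen_ai.usage.input_tokens",
    "outputTokens": "gen_ai.usage.output_tokens",
    "cacheReadInputTokens": "gen_ai.usage.cache_read.input_tokens",
}


def accumulate_usage(response: Dict[str, Any], event: Dict[str, Any]) -> None:
    """Copy token counts from a streaming metadata event into response["usage"]."""
    usage = event.get("metadata", {}).get("usage")
    if not usage:
        return  # not a metadata event, or no usage reported
    target = response.setdefault("usage", {})
    for key in USAGE_KEYS:
        value = usage.get(key)
        if value is not None:
            target[key] = value


def usage_to_attributes(response: Dict[str, Any]) -> Dict[str, int]:
    """Translate the collected usage into span attribute name/value pairs."""
    usage = response.get("usage", {})
    return {
        attribute: usage[key]
        for key, attribute in ATTRIBUTE_FOR_KEY.items()
        if key in usage
    }
```

Skipping keys whose value is `None` keeps the sketch tolerant of responses from requests that did not use prompt caching, where the cache fields may be absent.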

Out of scope (possible follow-up): the non-streaming InvokeModel Claude
path uses the native Anthropic usage format (cache_read_input_tokens /
cache_creation_input_tokens, snake_case) and would need a separate change.
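
For reference, one possible shape for that follow-up is to translate the native Anthropic usage keys into the Converse-style keys before the existing attribute mapping runs. The sketch below is an assumption about how that could look, not part of this PR: `normalize_anthropic_usage` is a hypothetical helper, and treating `cache_creation_input_tokens` as the counterpart of `cacheWriteInputTokens` is an assumption based on the field names.

```python
from typing import Any, Dict

# Assumed correspondence between native Anthropic usage keys (snake_case)
# and the Converse-style keys (camelCase) used elsewhere in this PR.
ANTHROPIC_TO_CONVERSE = {
    "input_tokens": "inputTokens",
    "output_tokens": "outputTokens",
    "cache_read_input_tokens": "cacheReadInputTokens",
    "cache_creation_input_tokens": "cacheWriteInputTokens",
}


def normalize_anthropic_usage(body: Dict[str, Any]) -> Dict[str, int]:
    """Return the usage of a decoded InvokeModel response body with Converse-style keys."""
    usage = body.get("usage") or {}
    return {
        converse_key: usage[native_key]
        for native_key, converse_key in ANTHROPIC_TO_CONVERSE.items()
        if usage.get(native_key) is not None
    }
```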

Fixes #4614

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Added test_converse_stream_accumulates_cache_tokens, a unit test that
    feeds a metadata event carrying cache token usage to ConverseStreamWrapper
    and asserts the counts are accumulated.
  • tox -e py311-test-instrumentation-botocore-1-wrapt1 — 134 passed
  • tox -e lint-instrumentation-botocore — 10.00/10

Checklist

  • Followed the style guidelines of this project
  • Changelog has been updated
  • Unit tests have been added
  • Documentation has been updated (N/A — internal attribute change)

@vinigrazzioli-96 vinigrazzioli-96 requested a review from a team as a code owner May 21, 2026 14:31
@linux-foundation-easycla

linux-foundation-easycla Bot commented May 21, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: vinigrazzioli-96 / name: Vinicius Moscon (55d8c17)

@xrmx xrmx added the gen-ai Related to generative AI label May 27, 2026
assert choice.index == 0


def test_converse_stream_accumulates_cache_tokens():
Contributor

@xrmx xrmx May 27, 2026

It would be nice to also assert the attributes in a test with a recording. If this isn't something added recently, you may already have it in the recordings.

Author

Thanks for the review @xrmx! I looked into this — the cache token fields (cacheReadInputTokens / cacheWriteInputTokens) are only part of the bedrock-runtime service model on more recent boto3 (≥ 1.39.5). The current test factors all pin older versions (1.29.4 in -2, 1.35.16 in -3, 1.35.56 in -1), which strip these fields from the response before the instrumentor sees them — so a VCR cassette alone wouldn't help. That's why I went with the unit test on ConverseStreamWrapper to validate the streaming-metadata accumulation.

If you'd prefer a VCR-based test, I can bump boto3 in one of the factors (e.g. -3) to ≥ 1.39.5 and record a cassette with prompt caching enabled. Happy to do either — let me know how you'd like to proceed.

@lzchen lzchen moved this to Ready for review in Python PR digest May 28, 2026
Labels

gen-ai Related to generative AI

Projects

Status: Ready for review

Development

Successfully merging this pull request may close these issues.

Amazon Bedrock (botocore) instrumentation does not capture prompt cache token usage

3 participants