opentelemetry-instrumentation-botocore: capture Bedrock prompt cache token usage #4615
Conversation
assert choice.index == 0

def test_converse_stream_accumulates_cache_tokens():
It would be nice to also assert the attributes in a test with a recording instead. If this isn't something added recently, you may already have it in the recordings.
Thanks for the review @xrmx! I looked into this — the cache token fields (cacheReadInputTokens / cacheWriteInputTokens) are only part of the bedrock-runtime service model on more recent boto3 (≥ 1.39.5). The current test factors all pin older versions (1.29.4 in -2, 1.35.16 in -3, 1.35.56 in -1), which strip these fields from the response before the instrumentor sees them — so a VCR cassette alone wouldn't help. That's why I went with the unit test on ConverseStreamWrapper to validate the streaming-metadata accumulation.
If you'd prefer a VCR-based test, I can bump boto3 in one of the factors (e.g. -3) to ≥ 1.39.5 and record a cassette with prompt caching enabled. Happy to do either — let me know how you'd like to proceed.
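The streaming-metadata accumulation discussed in this thread can be illustrated with a minimal sketch. `UsageAccumulator` and `process_event` are hypothetical names standing in for the relevant part of ConverseStreamWrapper, not its actual code, and summing the counts (rather than overwriting them) is an assumption.

```python
from typing import Any, Dict

# Usage keys carried by the ConverseStream "metadata" event. The two cache
# keys are the ones named in this thread.
_USAGE_KEYS = (
    "inputTokens",
    "outputTokens",
    "cacheReadInputTokens",
    "cacheWriteInputTokens",
)


class UsageAccumulator:
    """Hypothetical stand-in for the usage handling in ConverseStreamWrapper."""

    def __init__(self) -> None:
        self.usage: Dict[str, int] = {}

    def process_event(self, event: Dict[str, Any]) -> None:
        # Only "metadata" events carry token usage; ignore everything else.
        usage = event.get("metadata", {}).get("usage")
        if not usage:
            return
        for key in _USAGE_KEYS:
            value = usage.get(key)
            if value is not None:
                # Assumption: sum the counts instead of overwriting them.
                self.usage[key] = self.usage.get(key, 0) + value


acc = UsageAccumulator()
acc.process_event({"contentBlockDelta": {"delta": {"text": "hi"}}})
acc.process_event(
    {
        "metadata": {
            "usage": {
                "inputTokens": 12,
                "outputTokens": 5,
                "cacheReadInputTokens": 100,
                "cacheWriteInputTokens": 20,
            }
        }
    }
)
```

A unit test in this style needs no cassette and no particular boto3 version, since the event dictionaries are built by hand.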
Description
Amazon Bedrock's Converse / ConverseStream APIs return cacheReadInputTokens and cacheWriteInputTokens in the response usage when prompt caching is used. The botocore Bedrock instrumentation currently reads only inputTokens / outputTokens, so the cache token counts are dropped.

This PR maps them to the OTel GenAI semantic-convention attributes (already defined in opentelemetry-semantic-conventions):

cacheReadInputTokens → gen_ai.usage.cache_read.input_tokens

Changes:

bedrock.py (_converse_on_success): set the cache token span attributes for Converse / ConverseStream / InvokeModelWithResponseStream.
bedrock_utils.py (ConverseStreamWrapper): accumulate the cache token counts from the streaming metadata event into the response usage.

Out of scope (possible follow-up): the non-streaming InvokeModel Claude path uses the native Anthropic usage format (cache_read_input_tokens / cache_creation_input_tokens, snake_case) and would need a separate change.

Fixes #4614
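As a rough illustration of the mapping described above, the sketch below turns a Converse-style usage dictionary into span attributes. It is not the code in bedrock.py: `usage_to_span_attributes` and `USAGE_TO_ATTRIBUTE` are hypothetical names, and the attribute name used for cacheWriteInputTokens is an assumption, since the description only shows the cache-read mapping.

```python
from typing import Any, Dict, Mapping

# Bedrock usage key -> span attribute name.
USAGE_TO_ATTRIBUTE = {
    "inputTokens": "gen_ai.usage.input_tokens",
    "outputTokens": "gen_ai.usage.output_tokens",
    "cacheReadInputTokens": "gen_ai.usage.cache_read.input_tokens",
    # Assumption: the description does not show this mapping; the name
    # below is a guess at the matching semantic-convention attribute.
    "cacheWriteInputTokens": "gen_ai.usage.cache_creation.input_tokens",
}


def usage_to_span_attributes(usage: Mapping[str, Any]) -> Dict[str, int]:
    """Return attributes only for the usage keys present in the response."""
    attributes: Dict[str, int] = {}
    for usage_key, attribute_name in USAGE_TO_ATTRIBUTE.items():
        value = usage.get(usage_key)
        if value is not None:
            attributes[attribute_name] = value
    return attributes


attrs = usage_to_span_attributes(
    {"inputTokens": 12, "outputTokens": 5, "cacheReadInputTokens": 100}
)
```

Skipping absent keys means responses without prompt caching produce no cache attributes at all, rather than zeros. The resulting dictionary could then be passed to a span's `set_attributes` method.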
Type of change
How Has This Been Tested?
test_converse_stream_accumulates_cache_tokens, a unit test that feeds a metadata event carrying cache token usage to ConverseStreamWrapper and asserts the counts are accumulated.
tox -e py311-test-instrumentation-botocore-1-wrapt1: 134 passed
tox -e lint-instrumentation-botocore: 10.00/10
Checklist