feat(ai-monitoring): Fetch model context size and rename task to fetch_ai_model_info #112656
Conversation
…h_ai_model_info

Extend the fetch_ai_model_costs task to also fetch context size (context window length) for each AI model alongside token costs. Context size is sourced from OpenRouter's context_length field and models.dev's limit.context field, following the same precedence logic as costs (OpenRouter takes priority).

The task is renamed from fetch_ai_model_costs to fetch_ai_model_info since it now fetches more than just cost data. The AIModelCostV2 type gains an optional contextSize field (int).

Updated references:
- Task registration name in server.py cron schedule
- Logger metric names in warning messages
- All test imports, method names, and assertions

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
```python
class AIModelCostV2(TypedDict, total=False):
    inputPerToken: Required[float]
    outputPerToken: Required[float]
    outputReasoningPerToken: Required[float]
    inputCachedPerToken: Required[float]
    inputCacheWritePerToken: Required[float]
    contextSize: int
```
Maybe we should update the config version, or use another structure.
Maybe adding a new field in the config would be the best approach, because I can see us wanting to expand this even further for non-cost-related things. I am aware that it complicates all of this a bit now, but it will allow us to make quick metadata changes in the future.
That new field could be called LLMModelMetadata and, for each model, would contain cost and context information for now, with the chance to expand it in the future.
Added type defs for a new schema.
…acy task

Introduce a new LLMModelMetadata schema with costs nested under a 'costs' field and an optional contextSize. Context size is fetched from OpenRouter's context_length and models.dev's limit.context fields.

Both tasks run independently on the same cron schedule:
- fetch_ai_model_costs -> writes ai-model-costs:v2 (flat AIModelCostV2)
- fetch_llm_model_metadata -> writes llm-model-metadata:v1 (nested LLMModelMetadata)

They share raw fetch helpers (_fetch_openrouter_raw, _fetch_models_dev_raw) but format and cache independently. The old task + cache key will be removed once all consumers have migrated.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
```python
return None

# ...

cached_metadata = cache.get(LLM_MODEL_METADATA_CACHE_KEY)
if cached_metadata is not None:
    return cached_metadata

# ...

if not settings.IS_DEV:
    # in dev environment, we don't want to log this
    logger.warning("Empty LLM model metadata")

# ...

return None
```
Bug: The new llm_model_metadata_config function is defined but never called, so the fetched LLM model metadata is never added to the global config sent to Relay.
Severity: MEDIUM
Suggested Fix
Update get_global_config in src/sentry/relay/globalconfig.py to call llm_model_metadata_config. Add a new field to the GlobalConfig TypedDict to hold the LLM metadata, and then populate this field with the result from llm_model_metadata_config.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/relay/config/ai_model_costs.py#L78-L98
Potential issue: The `fetch_llm_model_metadata` task correctly fetches and caches LLM
model metadata, including context size. However, this data is never consumed by Relay.
The function `llm_model_metadata_config`, which reads from this cache, is defined but
never called. The central `get_global_config` function in
`src/sentry/relay/globalconfig.py` was not updated to invoke `llm_model_metadata_config`
and integrate its output. Consequently, the `GlobalConfig` sent to Relay lacks the new
model metadata, rendering the context-size fetching feature non-functional.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 7c46e0c.
```python
if model_id not in models_dict:
    cost = _models_dev_entry_to_cost(model_id, model_data)
    if cost is not None:
        models_dict[model_id] = cost
```
Refactored models.dev duplicate handling changes precedence order
Low Severity
The refactoring of the models.dev data pipeline subtly changes duplicate-handling semantics for the legacy fetch_ai_model_costs task. Previously, _fetch_models_dev_models built an internal dict where later providers' entries overwrote earlier ones (last-wins for same model_id). The returned dict was then merged into the main dict. Now, _fetch_models_dev_raw returns a flat list preserving all entries, and the caller's if model_id not in models_dict guard means the first provider's entry wins instead of the last. This changes which pricing data is used when the same model ID appears under multiple models.dev providers.
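A toy illustration of the semantic change (provider names and prices are made up):

```python
# Two models.dev providers exposing the same model id with different pricing
entries = [
    ("provider-a", "gpt-4", {"inputPerToken": 1e-06}),
    ("provider-b", "gpt-4", {"inputPerToken": 2e-06}),
]

# Old behaviour: plain dict assignment, later entries overwrite (last wins)
last_wins: dict = {}
for _provider, model_id, cost in entries:
    last_wins[model_id] = cost

# New behaviour: the `not in` guard keeps the first entry (first wins)
first_wins: dict = {}
for _provider, model_id, cost in entries:
    if model_id not in first_wins:
        first_wins[model_id] = cost

print(last_wins["gpt-4"]["inputPerToken"])   # 2e-06
print(first_wins["gpt-4"]["inputPerToken"])  # 1e-06
```

Whether first-wins or last-wins is correct depends on how models.dev orders providers, which is why the change is flagged even at low severity.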
…acy task

Introduce a new LLMModelMetadata schema with costs nested under a 'costs' field and an optional contextSize. Context size is fetched from OpenRouter's context_length and models.dev's limit.context fields.

Both tasks run independently on the same cron schedule:
- fetch_ai_model_costs -> writes ai-model-costs:v2 (flat AIModelCostV2)
- fetch_llm_model_metadata -> writes llm-model-metadata:v1 (nested LLMModelMetadata)

They share raw fetch helpers (_fetch_openrouter_raw, _fetch_models_dev_raw) but format and cache independently. The old task + cache key will be removed once all consumers have migrated.

GlobalConfig now serves both fields side by side:
- aiModelCosts: legacy flat format (TODO remove)
- llmModelMetadata: new nested format with contextSize

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Backend Test Failures

Failures on
Relay's normalize_global_config strips unknown fields, causing test_relay_globalconfig_v3 failures. The new cache is still populated and readable via llm_model_metadata_config(), but should not be added to Relay's GlobalConfig until Relay supports the field.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
```python
    models_dict[model_id] = metadata
except Exception as e:
```
Bug: The code uses a strict isinstance(..., int) check for context size, which will silently drop the value if an API returns it as a float (e.g., 1000000.0).
Severity: MEDIUM
Suggested Fix
Modify the type check to be more robust. Instead of a strict isinstance(..., int) check, first verify if the value is an instance of (int, float). If it is, convert it to an integer before assigning it to the metadata. This will correctly handle both integer and float representations of the context size.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/tasks/ai_agent_monitoring.py#L358-L359
Potential issue: In `_openrouter_entry_to_metadata` and `_models_dev_entry_to_metadata`,
the context size from external APIs is validated using a strict `isinstance(..., int)`
check. If an API returns the context size as a float (e.g., `1000000.0` instead of
`1000000`), this check will fail. The code then silently ignores the context size,
failing to add it to the model's metadata. This results in a silent loss of data, as
there is no logging to indicate that a valid, albeit float-formatted, context size was
discarded. This could lead to models being configured in Relay without their context
size, even when the data is available from the source API.


Closes https://linear.app/getsentry/issue/TET-2219/sentry-map-llm-context-size-to-relay-cost-calculation-config
Extend the fetch_ai_model_costs task to also fetch context size (context window length) for each AI model alongside token costs. Context size is sourced from OpenRouter's context_length field and models.dev's limit.context field, following the same precedence logic as costs (OpenRouter takes priority).
The task is renamed from fetch_ai_model_costs to fetch_ai_model_info since it now fetches more than just cost data. The AIModelCostV2 type gains an optional contextSize field (int).
Updated references:
- Task registration name in server.py cron schedule
- Logger metric names in warning messages
- All test imports, method names, and assertions
Updates the config schema passed to Relay; it now lives in this config field with the following structure:
ai-model-info:v3

```json
{
  "version": 3,
  "models": {
    "gpt-4": {
      "inputPerToken": 0.0000003,
      "outputPerToken": 0.00000165,
      "outputReasoningPerToken": 0.0,
      "inputCachedPerToken": 0.0000015,
      "inputCacheWritePerToken": 0.00001875,
      "contextSize": 1000000
    },
    "claude-3-5-sonnet": {
      "inputPerToken": 0.000003,
      "outputPerToken": 0.000015,
      "outputReasoningPerToken": 0.0,
      "inputCachedPerToken": 0.0000015,
      "inputCacheWritePerToken": 0.00000375
    }
  }
}
```

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
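Since contextSize is optional per model, consumers of this payload should read it with a default rather than indexing directly. A small sketch (values copied from the description above):

```python
import json

# Abbreviated ai-model-info:v3 payload; only the fields relevant here
raw = json.dumps({
    "version": 3,
    "models": {
        "gpt-4": {"inputPerToken": 3e-07, "contextSize": 1000000},
        "claude-3-5-sonnet": {"inputPerToken": 3e-06},  # no contextSize
    },
})

config = json.loads(raw)
# .get() returns None when a model has no contextSize entry
gpt4_size = config["models"]["gpt-4"].get("contextSize")
claude_size = config["models"]["claude-3-5-sonnet"].get("contextSize")
print(gpt4_size)    # 1000000
print(claude_size)  # None
```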