feat(webapp): split Models into Your models and Model library tabs #3958
Walkthrough
The PR adds a "Your models" tab to the Models page with project-scoped usage metrics and prompt-cache insights. SVG provider icons are updated to use React camelCase attributes.
The Your models sparklines use dynamic bucket sizes (6h at 7d, etc.), but the tooltip assumed hourly buckets and showed wrong dates. Thread the bucket interval and start through so each bar is labelled correctly. Also pin the library tab cross-tenant p50 TTFC column to a fixed 7-day window so it no longer follows the Your models time selector.
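A minimal sketch of the labelling fix described above, assuming the tooltip receives the bucket start and interval alongside the bar index. The type and function names here are hypothetical and are not taken from the PR.

```typescript
// Hypothetical shape: the start of the first bucket and the width of each bucket.
type SparklineBucketing = {
  startMs: number; // epoch milliseconds of the first bucket
  intervalMs: number; // bucket width, e.g. 6h for a 7d range
};

// Label a bar from its index instead of assuming hourly buckets.
function bucketStart(bucketing: SparklineBucketing, index: number): Date {
  return new Date(bucketing.startMs + index * bucketing.intervalMs);
}

const SIX_HOURS_MS = 6 * 60 * 60 * 1000;
const sevenDayBucketing: SparklineBucketing = {
  startMs: Date.UTC(2025, 0, 1),
  intervalMs: SIX_HOURS_MS,
};

// Bar 4 with 6h buckets starts 24h after the range start.
console.log(bucketStart(sevenDayBucketing, 4).toISOString()); // "2025-01-02T00:00:00.000Z"
```

Under an hourly assumption the same bar would have been labelled 04:00 on the first day, which is the kind of mismatch the commit describes.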
Your models gets a cache-savings column and per-model cached-tokens and cache-hit-rate views; the AI metrics dashboard gets a caching section (hit rate, cached tokens, estimated savings, hit rate by model). Also makes the Your models charts all time-series for consistency.
The cache hit-rate and savings queries divided by zero for models with no cached tokens, surfacing NaN or empty widgets; they now return 0 via ifNull/nullIf. Model usage sparklines bucketed on a timezone-dependent DateTime string, which could misalign bars with the charts above them; they now key on toUnixTimestamp so buckets line up regardless of the ClickHouse server timezone.
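The two guards in this commit live in ClickHouse SQL. As a rough TypeScript analogue, assuming nothing about the actual query text, the same ideas look like this (both helper names are hypothetical):

```typescript
// Analogue of ifNull(x / nullIf(y, 0), 0): a zero denominator yields 0, not NaN.
function safeRatio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

// Key a bucket by epoch seconds so it does not depend on how a timezone renders a DateTime.
function bucketKey(timestampMs: number, intervalSeconds: number): number {
  const seconds = Math.floor(timestampMs / 1000);
  return seconds - (seconds % intervalSeconds);
}

console.log(safeRatio(0, 0)); // 0
console.log(bucketKey(Date.UTC(2025, 0, 1, 7, 30), 6 * 60 * 60) === Date.UTC(2025, 0, 1, 6) / 1000); // true
```

Keying on a number rather than a formatted string means two sources that bucket the same instant agree on the key.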
input_tokens is the total prompt count, inclusive of cache-read and cache-creation tokens. The cost pipeline charged the full input count at the input price and then added a separate cache line, so cached tokens were billed twice (e.g. ~2.4x on OpenAI), and the cache hit-rate metric divided cached reads by input + cached, understating the rate. Charge the input price only on the fresh (non-cached) remainder, resolve cache prices across provider alias keys (falling back to input price so cache tokens are never free), and compute the hit rate as cached / input.
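The corrected arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: the type names, field names, and per-token price shape are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
type TokenUsage = {
  inputTokens: number; // total prompt tokens, inclusive of cache tokens
  cacheReadTokens: number;
  cacheCreationTokens: number;
};

type PricePerToken = {
  input: number;
  cacheRead?: number;
  cacheCreation?: number;
};

function inputSideCost(usage: TokenUsage, price: PricePerToken) {
  // Only the fresh (non-cached) remainder is charged at the input price.
  const fresh = Math.max(
    0,
    usage.inputTokens - usage.cacheReadTokens - usage.cacheCreationTokens
  );
  // Fall back to the input price so cache tokens are never free.
  const cacheReadPrice = price.cacheRead ?? price.input;
  const cacheCreationPrice = price.cacheCreation ?? price.input;
  return {
    inputCost: fresh * price.input,
    cachedCost: usage.cacheReadTokens * cacheReadPrice,
    cacheCreationCost: usage.cacheCreationTokens * cacheCreationPrice,
  };
}

// Hit rate is cached reads over total input; input already includes the cached reads.
function cacheHitRate(usage: TokenUsage): number {
  return usage.inputTokens === 0 ? 0 : usage.cacheReadTokens / usage.inputTokens;
}
```

For example, with 1000 input tokens of which 800 are cache reads, an input price of 2 and a cache-read price of 1, the fresh remainder is 200 tokens, so the input line is 400 and the cache line is 800. The hit rate is 0.8.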
The prompt detail Metrics tab now shows a Cached tokens total, a cache hit-rate-over-time chart, and a cached-tokens-over-time chart, matching the model detail page. Avg input cost and input cost per 1k now include the cache-read and cache-creation cost lines so they reflect total input spend rather than the fresh-input cost alone.
…add 1h cache alias
Scope getModelUsageSparklines by project_id alongside environment_id so it matches the other project-scoped queries and lets ClickHouse use the organization/project/environment key prefix. Add input_cache_creation_1h to the cache-creation price aliases so a model that defines only the 1h key is not dropped to the input price (no current model is affected; the base/5m alias still resolves first).
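Alias resolution of this kind can be sketched as a first-match lookup with a fallback. Only input_cache_creation_1h is named in the commit. The other two keys below are placeholders for the base and 5m aliases, and resolvePrice is a hypothetical helper.

```typescript
// Return the first alias present in the price table, else the fallback.
function resolvePrice(
  prices: Record<string, number>,
  aliases: readonly string[],
  fallback: number
): number {
  for (const key of aliases) {
    const value = prices[key];
    if (value !== undefined) return value;
  }
  return fallback;
}

// Placeholder names for the base/5m keys; the 1h key is listed last so it resolves last.
const cacheCreationAliases = [
  "input_cache_creation",
  "input_cache_creation_5m",
  "input_cache_creation_1h",
] as const;

// A model that defines only the 1h key now resolves to it rather than the input price.
console.log(resolvePrice({ input_cache_creation_1h: 6 }, cacheCreationAliases, 3)); // 6
```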
The span API ai object now returns cachedCost and cacheCreationCost alongside inputCost/outputCost/totalCost. Since inputCost covers only the non-cached input, these fields let consumers reconstruct the full cost breakdown for prompt-cached calls instead of seeing an unexplained gap below totalCost.
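A consumer-side sketch of the reconstruction, assuming totalCost is the sum of the four lines and that the cache fields may be absent on calls without prompt caching. Both assumptions, and the SpanAiCosts type itself, are illustrative and not confirmed by the PR.

```typescript
// Hypothetical view of the cost fields on the span API ai object.
type SpanAiCosts = {
  inputCost: number; // non-cached input only
  outputCost: number;
  totalCost: number;
  cachedCost?: number;
  cacheCreationCost?: number;
};

function costBreakdown(ai: SpanAiCosts) {
  const accounted =
    ai.inputCost + (ai.cachedCost ?? 0) + (ai.cacheCreationCost ?? 0) + ai.outputCost;
  // With the cache lines included, nothing should be left unexplained below totalCost.
  return { accounted, unexplained: ai.totalCost - accounted };
}

console.log(
  costBreakdown({ inputCost: 2, cachedCost: 1, cacheCreationCost: 3, outputCost: 4, totalCost: 10 })
); // { accounted: 10, unexplained: 0 }
```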
When TimeFilter is used in controlled mode (onValueChange provided), it now takes period/from/to only from props instead of falling back to the URL search params. Selecting a custom date range in the model detail panel (which sets period to undefined) no longer reverts the filter display to the page-level URL period.
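The controlled-mode rule can be sketched as a pure function. The prop names period, from, to, and onValueChange come from the commit message. The function, the types, and the URL parameter names are assumptions, not the component's real signature.

```typescript
type TimeRange = { period?: string; from?: string; to?: string };
type TimeFilterProps = TimeRange & { onValueChange?: (value: TimeRange) => void };

function resolveTimeRange(props: TimeFilterProps, search: URLSearchParams): TimeRange {
  if (props.onValueChange) {
    // Controlled: props are the only source of truth, even when period is undefined.
    return { period: props.period, from: props.from, to: props.to };
  }
  // Uncontrolled: fall back to the URL search params.
  return {
    period: props.period ?? search.get("period") ?? undefined,
    from: props.from ?? search.get("from") ?? undefined,
    to: props.to ?? search.get("to") ?? undefined,
  };
}

// A custom range in a controlled filter no longer picks up the page-level period.
const pageSearch = new URLSearchParams("period=7d");
const controlled = resolveTimeRange(
  { from: "2025-01-01", to: "2025-01-03", onValueChange: () => {} },
  pageSearch
);
console.log(controlled.period); // undefined
```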
Summary
The Models page is now split into two tabs. Your models shows the models your project has actually used in the selected time range, with usage charts (cost over time, tokens over time, calls by model), a per-model table of calls / cost / avg TTFC / avg tokens-per-sec, and calls/tokens trend sparklines. Model library is the full catalog, reordered from alphabetical to a relevance-based provider order (Anthropic, OpenAI, Google, then the rest), newest models first within each provider, with a "New" badge on models released in the last 7 days.
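The library ordering described above can be sketched as a three-key sort. The CatalogModel shape and helper names are hypothetical. The summary does not say how the remaining providers are ordered, so the alphabetical tie-break between them is an assumption.

```typescript
type CatalogModel = { name: string; provider: string; releasedAt: Date };

const PROVIDER_ORDER = ["anthropic", "openai", "google"];
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Listed providers rank by position; every other provider ranks after them.
function providerRank(provider: string): number {
  const index = PROVIDER_ORDER.indexOf(provider.toLowerCase());
  return index === -1 ? PROVIDER_ORDER.length : index;
}

function sortLibrary(models: readonly CatalogModel[]): CatalogModel[] {
  return [...models].sort(
    (a, b) =>
      providerRank(a.provider) - providerRank(b.provider) ||
      a.provider.localeCompare(b.provider) || // assumed order for "the rest"
      b.releasedAt.getTime() - a.releasedAt.getTime() // newest first within a provider
  );
}

// "New" badge: released within the last 7 days.
function isNew(model: CatalogModel, now: Date): boolean {
  const ageMs = now.getTime() - model.releasedAt.getTime();
  return ageMs >= 0 && ageMs <= SEVEN_DAYS_MS;
}
```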
One time-range selector drives the whole Your models tab, so the charts, the table, and the sparklines all share the same window. Opening a model shows its own metrics with an independent range picker and a "View in AI metrics" link that opens the AI metrics dashboard filtered to that model. The active tab is kept in the URL so it survives a refresh and is shareable.
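Keeping the active tab in the URL can be sketched with the standard URL and URLSearchParams APIs. The parameter name tab, the tab values, and the choice of default tab are assumptions for illustration.

```typescript
type ModelsTab = "your-models" | "library";

// Unknown or missing values fall back to an assumed default tab.
function readTab(search: URLSearchParams): ModelsTab {
  return search.get("tab") === "library" ? "library" : "your-models";
}

// Returns a shareable URL with the tab set and all other params preserved.
function withTab(url: string, tab: ModelsTab): string {
  const next = new URL(url);
  next.searchParams.set("tab", tab);
  return next.toString();
}

console.log(withTab("https://example.com/models?period=7d", "library"));
// "https://example.com/models?period=7d&tab=library"
```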
Prompt caching & cost accuracy
Both the Your models tab and the AI metrics dashboard now surface prompt-cache usage: a cache-savings column plus per-model cached-tokens and cache-hit-rate views, and a caching section on the dashboard (hit rate, cached tokens, estimated savings, and hit rate by model).
Building this surfaced a cost bug. input_tokens is the total prompt count and already includes cache-read and cache-creation tokens, but the cost pipeline charged the full input at the input price and then added a separate cache line, so cached tokens were billed twice (and on Anthropic, cache reads were never discounted because their price is keyed differently). The input price now applies only to the non-cached remainder, with cache prices resolved across the provider-specific keys, so LLM cost and the cache hit-rate metric are accurate. Hit rate is computed as cached reads over total input.
Notes
Also fixes React "invalid DOM property" console warnings from the provider icons (the Llama and DeepSeek SVGs used raw fill-rule/clip-rule/clip-path attributes), which this page surfaces by rendering more provider icons.
Screenshots
Your models tab: usage charts and a per-model table with calls/tokens trend sparklines.
Model library: provider-relevance ordering with a "New" badge on models released in the last 7 days.
Model detail, Metrics tab: per-model range picker and a "View in AI metrics" link.
View in AI metrics: the dashboard deep-linked and filtered to the selected model.