Description
Found via static analysis of prompt-cache anti-patterns (CacheLint), then confirmed by hand on main @ f0fb1e1.
ProviderTransform.message() decides whether to call applyCaching() using a hard-coded model-name gate (packages/opencode/src/provider/transform.ts:285-297):
if (
  (model.providerID === "anthropic" ||
    model.providerID === "google-vertex-anthropic" ||
    model.providerID === "altimate-backend" ||
    model.api.id.includes("anthropic") ||
    model.api.id.includes("claude") ||
    model.id.includes("anthropic") ||
    model.id.includes("claude") ||
    model.api.npm === "@ai-sdk/anthropic") &&
  model.api.npm !== "@ai-sdk/gateway"
) {
  msgs = applyCaching(msgs, model)
}
But applyCaching() itself already defines cache directives for five providers, not just Anthropic (transform.ts:196-210):
const providerOptions = {
  anthropic: { cacheControl: { type: "ephemeral" } },
  openrouter: { cacheControl: { type: "ephemeral" } },
  bedrock: { cachePoint: { type: "default" } },
  openaiCompatible: { cache_control: { type: "ephemeral" } },
  copilot: { copilot_cache_control: { type: "ephemeral" } },
}
So a cacheable model served through openrouter / openai-compatible / copilot whose id contains neither claude nor anthropic (e.g. a GPT, Gemini, Qwen, or Kimi model routed through those providers) never enters applyCaching() and never gets a cache breakpoint, even though the function clearly intends to cache it.
This is an under-claim: caching silently fails to engage. It is not cache-busting, and Anthropic-named models are unaffected.
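To make the mismatch concrete, here is a minimal standalone sketch that copies the gate condition quoted above into a pure function and evaluates it for two models. The `ModelLike` interface, the `passesNameGate` name, the model ids, and the `npm` value are illustrative assumptions, not taken from the repository; only the boolean condition mirrors the excerpt.

```typescript
// Simplified stand-in for the model shape the gate reads.
// Only the fields referenced by the quoted condition are modeled (assumption).
interface ModelLike {
  id: string
  providerID: string
  api: { id: string; npm: string }
}

// Mirrors the hard-coded condition from the excerpt above.
function passesNameGate(model: ModelLike): boolean {
  return (
    (model.providerID === "anthropic" ||
      model.providerID === "google-vertex-anthropic" ||
      model.providerID === "altimate-backend" ||
      model.api.id.includes("anthropic") ||
      model.api.id.includes("claude") ||
      model.id.includes("anthropic") ||
      model.id.includes("claude") ||
      model.api.npm === "@ai-sdk/anthropic") &&
    model.api.npm !== "@ai-sdk/gateway"
  )
}

// Hypothetical models: same provider, different naming.
const claudeViaOpenRouter: ModelLike = {
  id: "anthropic/claude-example",
  providerID: "openrouter",
  api: { id: "anthropic/claude-example", npm: "example-openrouter-provider" },
}
const qwenViaOpenRouter: ModelLike = {
  id: "qwen/qwen-example",
  providerID: "openrouter",
  api: { id: "qwen/qwen-example", npm: "example-openrouter-provider" },
}

console.log(passesNameGate(claudeViaOpenRouter)) // true
console.log(passesNameGate(qwenViaOpenRouter)) // false
```

Both models sit behind the same provider, yet only the one whose id happens to contain "claude" or "anthropic" would reach applyCaching().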
Impact
For an affected model, the system prefix (system prompt + tool schema + earlier turns) is re-sent at full input price on every turn instead of being read from cache. On long agentic loops that is roughly the usual cached-prefix discount forgone each turn, plus higher TTFB — for exactly the self-hosted / BYO-LLM users the project targets. The blast radius is bounded to non-Anthropic-named models that genuinely support explicit cache_control via openrouter/openai-compatible/copilot.
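As a rough illustration of the cost effect, the sketch below compares prefix input cost with and without cache reads. Every number and name here is hypothetical: the price, the cached-read multiplier, and the token counts are placeholders, the prefix is assumed constant across turns (in practice it grows), and any cache-write surcharge is ignored.

```typescript
// Hypothetical inputs, chosen only to make the arithmetic concrete.
interface PrefixCostInput {
  prefixTokens: number // tokens in the repeated prefix (assumed constant)
  turns: number // number of requests in the session
  inputPricePerMTok: number // price per million fresh input tokens
  cachedReadMultiplier: number // cached-read price as a fraction of the fresh price
}

// Cost when the prefix is billed as fresh input on every turn.
function uncachedPrefixCost(c: PrefixCostInput): number {
  return (c.prefixTokens * c.turns * c.inputPricePerMTok) / 1_000_000
}

// Cost when the first turn is billed in full and later turns read from cache.
// Cache-write surcharges are deliberately ignored in this simplification.
function cachedPrefixCost(c: PrefixCostInput): number {
  const perTurn = (c.prefixTokens * c.inputPricePerMTok) / 1_000_000
  return perTurn + perTurn * (c.turns - 1) * c.cachedReadMultiplier
}

const example: PrefixCostInput = {
  prefixTokens: 20_000,
  turns: 30,
  inputPricePerMTok: 3,
  cachedReadMultiplier: 0.1,
}

console.log("uncached:", uncachedPrefixCost(example).toFixed(3))
console.log("cached:  ", cachedPrefixCost(example).toFixed(3))
```

Under these made-up numbers the uncached session pays several times more for the prefix; the actual ratio depends on the provider's pricing and on how quickly the prefix grows.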
Steps to reproduce
- Configure a cacheable model through openrouter (or openai-compatible / copilot) whose id does not contain claude/anthropic (e.g. an OpenRouter-served model that honors cache_control).
- Run a multi-turn session.
- Observe that no cacheControl/cache_control provider option is stamped on the system/last-user blocks (the applyCaching branch is skipped), so the prefix is billed as fresh input every turn.
Suggested fix (for discussion)
Decouple the applyCaching gate from the model-name list and drive it off an explicit capability/provider-support signal, so the gate matches the set of providers applyCaching already knows how to cache:
- introduce a capabilities.caching flag on the model (the capabilities schema in provider.ts:787 currently has no caching field), populated from the model registry; or
- gate on a per-provider "supports prompt cache" set covering anthropic / openrouter / bedrock / openaiCompatible / copilot (the same five keys already in providerOptions).
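For discussion, the two options above could be combined roughly as follows. This is a sketch under stated assumptions, not the project's code: `shouldApplyCaching`, `GateModel`, the `providerKey` field, and the `capabilities.caching` flag are all hypothetical, and how a model would be mapped to one of the five providerOptions keys is left open. The existing `@ai-sdk/gateway` exclusion is kept as-is.

```typescript
// The five keys already present in providerOptions inside applyCaching().
type CacheProviderKey =
  | "anthropic"
  | "openrouter"
  | "bedrock"
  | "openaiCompatible"
  | "copilot"

const CACHE_CAPABLE_PROVIDERS: ReadonlySet<CacheProviderKey> =
  new Set<CacheProviderKey>([
    "anthropic",
    "openrouter",
    "bedrock",
    "openaiCompatible",
    "copilot",
  ])

// Hypothetical model shape: providerKey and capabilities.caching do not exist today.
interface GateModel {
  api: { npm: string }
  providerKey?: CacheProviderKey
  capabilities?: { caching?: boolean }
}

function shouldApplyCaching(model: GateModel): boolean {
  // Preserve the current gateway exclusion so gateway behavior is unchanged.
  if (model.api.npm === "@ai-sdk/gateway") return false
  // Option 1: an explicit capability flag wins when it is set.
  const flag = model.capabilities?.caching
  if (flag !== undefined) return flag
  // Option 2: otherwise fall back to the provider-support set.
  return (
    model.providerKey !== undefined &&
    CACHE_CAPABLE_PROVIDERS.has(model.providerKey)
  )
}
```

Letting the explicit flag override the provider set would allow the registry to opt out individual models that do not honor cache directives, which seems relevant to the "genuinely support explicit cache_control" caveat in the Impact section.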
I want to flag the design angle rather than send a drive-by PR: this is a hot path, and #891 was deliberately deferred for the same "needs design + careful testing, don't regress gateway cache-hit rates" reason. It also overlaps with #891's goal of having a single source of truth for the cache-control gate. I'm happy to open a PR if a maintainer confirms the preferred shape (capability flag vs provider set) and that emitting these directives for non-Anthropic providers is intended.
Caveat: confirmed present and unguarded on main @ f0fb1e1; line numbers may drift.