feat(usage): flag truncated/refused/failed finish_reasons as has_error#3071

Merged
chrarnoldus merged 7 commits into main from
feat/finish-reason-error-classification
May 7, 2026
@kilo-code-bot kilo-code-bot Bot commented May 6, 2026

Summary

  • New apps/web/src/lib/ai-gateway/finishReason.ts module with two const arrays (NON_ERROR_FINISH_REASONS, ERROR_FINISH_REASONS) covering every distinct finish_reason / stop_reason / Responses API status value observed in production microdollar_usage logs (OpenAI & OpenRouter chat completions, Vercel AI SDK, Anthropic Messages, OpenAI Responses), plus an isErrorFinishReason() helper.
  • Wires isErrorFinishReason(finish_reason) into the hasError calculation of the OpenRouter chat completions stream + string parsers (processUsage.ts), the Anthropic Messages stream + string parsers (processUsage.messages.ts), and the OpenAI Responses stream parser (processUsage.responses.ts). The Responses API string parser's status !== 'completed' rule is left alone — for a non-stream body it is intentionally stricter (flags null/missing status as an error, which isErrorFinishReason does not).
  • Net effect: a 200 OK response that ends with stop_reason: "refusal", or with finish_reason: "length", "content_filter", "failed", "error", etc., is now recorded with has_error: true in microdollar_usage instead of being silently logged as a success. Unrecognised provider strings, unknown, and other stay non-error so that novel upstream values don't spike error rates.
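
For readers skimming the summary, here is a minimal sketch of the shape it describes. It is not the contents of finishReason.ts: the array entries are only the values named in this PR's description and commit messages (the real lists are sourced from production logs and are longer), and `isStrictResponsesError` is a hypothetical name used to contrast the Responses string parser's `status !== 'completed'` rule with the shared helper.

```typescript
// Illustrative subset only; the real arrays in finishReason.ts are longer.
const NON_ERROR_FINISH_REASONS = [
  'stop', 'end_turn', 'tool_use', 'tool_calls', 'tool-calls',
  'stop_sequence', 'completed',
  'unknown', 'other', // catch-alls that deliberately stay non-error
] as const;

const ERROR_FINISH_REASONS = [
  'length', 'max_tokens', 'content_filter', 'refusal',
  'failed', 'error', 'in_progress',
] as const;

// Plain Set lookup; anything not listed as an error is treated as non-error.
const ERROR_SET: ReadonlySet<string> = new Set<string>(ERROR_FINISH_REASONS);

function isErrorFinishReason(reason: string | null | undefined): boolean {
  return reason != null && ERROR_SET.has(reason);
}

// Hypothetical name: the stricter rule kept for non-stream Responses bodies,
// which also flags a null or missing status.
function isStrictResponsesError(status: string | null | undefined): boolean {
  return status !== 'completed';
}
```

The two rules differ only on missing or unlisted values: `isErrorFinishReason(null)` is false, while `isStrictResponsesError(null)` is true.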

Verification

  • Traced each parser path (stream and string) to confirm finish_reason flows from the SSE event / response body into coreProps.hasError for both error and non-error values.
  • Reviewed existing test fixtures (finish_reason: 'stop', approved snapshots with 'end_turn' and 'completed') — all classified as non-error, so existing assertions are unaffected.
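
The traced path can be sketched roughly as follows for the chat completions stream case. This is an assumption-laden illustration rather than the code in processUsage.ts: `lastFinishReason` and `computeHasError` are hypothetical names, the error set is a small subset, and the `statusCode >= 400` rule is a guess at what "status code" means in the existing logic.

```typescript
// Small illustrative subset of the error list.
const ERROR_REASONS: ReadonlySet<string> = new Set<string>(['length', 'content_filter', 'error']);

function isErrorFinishReason(reason: string | null): boolean {
  return reason !== null && ERROR_REASONS.has(reason);
}

// Hypothetical helper: returns the last non-null finish_reason found in an
// SSE body made of `data: {json}` lines terminated by `data: [DONE]`.
function lastFinishReason(sseBody: string): string | null {
  let finishReason: string | null = null;
  for (const line of sseBody.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue;
    try {
      const chunk = JSON.parse(payload) as {
        choices?: { finish_reason?: string | null }[];
      };
      const reason = chunk.choices?.[0]?.finish_reason;
      if (typeof reason === 'string') finishReason = reason;
    } catch {
      // Malformed chunks are skipped in this sketch.
    }
  }
  return finishReason;
}

// Hypothetical combination of the pre-existing signals with the new check.
function computeHasError(statusCode: number, aborted: boolean, finishReason: string | null): boolean {
  return statusCode >= 400 || aborted || isErrorFinishReason(finishReason);
}
```

With this shape, a 200 OK stream whose final chunk carries `finish_reason: "length"` yields `hasError === true`, while one ending in `"stop"` is unchanged.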

Visual Changes

N/A

Reviewer Notes

  • length and max_tokens are classified as errors (truncated output is something product/customers usually want surfaced). If you'd rather keep truncation as success, move those two entries from ERROR_FINISH_REASONS to NON_ERROR_FINISH_REASONS and the classifier test still passes unchanged.
  • unknown and other are non-errors on purpose so brand-new upstream values don't immediately spike error rates; pair with a future Sentry warning on values outside both lists if we want visibility on novel ones.
  • Skipped pnpm typecheck / pnpm test / pnpm format per the request not to run long-running tasks before pushing — please rely on CI.
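
On the note above about pairing the unknown/other policy with a future warning: one possible shape, sketched under assumptions, is a three-way classifier that reports values found in neither list. Nothing here is in this PR. `classifyFinishReason`, `isErrorWithWarning`, and the `warn` callback are hypothetical, the lists are small subsets, and the callback stands in for whatever reporting hook (Sentry or otherwise) would actually be used.

```typescript
type FinishReasonClass = 'error' | 'non_error' | 'unrecognised';

// Illustrative subsets of the two lists.
const NON_ERROR: ReadonlySet<string> = new Set<string>(['stop', 'end_turn', 'completed', 'unknown', 'other']);
const ERROR: ReadonlySet<string> = new Set<string>(['length', 'max_tokens', 'refusal', 'failed']);

function classifyFinishReason(reason: string | null): FinishReasonClass {
  if (reason === null) return 'non_error'; // null stays a non-error catch-all
  if (ERROR.has(reason)) return 'error';
  if (NON_ERROR.has(reason)) return 'non_error';
  return 'unrecognised';
}

// Keeps the has_error decision unchanged (unrecognised values are not errors)
// but surfaces novel values through the supplied callback.
function isErrorWithWarning(
  reason: string | null,
  warn: (message: string) => void = console.warn,
): boolean {
  const cls = classifyFinishReason(reason);
  if (cls === 'unrecognised') warn(`Unrecognised finish_reason: ${reason}`);
  return cls === 'error';
}
```

This keeps error rates stable for new upstream values while still making them visible.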

…has_error

Introduce a shared zod enum + helper (isErrorFinishReason) for the set of
finish_reason / stop_reason / status values we observe across OpenAI chat
completions, OpenRouter, Anthropic Messages API, OpenAI Responses API, and
Vercel AI SDK style responses.

Reasons that indicate truncation, refusal, content filtering, upstream
failure, or an interrupted in_progress stream now flip has_error to true
in all three usage parsers (chat completions, messages, responses string
path). Normal completion reasons (stop, end_turn, tool_use, completed,
stop_sequence, tool_calls/tool-calls) and unclassified catch-alls
(unknown, other, null) keep has_error driven only by status code and
abort signals as before.

kilo-code-bot Bot commented May 6, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (8 files)
  • apps/web/src/app/api/openrouter/[...path]/route.ts
  • apps/web/src/lib/ai-gateway/determineFallbackFeature.ts
  • apps/web/src/lib/ai-gateway/extractPromptInfo.ts
  • apps/web/src/lib/ai-gateway/finishReason.ts
  • apps/web/src/lib/ai-gateway/processUsage.ts
  • services/cloud-agent-next/src/router/handlers/session-prepare.ts
  • services/cloud-agent-next/src/session-prepare.test.ts
  • services/cloud-agent-next/src/workspace.ts

Reviewed by gpt-5.5-2026-04-23 · 1,965,868 tokens

kilo-code-bot Bot added 3 commits May 7, 2026 09:36
The zod schema had no runtime consumer — isErrorFinishReason uses a
plain Set and the pipeline intentionally keeps finish_reason: string |
null end-to-end so unknown upstream values flow through unchanged.
Keeping the const arrays gives us the same type-level safety without
adding a zod dependency nobody calls.
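
As a rough illustration of the "same type-level safety" point in this commit message: an `as const` array can yield a literal union type without zod, while the runtime check still accepts any string. The array is a three-entry subset, and `ErrorFinishReason` and `ERROR_LABELS` are hypothetical names that may not exist in the real module.

```typescript
// Illustrative subset; `as const` preserves the literal types of the entries.
const ERROR_FINISH_REASONS = ['length', 'refusal', 'failed'] as const;

// Union type derived from the array: 'length' | 'refusal' | 'failed'.
type ErrorFinishReason = (typeof ERROR_FINISH_REASONS)[number];

// Compile-time check: adding an entry to the array without adding a label
// here would be a type error.
const ERROR_LABELS: Record<ErrorFinishReason, string> = {
  length: 'truncated output',
  refusal: 'model refusal',
  failed: 'upstream failure',
};

// The runtime lookup stays a plain Set over strings, so unknown upstream
// values pass through as non-errors rather than failing validation.
const ERROR_SET: ReadonlySet<string> = new Set<string>(ERROR_FINISH_REASONS);

function isErrorFinishReason(reason: string | null): boolean {
  return reason !== null && ERROR_SET.has(reason);
}
```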
@kilo-code-bot kilo-code-bot Bot changed the title from "feat(usage): classify finish_reason values and flag error reasons in has_error" to "feat(usage): flag truncated/refused/failed finish_reasons as has_error" May 7, 2026
kilo-code-bot Bot added 3 commits May 7, 2026 11:51
The comment was mentioning specific provider APIs, which is noise when
the list is sourced from production log distinct values. Keep the comment
focused on where the data comes from and the unknown/other policy.
@chrarnoldus chrarnoldus merged commit bcf1cec into main May 7, 2026
13 checks passed
@chrarnoldus chrarnoldus deleted the feat/finish-reason-error-classification branch May 7, 2026 13:51

2 participants