Honor AI_AGENT and pass raw values through#1454

Open
renaudhartert-db wants to merge 1 commit into main from ai-agent-env-var

@renaudhartert-db (Contributor) commented May 30, 2026

Why

The Python SDK detects AI coding agents and surfaces them as agent/<name> in the User-Agent. Today the generic fallback (when no proprietary env var fires) only honors the agents.md AGENT=<name> standard. Vercel's @vercel/detect-agent library uses a parallel AI_AGENT=<name> convention that tools in the Vercel ecosystem set instead; we currently miss those.

Separately, the existing fallback coerces any unrecognized value to the literal string "unknown". That discards useful information: a tool setting AI_AGENT=claude-code_2-1-141_agent ends up as agent/unknown, losing the tool name and version variant we want to see. Bucketing arbitrary names is an ETL concern, not the SDK's.

This mirrors the Go SDK change in databricks/databricks-sdk-go#1683.

Changes

Two behavior changes in databricks/sdk/useragent.py:

  1. AI_AGENT fallback. Add AI_AGENT=<name> as a secondary fallback after AGENT=<name>. AGENT wins when both are set to non-empty values; empty is treated as unset for both. Explicit product matchers (e.g. CLAUDECODE) still always win over both.

  2. Raw passthrough instead of "unknown". Drop the known-product lookup in the fallback. The value is sanitized (disallowed chars become -, satisfying the User-Agent allowlist [0-9A-Za-z_.+-]+) and capped at 64 chars to keep the header bounded. Known products like cursor or claude-code pass through unchanged because they already satisfy the allowlist.
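The precedence and sanitization rules above can be sketched roughly as follows. This is a minimal illustration, not the code in databricks/sdk/useragent.py: the names `sanitize_agent`, `detect_agent`, and `_EXPLICIT_MATCHERS`, and the mapping from env var to product name, are assumptions made for the example.

```python
import re
from typing import Mapping, Optional

# Hypothetical names; the real helpers in useragent.py may differ.
_DISALLOWED = re.compile(r"[^0-9A-Za-z_.+-]")  # complement of the User-Agent allowlist
_MAX_LEN = 64

# Assumed, illustrative subset of explicit matchers: env var -> product name.
_EXPLICIT_MATCHERS = {"CLAUDECODE": "claude-code", "CURSOR_AGENT": "cursor"}


def sanitize_agent(value: str) -> str:
    """Replace each disallowed char with '-' and cap the result at 64 chars."""
    return _DISALLOWED.sub("-", value)[:_MAX_LEN]


def detect_agent(env: Mapping[str, str]) -> Optional[str]:
    """Return the agent name for the User-Agent, or None if nothing is set."""
    # 1. Explicit product matchers always win over the generic fallbacks.
    for var, product in _EXPLICIT_MATCHERS.items():
        if env.get(var):
            return product
    # 2. Generic fallbacks: AGENT first, then AI_AGENT; empty counts as unset.
    for var in ("AGENT", "AI_AGENT"):
        raw = env.get(var, "")
        if raw:
            return sanitize_agent(raw)
    return None
```

Under these assumptions, `detect_agent({"AGENT": "", "AI_AGENT": "my tool/1.0"})` falls through the empty AGENT and yields `my-tool-1.0`, while a value such as `claude-code_2-1-141_agent` passes through unchanged because it already satisfies the allowlist.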

The same change is landing in databricks-sdk-java as a sibling PR.

Test plan

  • pytest tests/test_user_agent.py passes (54 tests)
  • ruff format / ruff check clean
  • AI_AGENT=<known product> returns the product name
  • AI_AGENT=<unrecognized> returns the raw sanitized value (no longer "unknown")
  • AGENT wins over AI_AGENT when both are non-empty
  • Empty AGENT falls through to AI_AGENT
  • Disallowed chars in AGENT / AI_AGENT are sanitized to -
  • Values longer than 64 chars are truncated
  • Explicit matcher (e.g. CLAUDECODE) still wins over both fallbacks
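As a rough illustration, the fallback cases in this plan can be written as a table of (environment, expected) pairs. The sketch below is not the contents of tests/test_user_agent.py: `fallback_agent` is a hypothetical stand-in for the SDK's fallback logic, `some-agent` is a made-up value, and the explicit-matcher case is left out because it depends on the SDK's real matcher list.

```python
import re


def fallback_agent(env):
    """Hypothetical stand-in for the generic fallback: AGENT, then AI_AGENT."""
    for var in ("AGENT", "AI_AGENT"):
        raw = env.get(var, "")
        if raw:  # empty is treated as unset
            return re.sub(r"[^0-9A-Za-z_.+-]", "-", raw)[:64]
    return None


# (environment, expected agent name) pairs mirroring the bullets above.
CASES = [
    ({"AI_AGENT": "cursor"}, "cursor"),                       # known product
    ({"AI_AGENT": "claude-code_2-1-141_agent"},
     "claude-code_2-1-141_agent"),                            # raw passthrough
    ({"AGENT": "some-agent", "AI_AGENT": "cursor"}, "some-agent"),  # AGENT wins
    ({"AGENT": "", "AI_AGENT": "cursor"}, "cursor"),          # empty falls through
    ({"AGENT": "my agent/1"}, "my-agent-1"),                  # sanitized to '-'
    ({"AI_AGENT": "a" * 100}, "a" * 64),                      # truncated at 64
    ({}, None),                                               # nothing set
]


def run_cases():
    """Return one boolean per case; all True means every case holds."""
    return [fallback_agent(env) == expected for env, expected in CASES]
```

The same table could be fed to `pytest.mark.parametrize`, with the environment applied through `monkeypatch.setenv`, if the real tests read from `os.environ`.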

@renaudhartert-db changed the title from "Add AI_AGENT fallback and sanitized passthrough for agent detection" to "Honor AI_AGENT and pass raw values through" on May 30, 2026

Mirror databricks-sdk-go PR #1683 in the User-Agent agent detection.

Add AI_AGENT (the Vercel @vercel/detect-agent convention) as a secondary
fallback env var, consulted only when AGENT (the agents.md standard) is
unset or empty. AGENT takes precedence when both are non-empty. Explicit
product-specific env vars (CLAUDECODE, CURSOR_AGENT, etc.) still win over
both.

Change the fallback behavior so an unrecognized value is passed through
rather than coerced to "unknown". The raw value is sanitized to the
User-Agent allowlist (disallowed characters become "-") and capped at 64
characters. This applies to both AGENT and AI_AGENT.
@github-actions

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1454
  • Commit SHA: 6fd4e30686b6fb1babb18e1243c700c97f5aeba6

Checks will be approved automatically on success.
