Ai SDK updates, stability and quality fine tune by abose · Pull Request #2916 · phcode-dev/phoenix

abose · 2026-05-15T14:04:59Z

No description provided.

The JS SDK was split out of @anthropic-ai/claude-code in v2 and moved to @anthropic-ai/claude-agent-sdk. The CLI binary still ships under @anthropic-ai/claude-code (used by Phoenix's terminal "claude" command); the SDK now lives in its own package with the same query() signature and SDKResultMessage shape, so the rest of claude-code-agent.js is unchanged beyond the import swap.

- @anthropic-ai/claude-code: ^1.0.0 → ^2.1.118
- @anthropic-ai/claude-agent-sdk: new ^0.2.126
- zod: ^3.25.76 → ^4.0.0

Lock files regenerated by npm install.

Two complaints about the takeScreenshot tool:

1. Claude was sometimes shooting the whole editor window when the user obviously meant "how does the rendered preview look".
2. The user-visible label said "Screenshot of full page", which was misleading: the default capture is the entire Phoenix editor app window (toolbar + sidebar + code area + panels), not a rendered web page.

Tightening on both fronts:

- mcp-editor-tools.js: tool description now starts "ALMOST ALWAYS pass a selector — capturing the full editor returns a busy image full of editor chrome that's hard to reason about" and explicitly lists the trigger phrases ("how does it look", "is the page rendering", "check the preview") that should fire #panel-live-preview-frame. Selector schema unchanged: the IDs were already documented, the recommendation just had to be louder.
- claude-code-agent.js system prompt: same trigger-phrase nudge plus a last-resort line: when the user asks about something in the editor you can't identify from getEditorState, take a screenshot (no selector) to see what they're looking at.
- strings.js: rename AI_CHAT_TOOL_SCREENSHOT_FULL_PAGE to AI_CHAT_TOOL_SCREENSHOT_FULL_EDITOR with value "the full editor" so the chat bubble reads "Screenshot of the full editor" instead of the misleading "Screenshot of full page".

execJsInLivePreview was in the "no timeout" bucket on the rationale that user-supplied JS can legitimately take a while. The downside: when the live preview is wedged or just slow to settle, the agent hangs indefinitely on "Inspecting preview" with no way to recover.

Add a timeoutMs parameter to the tool. The model picks a value that fits the snippet it's running; the call is raced against the timeout and either resolves or surfaces a deterministic timeout error so the agent can move on.

- _execPeerWithTimeout now accepts an optional overrideMs that wins over the static EXEC_PEER_TIMEOUT_MS map.
- _resolveCallerTimeout floors at 5000ms (no point picking a tighter timeout, since the preview frame may still be settling). No upper limit; a user can legitimately request a long-running inspection.
- Default 10000ms when timeoutMs is omitted.
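
The floor, default, and race behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `resolveCallerTimeout`, `raceWithTimeout`, and both constants are hypothetical names chosen for the example, and only the numbers (5000ms floor, 10000ms default, no upper limit) come from the commit message.

```javascript
// Hypothetical constants mirroring the numbers in the commit message.
const MIN_TIMEOUT_MS = 5000;      // floor: the preview frame may still be settling
const DEFAULT_TIMEOUT_MS = 10000; // used when the caller omits timeoutMs

// Sketch of the floor/default rule (not the actual _resolveCallerTimeout).
function resolveCallerTimeout(timeoutMs) {
    if (typeof timeoutMs !== "number" || !Number.isFinite(timeoutMs)) {
        return DEFAULT_TIMEOUT_MS; // omitted or unusable value
    }
    return Math.max(MIN_TIMEOUT_MS, timeoutMs); // floor only, no upper limit
}

// Race a pending operation against a timer. The returned promise settles with
// the operation's result, or rejects with a timeout error that has a
// predictable message so the caller can report it and move on.
function raceWithTimeout(promise, ms, label) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms}ms`)),
            ms
        );
    });
    // Clear the timer either way so it does not keep the process alive.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A caller would combine the two, for example `raceWithTimeout(runSnippet(code), resolveCallerTimeout(args.timeoutMs), "execJsInLivePreview")`, where `runSnippet` stands in for whatever actually evaluates the snippet in the preview.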

…rome-devtools

Two adjacent failure modes the previous prompt didn't catch:

1. When the user asked editor-context questions ("can you see the open file", "in this page", "on the screen"), Claude sometimes answered from generic Claude knowledge ("No, I don't have visibility into your editor") instead of calling getEditorState.
2. When Claude DID decide it needed to see the page, it sometimes reached for the user's chrome-devtools MCP (which opens a fresh separate browser session) instead of phoenix-editor.takeScreenshot (which captures the live preview inside Phoenix, reflecting the user's actual unsaved edits).

Add an upfront block to the system prompt that:

- Explicitly flags the "I can't see..." refusal as an anti-pattern: call getEditorState / takeScreenshot / execJsInLivePreview instead.
- Explicitly steers ALL preview interactions (screenshots, JS eval, DOM inspection, viewport resize, reload) to the phoenix-editor MCP rather than chrome-devtools or other browser MCPs. Only fall back to a non-Phoenix browser context when the user explicitly asks.

…itorState fallback hint

Bundled set of editor-tool refinements driven by support traces:

- getEditorState now reports inDesignMode (true when the code editor is hidden and the live preview is expanded full-bleed). The model needs this to interpret "I can't see your code edits" cases: design mode hides the code view, and that's a separate fact from the active file.
- getEditorState response now appends a fallback hint pointing the model at takeScreenshot (no selector) for any UI question the JSON doesn't answer (e.g. "what's in the Problems panel" / "what does this sidebar say", which concern UI panels not represented in the state object).
- takeScreenshot description rewritten to a clean binary rule: "rendered live preview" → selector='#panel-live-preview-frame', "anything else (Problems panel, file tree, toolbar, any Phoenix UI)" → no selector. The earlier "ALMOST ALWAYS pass a selector" framing was over-discouraging the legitimate no-selector case where the user is asking about Phoenix UI.
- controlEditor gains a toggleDesignMode operation with an `enabled` boolean. The op description spells out what design mode is (full live preview, code editor hidden, content-focused browser-like view) so the model picks it for the right intents.
- System prompt: explains what the live preview actually is (the rendered view of the active HTML/CSS/JS/SVG/Markdown file), so Claude has a mental model before reading the per-tool sections.
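
To make the binary screenshot rule and the new toggleDesignMode operation concrete, here is a small sketch of the argument objects involved. It is illustrative only: `screenshotArgsFor`, `toggleDesignModeArgs`, the `"preview"` target label, and the `operation` field name are assumptions made for this example; the PR text only confirms the `#panel-live-preview-frame` selector, the no-selector default, and the `enabled` boolean.

```javascript
// Selector named in the tool description for the rendered live preview.
const LIVE_PREVIEW_SELECTOR = "#panel-live-preview-frame";

// Binary rule: questions about the rendered preview get the preview selector;
// anything else (Problems panel, file tree, toolbar, any Phoenix UI) gets no
// selector, which captures the full editor window.
// The "preview" label is a hypothetical way to name the intent.
function screenshotArgsFor(target) {
    if (target === "preview") {
        return { selector: LIVE_PREVIEW_SELECTOR };
    }
    return {};
}

// Hypothetical payload shape for the controlEditor toggleDesignMode operation.
// Only the `enabled` boolean is confirmed by the commit message; the
// `operation` key is an assumed field name.
function toggleDesignModeArgs(enabled) {
    return { operation: "toggleDesignMode", enabled: Boolean(enabled) };
}
```

Under these assumptions, "how does the page look" would map to `screenshotArgsFor("preview")`, while "what's in the Problems panel" would map to an empty argument object.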

sonarqubecloud · 2026-05-15T16:42:09Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

abose added 7 commits May 15, 2026 18:52

chore: update pro deps

c2cf5fd

chore: update pro deps

3503233

abose wants to merge 7 commits into main from ai
