fix: handle binary files in agent_toolset read tool #1727
Open
okxint wants to merge 2 commits into
beta_read_tool called target.read_text() unconditionally, which raises
UnicodeDecodeError on binary files (images, PDFs). The error surfaced to
the model as a raw tool failure even though the tool-result type already
supports image/document content blocks.
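The failure is easy to reproduce outside the toolset. The sketch below is not the toolset's code: the helper name try_read_text and the file name shot.png are hypothetical, and it assumes the text is decoded as UTF-8 (the encoding is passed explicitly here so the result does not depend on the platform default).

```python
import tempfile
from pathlib import Path

# The 8-byte PNG signature. Its first byte, 0x89, cannot start a UTF-8 sequence.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def try_read_text(target: Path) -> str:
    """Hypothetical helper: return the text, or a description of the decode failure."""
    try:
        # Assumption: UTF-8 decoding, made explicit for reproducibility.
        return target.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        return f"UnicodeDecodeError: {exc.reason}"


with tempfile.TemporaryDirectory() as tmp:
    png = Path(tmp) / "shot.png"
    png.write_bytes(PNG_SIGNATURE)
    result = try_read_text(png)
    print(result)
```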
For known image extensions (.jpg, .jpeg, .png, .gif, .webp) and .pdf,
read the raw bytes and return a base64-encoded content block instead:
- images -> {"type": "image", "source": {"type": "base64", ...}}
- PDFs -> {"type": "document", "source": {"type": "base64", ...}}
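The mapping above could be sketched roughly as follows. This illustrates the block shapes only and is not the code in this PR: the helper name binary_content_block and the MEDIA_TYPES table are hypothetical, and matching suffixes case-insensitively is an assumption.

```python
import base64
from pathlib import Path
from typing import Any, Dict, Optional

# Suffix -> media type for the extensions listed above (hypothetical table name).
MEDIA_TYPES: Dict[str, str] = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".pdf": "application/pdf",
}


def binary_content_block(name: str, raw: bytes) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: build a base64 block, or None for other suffixes."""
    # Assumption: suffix matching ignores case, so "report.PDF" counts as a PDF.
    media_type = MEDIA_TYPES.get(Path(name).suffix.lower())
    if media_type is None:
        return None  # the caller would fall back to the text path
    block_type = "document" if media_type == "application/pdf" else "image"
    return {
        "type": block_type,
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.standard_b64encode(raw).decode("ascii"),
        },
    }


pdf_block = binary_content_block("report.PDF", b"%PDF-1.7\n")
png_block = binary_content_block("shot.png", b"\x89PNG\r\n\x1a\n")
text_block = binary_content_block("notes.txt", b"hello")
```

Returning None for unknown suffixes keeps the decision in one place: anything the table does not list stays on the existing text path.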
Binary files get their own size cap (DEFAULT_MAX_BINARY_BYTES = 5 MiB)
since images routinely exceed the 256 KiB text limit. view_range is
rejected for binary files with a clear error message.
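A minimal sketch of the two guards described above, under stated assumptions: only DEFAULT_MAX_BINARY_BYTES is named in this PR, while the function check_binary_read, the constant DEFAULT_MAX_TEXT_BYTES, the error wording, and treating a file of exactly 5 MiB as allowed are all hypothetical.

```python
from typing import Optional, Tuple

DEFAULT_MAX_BINARY_BYTES = 5 * 1024 * 1024  # 5 MiB, as stated above
DEFAULT_MAX_TEXT_BYTES = 256 * 1024  # 256 KiB text limit; this name is hypothetical


def check_binary_read(
    size: int, view_range: Optional[Tuple[int, int]] = None
) -> Optional[str]:
    """Hypothetical guard: return an error message, or None if the read may proceed."""
    if view_range is not None:
        # Line ranges have no meaning for an image or a PDF.
        return "view_range is not supported for binary files"
    if size > DEFAULT_MAX_BINARY_BYTES:  # assumption: the cap itself is allowed
        return f"binary file is {size} bytes; limit is {DEFAULT_MAX_BINARY_BYTES} bytes"
    return None


ok = check_binary_read(1_000_000)  # above the text limit, below the binary cap
too_big = check_binary_read(DEFAULT_MAX_BINARY_BYTES + 1)
ranged = check_binary_read(1_000, view_range=(1, 10))
```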
Fixes anthropics#1637
Summary
beta_read_tool calls target.read_text() unconditionally. Binary files (JPEG, PNG, GIF, WebP, PDF) raise UnicodeDecodeError at that call, which surfaces to the model as a raw tool error, even though the tool-result type already supports image and document content blocks.
This bites any self-hosted CMA agent that needs to read an image or PDF: rendered slides, screenshots, attached documents. The hosted product and Claude Code both handle binary reads; this brings the self-hosted toolset in line.
Fix
Detect binary files by extension before attempting read_text(). For known image and PDF suffixes, read the raw bytes and return a base64 content block:
- .jpg / .jpeg / .png / .gif / .webp → {"type": "image", "source": {"type": "base64", "media_type": "...", "data": "..."}}
- .pdf → {"type": "document", "source": {"type": "base64", "media_type": "application/pdf", "data": "..."}}
Binary files get their own size cap (DEFAULT_MAX_BINARY_BYTES = 5 MiB, separate from the 256 KiB text cap) since images regularly exceed the text limit. view_range is rejected for binary files with an explicit error.
Text files and unknown extensions go through the existing read_text() path unchanged.
Tests
- type, media_type, and round-tripped base64 payload
- view_range rejection for binary files
Fixes #1637