fix: handle binary files in agent_toolset read tool #1727

Open
okxint wants to merge 2 commits into anthropics:main from okxint:fix/agent-toolset-binary-read

okxint commented Jul 2, 2026


Summary

beta_read_tool calls target.read_text() unconditionally. Binary files (JPEG, PNG, GIF, WebP, PDF) raise UnicodeDecodeError at that call, which surfaces to the model as a raw tool error — even though the tool-result type already supports image and document content blocks.
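The failure is easy to reproduce outside the toolset. A minimal sketch, assuming UTF-8 decoding (the encoding is passed explicitly here so the result does not depend on the locale); `try_read_text` is a hypothetical helper for illustration, not code from this PR:

```python
import tempfile
from pathlib import Path

# The 8-byte signature that starts every PNG file.
# 0x89 is not a valid UTF-8 start byte, so decoding fails on the first byte.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def try_read_text(path: Path) -> str:
    """Return the decoded text, or the name of the decode error raised."""
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        return type(exc).__name__


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "slide.png"
    target.write_bytes(PNG_SIGNATURE)
    result = try_read_text(target)

print(result)
```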

This bites any self-hosted CMA agent that needs to read an image or PDF: rendered slides, screenshots, attached documents. The hosted product and Claude Code both handle binary reads; this brings the self-hosted toolset in line.

Fix

Detect binary files by extension before attempting read_text(). For known image and PDF suffixes, read the raw bytes and return a base64 content block:

  • .jpg/.jpeg/.png/.gif/.webp -> {"type": "image", "source": {"type": "base64", "media_type": "...", "data": "..."}}
  • .pdf -> {"type": "document", "source": {"type": "base64", "media_type": "application/pdf", "data": "..."}}
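The mapping above could be sketched roughly as follows. This is an illustration, not the code in the diff: `IMAGE_MEDIA_TYPES` and `binary_content_block` are hypothetical names, the image media types are the standard MIME types for each suffix, and case-insensitive suffix matching is an assumption.

```python
import base64
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical suffix table; the PR may organize this differently.
IMAGE_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def binary_content_block(path: Path) -> Optional[dict]:
    """Build a base64 content block for a known binary suffix, else None."""
    suffix = path.suffix.lower()  # assumption: match suffixes case-insensitively
    if suffix in IMAGE_MEDIA_TYPES:
        block_type, media_type = "image", IMAGE_MEDIA_TYPES[suffix]
    elif suffix == ".pdf":
        block_type, media_type = "document", "application/pdf"
    else:
        return None  # caller falls back to the existing read_text() path
    data = base64.b64encode(path.read_bytes()).decode("ascii")
    return {
        "type": block_type,
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "screenshot.png"
    image.write_bytes(b"\x89PNG\r\n\x1a\n")
    notes = Path(tmp) / "notes.txt"
    notes.write_text("plain text", encoding="utf-8")
    image_block = binary_content_block(image)
    text_block = binary_content_block(notes)
```

Returning `None` for unknown suffixes keeps the text path untouched, which matches the stated goal of leaving existing behavior unchanged.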

Binary files get their own size cap (DEFAULT_MAX_BINARY_BYTES = 5 MiB, separate from the 256 KiB text cap) since images regularly exceed the text limit. view_range is rejected for binary files with an explicit error.
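The two guards could look something like the sketch below. Only `DEFAULT_MAX_BINARY_BYTES` and its 5 MiB value come from this PR; `binary_read_error`, the tuple shape of `view_range`, the error wording, and treating the cap as inclusive are assumptions for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple

DEFAULT_MAX_BINARY_BYTES = 5 * 1024 * 1024  # 5 MiB, the cap named in this PR


def binary_read_error(
    path: Path,
    view_range: Optional[Tuple[int, int]] = None,  # assumed shape: (start, end)
    max_bytes: int = DEFAULT_MAX_BINARY_BYTES,
) -> Optional[str]:
    """Return an error message if a binary read must be rejected, else None."""
    if view_range is not None:
        # Line ranges have no meaning for raw bytes, so refuse explicitly.
        return f"view_range is not supported for binary file {path.name}"
    size = path.stat().st_size
    if size > max_bytes:  # assumption: a file exactly at the cap is allowed
        return f"{path.name} is {size} bytes, over the {max_bytes}-byte binary limit"
    return None


with tempfile.TemporaryDirectory() as tmp:
    pdf = Path(tmp) / "deck.pdf"
    pdf.write_bytes(b"%PDF-1.7\n" + b"\0" * 100)
    ok = binary_read_error(pdf)
    ranged = binary_read_error(pdf, view_range=(1, 10))
    too_big = binary_read_error(pdf, max_bytes=16)
```

Checking the size with `stat()` before reading avoids loading an oversized file into memory just to reject it.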

Text files and unknown extensions go through the existing read_text() path unchanged.

Tests

  • Parametrized across all 6 supported extensions/formats — verifies the correct type, media_type, and round-tripped base64 payload
  • view_range rejection for binary files
  • Oversized binary file rejection
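For reviewers, the per-format assertions amount to something like the check below. This is a sketch of the shape of the test, not the test file in the diff: `CASES` and `check_block` are hypothetical names, and the blocks are built by hand here rather than returned by the tool. The image media types are the standard MIME types for each suffix.

```python
import base64

# (suffix, expected block type, expected media type): six cases in total.
CASES = [
    (".jpg", "image", "image/jpeg"),
    (".jpeg", "image", "image/jpeg"),
    (".png", "image", "image/png"),
    (".gif", "image", "image/gif"),
    (".webp", "image", "image/webp"),
    (".pdf", "document", "application/pdf"),
]


def check_block(block: dict, expected_type: str, expected_media_type: str, raw: bytes) -> None:
    """Assert the block has the right type, media type, and round-tripped bytes."""
    assert block["type"] == expected_type
    source = block["source"]
    assert source["type"] == "base64"
    assert source["media_type"] == expected_media_type
    assert base64.b64decode(source["data"]) == raw


raw = b"\x00\x01 binary payload \xff"
checked = 0
for suffix, block_type, media_type in CASES:
    # Stand-in for the tool result; a real test would read a file with this suffix.
    block = {
        "type": block_type,
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(raw).decode("ascii"),
        },
    }
    check_block(block, block_type, media_type, raw)
    checked += 1
```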

Fixes #1637

okxint added 2 commits July 2, 2026 11:14
beta_read_tool called target.read_text() unconditionally, which raises
UnicodeDecodeError on binary files (images, PDFs). The error surfaced to
the model as a raw tool failure even though the tool-result type already
supports image/document content blocks.

For known image extensions (.jpg, .jpeg, .png, .gif, .webp) and .pdf,
read the raw bytes and return a base64-encoded content block instead:
- images -> {"type": "image", "source": {"type": "base64", ...}}
- PDFs   -> {"type": "document", "source": {"type": "base64", ...}}

Binary files get their own size cap (DEFAULT_MAX_BINARY_BYTES = 5 MiB)
since images routinely exceed the 256 KiB text limit. view_range is
rejected for binary files with a clear error message.

Fixes anthropics#1637
okxint requested a review from a team as a code owner July 2, 2026 05:47

Development

Successfully merging this pull request may close these issues.

Self-hosted agent_toolset read tool raises UnicodeDecodeError on binary files (images/PDFs) instead of returning content blocks