Render Read tool results with pygments via structured payload (closes #170) by cboos · Pull Request #172 · daaain/claude-code-log

cboos · 2026-05-25T08:29:49Z

Closes #170.

Summary

The Read tool's tool_result.content is cat -n formatted (each line prefixed with <line_number><TAB>), but the existing parser's regex only matched the arrow variant (<num>→<content>, used by Edit/Write result snippets). Read entries fell through to the generic ToolResultContent fallback — raw monospace text, no syntax highlighting, no lexer detection, no line-number alignment.

This PR fixes parse_read_output to:

- Prefer toolUseResult.file when present. That field carries byte-clean content (no <num>\t prefix) plus accurate filePath, startLine, numLines, totalLines metadata. Avoids the lossy cat-n round-trip and is the only path that knows totalLines (needed to flag is_truncated correctly).
- Extend the cat-n fallback regex from \s+(\d+)→(.*)$ to \s*(\d+)[\t→](.*)$ so older transcripts (without toolUseResult) still parse, and the existing arrow form for Edit/Write snippets is preserved.
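The effect of the regex change can be checked in isolation. The sketch below uses the two patterns exactly as quoted above; the constant names, the `parse_line` helper, and the sample lines are illustrative assumptions, not code from the repository.

```python
import re

# Patterns as quoted in the PR description; the names are hypothetical.
OLD_PATTERN = re.compile(r"\s+(\d+)→(.*)$")
NEW_PATTERN = re.compile(r"\s*(\d+)[\t→](.*)$")


def parse_line(pattern, line):
    """Return (line_number, content) for a cat-n style line, or None if it does not match."""
    match = pattern.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Sample lines (made up for illustration): tab form for Read, arrow form for Edit/Write.
tab_line = "   775\tdef parse_read_output():"
arrow_line = "   775→def parse_read_output():"

print(parse_line(OLD_PATTERN, tab_line))    # old pattern rejects the tab separator
print(parse_line(NEW_PATTERN, tab_line))    # new pattern accepts it
print(parse_line(NEW_PATTERN, arrow_line))  # arrow form still parses
```

The character class `[\t→]` is what admits both separators, and relaxing `\s+` to `\s*` also lets a line with no leading padding match.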

Read is added to PARSERS_WITH_TOOL_USE_RESULT so the factory passes the structured payload through.
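As a rough illustration of the structured path, the sketch below reads the metadata fields named in this PR (`filePath`, `startLine`, `numLines`, `totalLines`) from a `toolUseResult`-shaped dict. It is not the project's `parse_read_output`: the `ReadResult` class, the function name, the `content` key, the defaults for absent fields, and the definition of `is_truncated` as `num_lines < total_lines` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ReadResult:
    """Hypothetical stand-in for the project's ReadOutput model."""
    file_path: str
    content: str
    start_line: int
    num_lines: int
    total_lines: int
    is_truncated: bool


def parse_structured_read(tool_use_result: Optional[dict[str, Any]]) -> Optional[ReadResult]:
    """Build a ReadResult from toolUseResult.file, or return None so the caller can fall back."""
    if not isinstance(tool_use_result, dict):
        return None
    file_info = tool_use_result.get("file")
    if not isinstance(file_info, dict):
        return None
    content = file_info.get("content")  # key name is an assumption
    if not isinstance(content, str):
        return None

    def int_or(key: str, default: int) -> int:
        # Distinguish an absent field from a legitimate zero.
        value = file_info.get(key)
        return int(value) if value is not None else default

    num_lines = int_or("numLines", len(content.splitlines()))
    total_lines = int_or("totalLines", num_lines)  # assumed default
    return ReadResult(
        file_path=str(file_info.get("filePath", "")),
        content=content,
        start_line=int_or("startLine", 1),
        num_lines=num_lines,
        total_lines=total_lines,
        is_truncated=num_lines < total_lines,  # assumed definition
    )


# Made-up payload for illustration only.
payload = {
    "file": {
        "filePath": "example.py",
        "content": "x = 1\ny = 2\n",
        "startLine": 775,
        "numLines": 2,
        "totalLines": 1000,
    }
}
result = parse_structured_read(payload)
```

Returning `None` for a missing or malformed payload keeps the decision to use the cat-n text fallback in the caller.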

No changes to format_read_output / render_file_content_collapsible — the existing pygments machinery (highlight_code_with_pygments with extension-based lexer detection + linenostart=output.start_line) already does the right thing once ReadOutput.content is clean and start_line is correct.

Test plan

12 new regression tests in test/test_read_tool_pygments.py on a fixture built from the exact tool_use + tool_result pair in Fix Read tool rendering #170:
- Both parser paths (structured toolUseResult.file and cat-n text fallback)
- Tab separator (Read, Claude Code 2.1.x+) and arrow separator (Edit/Write — must not regress)
- HTML rendering with pygments python lexer, correct linenostart, file path in collapsible heading
- Edge cases: unknown extension → TextLexer fallback, single-line read, empty file, missing file_path
just ci clean (1998 tests / 0 ruff / 0 pyright / ty clean)
Snapshot tests pass without update — no existing snapshot fixture contained a Read tool result, so the parser change has no observable effect there
Manual sanity check: render a real transcript containing Read calls (e.g. the session screenshot from Fix Read tool rendering #170) and confirm syntax highlighting + correct starting line numbers appear

Note on PR #169 timing

If PR #169 (plugin system implementation) merges before this lands, the Read renderer may need to relocate under the new plugin system. The fix here is small and self-contained — relocating is mechanical. Happy to rebase as needed.

Summary by CodeRabbit

New Features
- Preserve structured file metadata for read results so parsed outputs retain exact file paths and line counts.
Bug Fixes / Improvements
- Accept multiple transcript line-separator formats (arrow and tab) for wider compatibility.
- Fix displayed line-range rendering to use inclusive start/end semantics and correct single-line wording.
Tests
- Add end-to-end tests covering structured parsing, fallback parsing, rendering, and edge cases (empty/single-line reads).

coderabbitai · 2026-05-25T08:29:59Z

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 4e42fd0 and fa30a60.

📒 Files selected for processing (2)

claude_code_log/html/renderer.py
test/test_read_tool_pygments.py

📝 Walkthrough

The PR enhances Read tool parsing to handle structured toolUseResult.file metadata while maintaining backward compatibility with legacy cat-n text format. The parser recognizes both tab and arrow separators, is registered to receive structured payloads, and is validated end-to-end with tests covering parsing paths, rendering, and edge cases.

Changes

Read tool structured parsing and rendering

Layer / File(s)	Summary
Parser registration and cat-n separator support `claude_code_log/factories/tool_factory.py`, `test/test_read_tool_pygments.py`	Regex updated to accept both `\t` and `→` separators when parsing cat-n snippet lines. `Read` parser added to `PARSERS_WITH_TOOL_USE_RESULT` registry so `create_tool_output` passes `tool_use_result` into `parse_read_output`. Registration test confirms wiring.
Read output parsing with structured and fallback paths `claude_code_log/factories/tool_factory.py`, `test/test_data/read_tool_pygments.jsonl`, `test/test_read_tool_pygments.py`	`parse_read_output` now accepts optional `tool_use_result`, extracts metadata and content from `toolUseResult.file` when present (filePath, startLine, numLines, totalLines, is_truncated), and falls back to cat-n regex parsing for older transcripts. Test fixture data and parsing tests cover structured preferred path, minimal structured fields, fallback with both separator variants, non-cat-n content rejection, and missing file_path handling.
Rendering and edge-case validation `claude_code_log/html/renderer.py`, `test/test_read_tool_pygments.py`	Title rendering for Read inputs now formats inclusive 1-based line ranges. End-to-end HTML rendering tests verify Python lexer highlighting, correct starting line presence, file path surfacing, and absence of leaked raw cat-n `"<num>\t"` prefixes. Tests also cover unknown-extension fallback, single-line reads, empty-file behavior (`numLines == 0`), and missing numLines fallback using `splitlines()`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly matches the main objective: implementing pygments rendering for Read tool results via structured payload, as confirmed by issue `#170`.
Linked Issues check	✅ Passed	All requirements from `#170` are implemented: parsing structured toolUseResult.file for clean content and metadata, supporting both tab and arrow separators in cat-n fallback, adding Read to PARSERS_WITH_TOOL_USE_RESULT, and enabling pygments rendering with correct line numbers and lexer detection.
Out of Scope Changes check	✅ Passed	All changes are directly related to `#170`: parser updates for structured payloads, cat-n regex expansion, registry adjustment, title rendering logic, test data, and comprehensive test coverage. No unrelated modifications detected.

The Read tool's `tool_result.content` is `cat -n` formatted (each line prefixed with `<line_number><TAB>`), and the existing parser's regex only matched the arrow variant (`<num>→<content>`, used by Edit/Write result snippets). Read entries fell through to the generic ToolResultContent fallback, rendered as raw monospace text with no syntax highlighting, no lexer detection, no line-number alignment.

Two-pronged fix in `parse_read_output`:

1. Preferred path: consume `toolUseResult.file` directly when present. That field carries byte-clean content (no `<num>\t` prefix) plus accurate `filePath`, `startLine`, `numLines`, `totalLines` metadata. Avoids the lossy cat-n round-trip and is the only path that knows `totalLines` (needed for the truncation flag).
2. Fallback path: extend `_parse_cat_n_snippet`'s regex from `\s+(\d+)→(.*)$` to `\s*(\d+)[\t→](.*)$` so it accepts both separator variants. Read on older transcripts (no `toolUseResult`) still parses correctly; Edit/Write arrow form is preserved.

Register "Read" in `PARSERS_WITH_TOOL_USE_RESULT` so the factory passes the structured payload through.

No changes to `format_read_output` / `render_file_content_collapsible`: the existing pygments machinery (`highlight_code_with_pygments` + `linenostart=output.start_line`) already does the right thing once `ReadOutput.content` is clean and `start_line` is correct.

12 new regression tests on `read_tool_pygments.jsonl` (built from the exact tool_use + tool_result pair in issue #170) cover both parser paths, lexer detection, line-number alignment, and edge cases (unknown extension, single-line, empty file, missing file_path).

The ``int(file_info.get("numLines") or default)`` shortcut in the structured-payload branch silently promoted ``numLines == 0`` (an empty-file read) to the absent-fallback, which evaluated to ``len("".split("\n")) == 1``, rendering an empty file as 1 line. Same hazard on ``totalLines == 0`` and (latent) ``startLine == 0``. Distinguish *absent* from *zero* explicitly via ``is not None``.

Also switch the absent-numLines fallback from ``split("\n")`` to ``splitlines()`` so content ending in ``\n`` (most file content) does not tack on a phantom trailing element (``"x\ny\n".splitlines()`` → ``["x", "y"]``, length 2 not 3).

Two new regression tests:
- ``test_empty_file`` extended to assert ``num_lines == 0`` / ``total_lines == 0`` (previously passed for the wrong reason: ``is_truncated is False`` because both got promoted to 1).
- ``test_absent_numlines_uses_splitlines`` exercises the absent-numLines fallback on content with a trailing newline.
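The two hazards described in this commit message can be reproduced with plain Python. The helper names below are hypothetical and only mirror the expressions quoted above; they are not the functions in `tool_factory.py`.

```python
def count_with_or(num_lines, content):
    """Buggy shortcut: a legitimate 0 is falsy, so it falls through to the fallback."""
    return int(num_lines or len(content.split("\n")))


def count_explicit(num_lines, content):
    """Treat only None as absent, and count lines without a phantom trailing element."""
    if num_lines is not None:
        return int(num_lines)
    return len(content.splitlines())


# Empty file reported as numLines == 0.
print(count_with_or(0, ""))    # 1, because "".split("\n") is [""]
print(count_explicit(0, ""))   # 0

# numLines absent, content ends with a newline.
print(count_with_or(None, "x\ny\n"))   # 3, because split leaves a trailing ""
print(count_explicit(None, "x\ny\n"))  # 2
```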

User manually tested PR #172 and surfaced a pre-existing off-by-one in the Read tool's HTML title: with input ``offset=775, limit=20`` the title rendered "lines 776-795" while the actual content (correctly) showed lines 775-794.

Two compounding off-by-ones in the title generation:
- The start was shifted by ``+1`` (``f"line {offset + 1}"``) on the assumption that ``offset`` was 0-based. It is not: ``offset`` is the 1-based starting line number, matching what ``toolUseResult.file.startLine`` reports and what the rendered cat-n content shows.
- The end was computed as ``offset + limit`` (exclusive) when it should have been ``offset + limit - 1`` (inclusive, since both endpoints are shown in the rendered content).

Net effect: title now agrees with the rendered content's line numbering.

The ``offset=0 / None`` case (read from start of file) is preserved by treating any falsy ``offset`` as a display value of 1, which matches the actual behaviour of the Read tool when offset is absent (content starts at line 1).

Markdown side has no line-range in its Read title (only filename), so no parallel fix needed.

Regression test on the existing fixture asserts the title contains ``"lines 775-794"`` and not ``"lines 776-795"``.
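The corrected arithmetic can be sketched as follows. The function names are hypothetical, and the single-line wording ("line N") is an assumption; only the "lines 775-794" form is confirmed by the text above.

```python
from typing import Optional


def inclusive_range(offset: Optional[int], limit: int) -> tuple[int, int]:
    """Return the 1-based inclusive (start, end) line range for a Read call."""
    start = offset if offset else 1  # falsy offset (0 or None) displays as line 1
    end = start + limit - 1          # inclusive end, not start + limit
    return start, end


def range_label(offset: Optional[int], limit: int) -> str:
    """Format the range for a title; the singular wording is an assumed convention."""
    start, end = inclusive_range(offset, limit)
    if start == end:
        return f"line {start}"
    return f"lines {start}-{end}"


print(range_label(775, 20))   # lines 775-794
print(range_label(None, 10))  # lines 1-10
```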

cboos force-pushed the dev/issue-170-read-tool-pygments branch from beda168 to 2ae70c1 on May 25, 2026 08:41

cboos marked this pull request as ready for review May 25, 2026 08:48

cboos merged commit 4ff8e96 into main May 25, 2026
11 checks passed

cboos merged 3 commits into main from dev/issue-170-read-tool-pygments
