tee: don't treat a short read as end-of-file#11755
Closed
kevinburke wants to merge 1 commit into uutils:main from
Conversation
Commit 13fb3be ("tee: increase buf size for large input") split copy() into a one-shot first read followed by a loop over a larger buffer. The first branch returned early if `read(2)` returned fewer bytes than FIRST_BUF_SIZE:

```rust
if bytes_count < FIRST_BUF_SIZE {
    output.flush()?;
    return Ok(len);
}
```

A short read is not EOF. POSIX `read(2)` only returns 0 at end-of-file; a return of N < buflen just means "that's what was available without blocking". Any pipeline where the upstream writer pauses between writes — a slow producer, a `sleep` in a shell pipeline, a systemd service emitting log lines in bursts — will see a short read on the very first call and will be cut off after one chunk.

Downstream this manifests as the writer dying with SIGPIPE (exit 141) on its next write, because tee has already closed its end of the pipe. We hit this with a Go program writing structured logs through `2>&1 | tee -a /var/log/foo.log` in a systemd oneshot: the first slog line made it through, tee exited, and the next write killed the Go process before it could do any work.

Keep the two-buffer optimization — it's useful for large inputs — but gate the upgrade to the larger buffer on actually seeing a full-sized read, and keep looping on the small buffer until we do. Only `read == 0` terminates the loop.

Add a regression test that writes two chunks separated by a delay, drains the first one from tee's stdout so we know tee has processed its first read/write cycle, then writes the second chunk and asserts it makes it through to both stdout and the output file. Without this fix the second `write_in` fails with "Broken pipe (os error 32)".
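A minimal sketch of the fixed loop (buffer sizes, names, and signatures here are illustrative, not the exact tee.rs code):

```rust
use std::io::{Read, Result, Write};

// Illustrative sizes; the real constants in tee.rs may differ.
const FIRST_BUF_SIZE: usize = 8 * 1024;
const SECOND_BUF_SIZE: usize = 64 * 1024;

/// Copy `input` to `output`. Stay on the small buffer until a
/// full-sized read suggests a large input; only `read() == 0`
/// (true EOF) ends the copy.
fn copy<R: Read, W: Write>(mut input: R, mut output: W) -> Result<usize> {
    let mut len = 0;
    let mut small = [0u8; FIRST_BUF_SIZE];
    loop {
        let n = input.read(&mut small)?;
        if n == 0 {
            // read(2) returning 0 is the only EOF signal.
            output.flush()?;
            return Ok(len);
        }
        output.write_all(&small[..n])?;
        len += n;
        if n == FIRST_BUF_SIZE {
            break; // full read: upgrade to the larger buffer
        }
        // n < FIRST_BUF_SIZE is a short read, NOT EOF: keep looping.
    }
    let mut big = vec![0u8; SECOND_BUF_SIZE];
    loop {
        let n = input.read(&mut big)?;
        if n == 0 {
            output.flush()?;
            return Ok(len);
        }
        output.write_all(&big[..n])?;
        len += n;
    }
}
```

Note that `Read::chain` over two small slices simulates exactly the bursty-pipe case: each slice boundary yields a short read, and the copy must survive both.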
oech3
reviewed
Apr 11, 2026
```rust
output.flush()?;
return Ok(len);
len += bytes_count;
if bytes_count == FIRST_BUF_SIZE {
```
Contributor
I think `==` is too strict: it makes it difficult to switch to the code path for large files.
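The concern can be illustrated with a reader that never fills the caller's buffer completely (e.g. a pipe whose writer emits bursts just under the buffer size): under the strict `n == FIRST_BUF_SIZE` gate such an input copies correctly but never reaches the large-buffer path. A hypothetical sketch (`Bursty` and the constant are made up for illustration):

```rust
use std::io::{Read, Result};

const FIRST_BUF_SIZE: usize = 8 * 1024; // illustrative

/// A reader that caps every read below the caller's buffer size,
/// like a pipe whose writer emits bursts smaller than the buffer.
struct Bursty<R> {
    inner: R,
    cap: usize,
}

impl<R: Read> Read for Bursty<R> {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize> {
        let n = buf.len().min(self.cap);
        self.inner.read(&mut buf[..n])
    }
}
```

With this reader, `n == FIRST_BUF_SIZE` never fires no matter how large the input is, which is the reviewer's point about the strict equality check.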
GNU testsuite comparison:
Contributor
Author
This is also fixed by #11686, closing.
Contributor
Merged. You can open a PR for a pure-Rust test for more targets.
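A pure-Rust version of the regression test could be sketched like this. It is a sketch against a system `tee` on PATH (the real test would target the uutils build); the log path is illustrative, and the actual test drains tee's stdout between writes rather than relying on a sleep plus pipe buffering as done here:

```rust
use std::io::Write;
use std::process::{Command, Stdio};
use std::thread;
use std::time::Duration;

/// Drive `tee` with two writes separated by a pause, then return
/// everything tee echoed to stdout.
fn two_chunk_tee(log_path: &str) -> Vec<u8> {
    let mut child = Command::new("tee")
        .arg(log_path)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .expect("spawn tee");
    let mut stdin = child.stdin.take().unwrap();
    stdin.write_all(b"first\n").unwrap();
    // The pause makes tee's first read(2) a short read.
    thread::sleep(Duration::from_millis(300));
    // A tee that treats a short read as EOF has already closed the
    // pipe, so this write fails with "Broken pipe (os error 32)".
    stdin.write_all(b"second\n").unwrap();
    drop(stdin); // close the pipe so tee sees real EOF
    let out = child.wait_with_output().expect("wait tee");
    out.stdout
}
```

A correct tee passes both chunks through to stdout and to the log file; the buggy one exits after the first chunk and the second write dies with EPIPE.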