fix(arrow-ipc): bound MessageReader allocations by actual stream bytes#9869
masumi-ryugo wants to merge 1 commit into apache:main from fix/arrow-ipc-body-length-bounded
Conversation
`MessageReader::maybe_next` decoded the on-wire `meta_len` (after the existing check that rejects negative values) and the FlatBuffer message's `bodyLength`, and used both directly for up-front allocations:

```rust
self.buf.resize(meta_len, 0); // <-- attacker-controlled
let mut buf = MutableBuffer::from_len_zeroed(message.bodyLength() as usize);
```
A 4-byte input — `00 1b 00 48` — claims a ~1.2 GiB metadata payload via
`meta_len = i32::from_le_bytes(...) = 0x4800_1b00`, driving a 1.2 GiB
`Vec::resize` before any short-read could fail. That is roughly a 3×10^8 amplification factor from input bytes to allocation, enough to OOM-kill the process on a 2 GiB-RSS-limited fuzzer.
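The amplification can be seen in isolation with a std-only sketch (no arrow-ipc code involved; `decoded_meta_len` is just an illustrative helper):

```rust
/// Illustrative helper: the stream prefix is interpreted as a
/// little-endian i32 metadata length -- exactly 4 untrusted bytes.
fn decoded_meta_len(prefix: [u8; 4]) -> i32 {
    i32::from_le_bytes(prefix)
}

fn main() {
    // The fuzzer repro bytes from above.
    let meta_len = decoded_meta_len([0x00, 0x1b, 0x00, 0x48]);
    assert_eq!(meta_len, 0x4800_1b00); // 1_207_966_464 bytes, ~1.2 GB
    // 4 input bytes driving a ~1.2 GB allocation is an amplification
    // factor of roughly 3e8.
    println!("claimed metadata length: {meta_len}");
}
```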
Read both metadata and body via incremental reads tied to the bytes
actually delivered by the underlying `Read`:
* metadata uses `take(meta_len).read_to_end(&mut self.buf)` followed
by an explicit length check;
* body is filled by a new `read_body_into_buffer` helper that
`extend_from_slice`s 64 KiB chunks into a `MutableBuffer`,
preserving the cache-line-aligned allocation that downstream
Arrow consumers rely on while keeping the high-water-mark
proportional to the bytes actually received.
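The two incremental reads above can be sketched with std types and hypothetical helper names (the real PR works against `MessageReader`'s internal buffer and Arrow's `MutableBuffer`; a plain `Vec<u8>` stands in here):

```rust
use std::io::{Error, ErrorKind, Read, Result};

const CHUNK: usize = 64 * 1024;

/// Metadata half (sketch): grow the buffer only as bytes arrive, then
/// check that the declared length was actually delivered.
fn read_metadata_bounded<R: Read>(reader: &mut R, meta_len: usize, buf: &mut Vec<u8>) -> Result<()> {
    buf.clear();
    let read = reader.take(meta_len as u64).read_to_end(buf)?;
    if read != meta_len {
        return Err(Error::new(
            ErrorKind::UnexpectedEof,
            format!("Unexpected EOF reading {meta_len} bytes of message metadata, got {read}"),
        ));
    }
    Ok(())
}

/// Body half (sketch): extend the output in at-most-64 KiB steps so the
/// high-water mark tracks bytes received, not the declared body length.
fn read_body_chunked<R: Read>(reader: &mut R, body_len: usize) -> Result<Vec<u8>> {
    let mut out = Vec::new(); // stands in for MutableBuffer
    let mut scratch = [0u8; CHUNK];
    let mut remaining = body_len;
    while remaining > 0 {
        let want = remaining.min(CHUNK);
        let got = reader.read(&mut scratch[..want])?;
        if got == 0 {
            return Err(Error::new(
                ErrorKind::UnexpectedEof,
                format!("Unexpected EOF with {remaining} body bytes remaining"),
            ));
        }
        out.extend_from_slice(&scratch[..got]);
        remaining -= got;
    }
    Ok(out)
}
```

In both helpers a truncated stream fails after allocating at most what was actually received (plus one 64 KiB scratch chunk), no matter how large the declared lengths are.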
Add `bodyLength` validation (`usize::try_from`) so a negative i64 is
surfaced as a `ParseError` instead of wrapping into a huge `usize`.
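The check amounts to the following (illustrative helper name; the real code surfaces this as an `ArrowError::ParseError`):

```rust
/// Sketch of the bodyLength validation: a negative (or, on 32-bit,
/// out-of-range) i64 becomes an error instead of wrapping into a huge usize.
fn checked_body_length(body_length: i64) -> Result<usize, String> {
    usize::try_from(body_length)
        .map_err(|_| format!("Invalid bodyLength in IPC message: {body_length}"))
}
```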
Add a regression test (`test_stream_reader_huge_meta_len_does_not_oom`)
that feeds the 4-byte fuzzer repro through `StreamReader::try_new` and
asserts a clean `Err`.
Found via a cargo-fuzz libFuzzer harness wrapping `StreamReader::try_new`.
Independent confirmation from a fresh cargo-fuzz harness on this same code path — flagging this here because it's exactly the kind of evidence #5332 was set up to produce. Setup: I added a fresh harness and ran it against both revisions.
Pre-fix (current code):
Post-fix (this PR's code):
So this PR cleanly defuses the entire class of "single u32 in the header drives a multi-GB allocation".
run benchmark ipc_reader ipc_writer
🤖 Arrow criterion benchmark running (GKE), comparing fix/arrow-ipc-body-length-bounded (c36a092) to fd86c75 (merge-base).
🤖 Arrow criterion benchmark completed (GKE); resource usage compared for base (merge-base) and branch.
This implementation seems to be quite a bit slower
Which issue does this PR close?
N/A — found via a cargo-fuzz libFuzzer harness over `StreamReader::try_new`. Happy to file a tracking issue first if maintainers prefer.
Rationale for this change
MessageReader::maybe_nextdecodes the on-wiremeta_len(after theexisting check that rejects negative values) and the FlatBuffer message's
bodyLength, and uses both directly for up-front allocations:A 4-byte input —
00 1b 00 48— claims a ~1.2 GiB metadata payload viameta_len = i32::from_le_bytes(...) = 0x4800_1b00, driving a 1.2 GiBVec::resizebefore any short-read could fail. Roughly a 3×10⁸amplification factor from input bytes to allocation; OOM-kills the
process on a 2 GiB-rss-limited fuzzer or a memory-constrained service.
Per SECURITY.md this is a bug, not a vulnerability (no RCE, no information disclosure — only availability), but it is reachable from the public `StreamReader::try_new` entry point and is the same shape of bug that the recent `meta_len`-negative fix addressed.
What changes are included in this PR?
* Metadata is read via `(&mut self.reader).take(meta_len).read_to_end(&mut self.buf)` followed by an explicit length check, so the buffer grows as bytes actually arrive instead of being eagerly resized to the (untrusted) declared length.
* A new `read_body_into_buffer` helper fills a `MutableBuffer` in 64 KiB chunks via `extend_from_slice`. This preserves the cache-line-aligned allocation that downstream Arrow consumers rely on, while keeping the high-water-mark allocation proportional to the bytes actually delivered by the underlying reader.
* `bodyLength` (i64) is validated via `usize::try_from`, surfacing a negative or out-of-range value as a `ParseError` instead of wrapping silently into a huge `usize` on 64-bit and a different huge `usize` on 32-bit.
* A regression test, `test_stream_reader_huge_meta_len_does_not_oom`, runs the 4-byte fuzzer repro through `StreamReader::try_new` and asserts a clean `Err`.
Are these changes tested?
Yes — new unit test as above. `cargo test -p arrow-ipc --release` (112 unit + 17 integration tests), `cargo clippy -p arrow-ipc --all-targets -- -D warnings`, and `cargo fmt --check` are all clean.
Are there any user-facing changes?
Yes — malformed IPC streams that previously triggered a multi-GB allocation now return an `ArrowError::ParseError` early. There is no behavior change for well-formed streams; allocation is still cache-line aligned and the final buffer shape (`MutableBuffer`) is unchanged.
Reproducer
4 bytes: `00 1b 00 48`
Before this PR: a ~1.2 GiB allocation completes (or OOM-kills the process under a memory limit) before `read_exact` discovers there are 0 bytes left and returns an EOF error.
After this PR: `Err(ArrowError::ParseError("Unexpected EOF reading 1207975168 bytes of message metadata, got 0"))`, with peak allocation on the order of the 64 KiB read chunk plus a small flatbuffer scratch.
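The shape of the before/after difference can be simulated with std types alone (a sketch under the assumption that the stream ends right after the 4-byte length prefix; this is not the actual arrow-ipc code):

```rust
use std::io::{Cursor, Read};

fn main() {
    // Fuzzer repro: a 4-byte length prefix, then EOF.
    let prefix: [u8; 4] = [0x00, 0x1b, 0x00, 0x48];
    let meta_len = i32::from_le_bytes(prefix) as u64; // ~1.2 GB claimed

    // Pre-fix shape (kept commented out -- this is the eager allocation):
    // let mut buf = vec![0u8; meta_len as usize];

    // Post-fix shape: the allocation tracks what the stream delivers.
    let mut buf = Vec::new();
    let got = Cursor::new(Vec::<u8>::new())
        .take(meta_len)
        .read_to_end(&mut buf)
        .unwrap();
    assert_eq!(got, 0);             // none of the claimed bytes ever arrived
    assert!(buf.capacity() < 4096); // peak allocation stayed tiny
}
```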
Found via a cargo-fuzz libFuzzer harness wrapping `StreamReader::try_new`.