fix stale output capacity in Lz4BlockDecompressor decompress #64091
Open
sahvx655-wq wants to merge 1 commit into
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Reading the LZ4 block path in decompressor.cpp: remaining_output_len is computed once per large block and then passed to LZ4_decompress_safe as the destination capacity for every small block inside it. output_ptr advances by each small block's decompressed length, but the capacity is never reduced to match, so from the second small block onward LZ4_decompress_safe is told that the full large-block space is available starting at an already advanced pointer. A crafted lz4block stream (for instance in a CSV load) can then write past the end of the line reader's output buffer, which is a heap out-of-bounds write.
The fix passes the true remaining capacity, measured from the current output_ptr: output_max_len - (output_ptr - output).
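The capacity calculation can be illustrated with a minimal, self-contained sketch. This is not the Doris code: remaining_capacity, fake_decompress and decode_small_blocks are hypothetical names introduced here, and fake_decompress is only a stand-in that copies bytes and mimics the way LZ4_decompress_safe returns a negative value when the destination capacity is too small. Only the variable names output, output_ptr and output_max_len come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Capacity left in the output buffer, measured from the current write
// position: output_max_len - (output_ptr - output).
inline std::size_t remaining_capacity(const char* output, const char* output_ptr,
                                      std::size_t output_max_len) {
    assert(output_ptr >= output);
    std::size_t used = static_cast<std::size_t>(output_ptr - output);
    assert(used <= output_max_len);
    return output_max_len - used;
}

// Hypothetical stand-in for LZ4_decompress_safe: it "decompresses" by copying
// and returns the bytes written, or -1 if dst_capacity is too small.
inline int fake_decompress(const char* src, char* dst, int src_len, int dst_capacity) {
    if (src_len > dst_capacity) return -1;
    std::memcpy(dst, src, static_cast<std::size_t>(src_len));
    return src_len;
}

// Decodes each small block in turn. The capacity is recomputed on every
// iteration, so a block that does not fit yields an error, not an overflow.
inline bool decode_small_blocks(const std::vector<std::vector<char>>& blocks,
                                char* output, std::size_t output_max_len,
                                std::size_t* written) {
    char* output_ptr = output;
    for (const auto& block : blocks) {
        std::size_t cap = remaining_capacity(output, output_ptr, output_max_len);
        int n = fake_decompress(block.data(), output_ptr,
                                static_cast<int>(block.size()),
                                static_cast<int>(cap));
        if (n < 0) return false;  // caller would report an invalid-argument error
        output_ptr += n;
    }
    *written = static_cast<std::size_t>(output_ptr - output);
    return true;
}
```

With an 8-byte buffer, two 3-byte blocks succeed, while two 5-byte blocks fail on the second one because only 3 bytes remain. Had the capacity stayed at 8 for the second block, the copy would have run 2 bytes past the end of the buffer.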
Problem fixed: heap out-of-bounds write in Lz4BlockDecompressor::decompress. The inner small-block loop passed a stale dstCapacity (remaining_output_len, fixed per large block) to LZ4_decompress_safe while output_ptr kept advancing, so later small blocks could be decompressed past the end of the output buffer. Fixed by computing the capacity relative to the current output_ptr on each iteration.
Behaviour modified: before, the second and later small blocks within a large block were given the full large-block capacity even though output_ptr had already moved forward, which overstated the available space and allowed an overflow. Now the capacity tracks output_ptr, so a small block that does not fit makes LZ4_decompress_safe return an error, and decompress returns InvalidArgument instead of overflowing. The impact is limited to this bounds check; well-formed streams that already fit are unaffected.
No new feature.
No refactor. The change is a single argument to LZ4_decompress_safe plus a comment.
No optimisation.