Fix COVER tiny training set underflow #4693

Open
Rohithmatham12 wants to merge 1 commit into facebook:dev from Rohithmatham12:fix-cover-tiny-training-set

@Rohithmatham12

Fixes #4682.

COVER_ctx_init() validates totalSamplesSize before computing suffixSize, but the subtraction uses trainingSamplesSize. With splitPoint < 1.0, a small training split can be smaller than MAX(d, sizeof(U64)) even when the total sample size passes validation. That lets the unsigned subtraction underflow and can lead to an oversized allocation.
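To make the arithmetic concrete, here is a minimal standalone sketch, not the zstd source. The helper names `cover_min_size` and `suffix_size_unchecked` are hypothetical, and the example values assume d = 8 and a training split of five one-byte samples taken from ten:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MAX(d, sizeof(U64)) in the description above. */
size_t cover_min_size(unsigned d) {
    return d > sizeof(uint64_t) ? (size_t)d : sizeof(uint64_t);
}

/* Mirrors the shape of the unguarded computation: size_t is unsigned, so
 * when trainingSamplesSize is smaller than the minimum the subtraction
 * wraps around instead of producing a negative value. */
size_t suffix_size_unchecked(size_t trainingSamplesSize, unsigned d) {
    return trainingSamplesSize - cover_min_size(d) + 1;
}
```

With a total of 10 bytes the check against a minimum of 8 passes, but `suffix_size_unchecked(5, 8)` evaluates 5 - 8 + 1 in unsigned arithmetic and yields `SIZE_MAX - 1`, which is what turns the later allocation request into an enormous one.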

This PR adds the missing trainingSamplesSize precondition before computing the partial suffix array size, and adds a regression test that calls ZDICT_optimizeTrainFromBuffer_cover() with ten one-byte samples, auto-selected d/k, and splitPoint = 0.5. The API now returns srcSize_wrong instead of attempting the underflowing allocation.
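The shape of the added precondition can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `checked_suffix_size`, `min_window_size`, and the `-1` return value are hypothetical stand-ins for the real code path, which reports srcSize_wrong.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical equivalent of MAX(d, sizeof(U64)). */
size_t min_window_size(unsigned d) {
    return d > sizeof(uint64_t) ? (size_t)d : sizeof(uint64_t);
}

/* Returns 0 and stores the suffix size on success.
 * Returns -1 (standing in for srcSize_wrong) when the training split is
 * too small, so the unsigned subtraction is never evaluated on bad input. */
int checked_suffix_size(size_t trainingSamplesSize, unsigned d, size_t *out) {
    size_t const minSize = min_window_size(d);
    if (trainingSamplesSize < minSize) {
        return -1;
    }
    *out = trainingSamplesSize - minSize + 1;
    return 0;
}
```

Checking the operand that is actually subtracted, rather than only the total, means the guard and the arithmetic can no longer disagree when splitPoint < 1.0.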

Test plan:

  • make -j4 fuzzer
  • ./fuzzer -i1 -s1 -v
  • git diff --check

@meta-cla meta-cla Bot added the CLA Signed label Jun 10, 2026

Development

Successfully merging this pull request may close these issues.

size_t underflow in COVER_ctx_init when ZDICT_optimizeTrainFromBuffer_cover invoked with params.d=0 leads to multi-EB malloc
