Critical: fix stack overflow from unbounded sprintf()#2410
Critical: fix stack overflow from unbounded sprintf()#2410sfc-gh-dachristensen wants to merge 1 commit intoapache:masterfrom
Conversation
The %f format specifier for doubles can produce over 300 characters (e.g., DBL_MAX formatted with %f). The 64-byte stack buffer is insufficient. This is a stack buffer overflow exploitable via crafted agtype float values. Fix this by both bumping the buffer size (to allow full precision) and changing all sprintf() call sites to be snprintf() calls instead. Signed-off-by: David Christensen <david.christensen@snowflake.com>
|
@sfc-gh-dachristensen Please add the following -
|
There was a problem hiding this comment.
Pull request overview
This PR addresses a security issue in agtype list conversion functions by removing unbounded string formatting on stack buffers when converting numeric values, preventing potential stack buffer overflows from crafted float inputs.
Changes:
- Increased the stack buffer size used during
toFloatList()float formatting. - Replaced
sprintf()withsnprintf()intoFloatList()andtoStringList()conversions.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| float_num = elem->val.float_value; | ||
| sprintf(buffer, "%f", float_num); | ||
| snprintf(buffer, sizeof(buffer), "%f", float_num); | ||
| string = buffer; | ||
| float_elem.val.float_value = float8in_internal_null(string, NULL, "double precision", string, &is_valid); |
There was a problem hiding this comment.
The PR description mentions increasing the buffer to "allow full precision", but this code still formats with "%f" (default precision 6), which both loses precision and expands large values (e.g., DBL_MAX) into hundreds of digits. Consider avoiding fixed-point formatting here by using Postgres' float8out output (or a "%.*g" with DBL_DECIMAL_DIG) so the conversion is round-trippable without requiring such a large stack buffer.
| bool is_valid = false; | ||
| float8 float_num; | ||
| char buffer[64]; | ||
| char buffer[400]; |
There was a problem hiding this comment.
char buffer[400]; is a new magic number on the stack. Since the required size is derived from DBL_MAX_10_EXP and the chosen formatting/precision, consider defining a named constant (or computing the bound) so it's clear why 400 is sufficient and future format changes don't accidentally reintroduce truncation/overflow risks.
| case AGTV_INTEGER: | ||
|
|
||
| sprintf(buffer, "%ld", elem->val.int_value); | ||
| snprintf(buffer, sizeof(buffer), "%ld", elem->val.int_value); |
There was a problem hiding this comment.
elem->val.int_value is an int64 (see struct agtype_value), but the format string uses %ld, which is not portable (e.g., long is 32-bit on some platforms) and can lead to incorrect output/UB. Use INT64_FORMAT (or pg_lltoa / int8out) when formatting int64 values.
| snprintf(buffer, sizeof(buffer), "%ld", elem->val.int_value); | |
| snprintf(buffer, sizeof(buffer), INT64_FORMAT, elem->val.int_value); |
| snprintf(buffer, sizeof(buffer), "%f", float_num); | ||
| string = buffer; |
There was a problem hiding this comment.
This change fixes a security-sensitive crash vector (very large float formatted via %f). There are existing regression tests for toFloatList() in regress/sql/expr.sql, but none appear to cover extremely large float values (e.g., something near DBL_MAX). Adding a regression case that exercises toFloatList([<very large float>]) would help prevent reintroducing the overflow/truncation issue.
The %f format specifier for doubles can produce over 300 characters (e.g., DBL_MAX formatted with %f). The 64-byte stack buffer is insufficient. This is a stack buffer overflow exploitable via crafted agtype float values.
Fix this by both bumping the buffer size (to allow full precision) and changing all sprintf() call sites to be snprintf() calls instead.