Skip to content

Critical: fix stack overflow from unbounded sprintf()#2410

Open
sfc-gh-dachristensen wants to merge 1 commit intoapache:masterfrom
sfc-gh-dachristensen:fix/critical-stack-overflow-sprintf
Open

Critical: fix stack overflow from unbounded sprintf()#2410
sfc-gh-dachristensen wants to merge 1 commit intoapache:masterfrom
sfc-gh-dachristensen:fix/critical-stack-overflow-sprintf

Conversation

@sfc-gh-dachristensen
Copy link
Copy Markdown

The %f format specifier for doubles can produce over 300 characters (e.g., DBL_MAX formatted with %f). The 64-byte stack buffer is insufficient. This is a stack buffer overflow exploitable via crafted agtype float values.

Fix this by both bumping the buffer size (to allow full precision) and changing all sprintf() call sites to be snprintf() calls instead.

The %f format specifier for doubles can produce over 300 characters (e.g.,
DBL_MAX formatted with %f). The 64-byte stack buffer is insufficient. This is a
stack buffer overflow exploitable via crafted agtype float values.

Fix this by both bumping the buffer size (to allow full precision) and changing
all sprintf() call sites to be snprintf() calls instead.

Signed-off-by: David Christensen <david.christensen@snowflake.com>
@jrgemignani jrgemignani requested a review from Copilot April 27, 2026 21:09
@jrgemignani
Copy link
Copy Markdown
Contributor

@sfc-gh-dachristensen Please add the following -

  • Add a regression test - without this, a future refactor could reintroduce the bug.
  • The value 400 seems a bit arbitrary. Why not some constant or number that feels less like a magic number? Fx, 512.
  • Add a comment at the buffer declaration explaining the need.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a security issue in agtype list conversion functions by removing unbounded string formatting on stack buffers when converting numeric values, preventing potential stack buffer overflows from crafted float inputs.

Changes:

  • Increased the stack buffer size used during toFloatList() float formatting.
  • Replaced sprintf() with snprintf() in toFloatList() and toStringList() conversions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 6620 to 6623
float_num = elem->val.float_value;
sprintf(buffer, "%f", float_num);
snprintf(buffer, sizeof(buffer), "%f", float_num);
string = buffer;
float_elem.val.float_value = float8in_internal_null(string, NULL, "double precision", string, &is_valid);
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions increasing the buffer to "allow full precision", but this code still formats with "%f" (default precision 6), which both loses precision and expands large values (e.g., DBL_MAX) into hundreds of digits. Consider avoiding fixed-point formatting here by using Postgres' float8out output (or a "%.*g" with DBL_DECIMAL_DIG) so the conversion is round-trippable without requiring such a large stack buffer.

Copilot uses AI. Check for mistakes.
bool is_valid = false;
float8 float_num;
char buffer[64];
char buffer[400];
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

char buffer[400]; is a new magic number on the stack. Since the required size is derived from DBL_MAX_10_EXP and the chosen formatting/precision, consider defining a named constant (or computing the bound) so it's clear why 400 is sufficient and future format changes don't accidentally reintroduce truncation/overflow risks.

Copilot uses AI. Check for mistakes.
case AGTV_INTEGER:

sprintf(buffer, "%ld", elem->val.int_value);
snprintf(buffer, sizeof(buffer), "%ld", elem->val.int_value);
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elem->val.int_value is an int64 (see struct agtype_value), but the format string uses %ld, which is not portable (e.g., long is 32-bit on some platforms) and can lead to incorrect output/UB. Use INT64_FORMAT (or pg_lltoa / int8out) when formatting int64 values.

Suggested change
snprintf(buffer, sizeof(buffer), "%ld", elem->val.int_value);
snprintf(buffer, sizeof(buffer), INT64_FORMAT, elem->val.int_value);

Copilot uses AI. Check for mistakes.
Comment on lines +6621 to 6622
snprintf(buffer, sizeof(buffer), "%f", float_num);
string = buffer;
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change fixes a security-sensitive crash vector (very large float formatted via %f). There are existing regression tests for toFloatList() in regress/sql/expr.sql, but none appear to cover extremely large float values (e.g., something near DBL_MAX). Adding a regression case that exercises toFloatList([<very large float>]) would help prevent reintroducing the overflow/truncation issue.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants