Critical: fix inverted logic here #2407
sfc-gh-dachristensen wants to merge 1 commit into apache:master
strcmp(str, "") returns 0 (false) when str is empty, meaning the check is
inverted: it returns NULL when parsing succeeds and continues when parsing
fails. This allows non-numeric strings to pass through as array indices, leading
to type confusion and potentially incorrect memory access.
The strcmp logic handles most cases correctly (non-numeric strings return NULL,
valid integers pass through). However, the empty string "" is accepted as a
valid array index of 0: [10, 20, 30] #> '[""]' returns 10 instead of NULL. This
occurs because strtol("") sets lindex=0 and str="", so strcmp("", "") returns 0,
bypassing the error check.
Signed-off-by: David Christensen <david.christensen@snowflake.com>
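The behavior described above can be reproduced outside the extension. The following is a minimal sketch, not the code from agtype_ops.c: the helper name accepted_by_strcmp_check is hypothetical, and it only mirrors the strtol() plus strcmp() pattern from the description.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the check discussed in this PR.
 * Returns 1 if the key would be accepted as an array index (value stored
 * in *out), or 0 if the check would reject it.
 */
static int accepted_by_strcmp_check(const char *key, long *out)
{
    char *end = NULL;
    long value = strtol(key, &end, 10);

    /* A nonzero strcmp() means unparsed characters remain: reject. */
    if (strcmp(end, "") != 0)
        return 0;

    /* Reached for "12", but also for "": strtol() returned 0 and left
     * end pointing at the (empty) input, so strcmp() returned 0. */
    *out = value;
    return 1;
}
```

Under this sketch, "abc" and "12abc" are rejected and "12" is accepted with value 12, while "" is accepted with value 0, which is the case the description calls out.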
Pull request overview
This PR aims to fix agtype path array-index parsing so that an empty string ("") is not accepted as a valid array index (currently treated as 0), which can allow invalid/non-numeric path elements to be used as array subscripts.
Changes:
- Updates the strtol() parse validation check from if (strcmp(str, "")) to if (strcmp(str, "") != 0) within get_agtype_path_all().
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/backend/utils/adt/agtype_ops.c:2114
strtol() overflow is not checked via errno. On platforms where sizeof(long) == sizeof(int) (e.g., 32-bit builds), an out-of-range value can return LONG_MAX/LONG_MIN (equal to INT_MAX/INT_MIN) and pass the subsequent lindex > INT_MAX || lindex < INT_MIN guard. Consider following the pattern used in src/backend/utils/adt/graphid.c:75-78 by setting errno = 0 before strtol() and returning NULL when errno != 0.
        char* str = NULL;
        lindex = strtol(cur_key->val.string.val, &str, 10);
        if (strcmp(str, "") != 0)
        {
            PG_RETURN_NULL();
        }
    }
    else
    {
        PG_RETURN_NULL();
    }
    if (lindex > INT_MAX || lindex < INT_MIN)
    {
        PG_RETURN_NULL();
    }
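The errno suggestion in the suppressed comment could look roughly like the sketch below. It is an illustration only: parse_int_in_range is a hypothetical name, the contents of graphid.c are not reproduced here, and the sketch covers range handling alone (it does not validate the end pointer).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 and stores the parsed value in *out when
 * it fits in an int, 0 otherwise. End-pointer validation is deliberately
 * left out so that only the overflow handling is shown.
 */
static int parse_int_in_range(const char *s, int *out)
{
    char *end = NULL;
    long value;

    /* strtol() reports overflow only through errno, so clear it first. */
    errno = 0;
    value = strtol(s, &end, 10);

    /* ERANGE here means the result was clamped to LONG_MAX or LONG_MIN. */
    if (errno != 0)
        return 0;

    /* Still needed where long is wider than int. */
    if (value > INT_MAX || value < INT_MIN)
        return 0;

    *out = (int) value;
    return 1;
}
```

Checking errno catches the clamped result even when LONG_MAX equals INT_MAX, which is the case the range comparison alone cannot distinguish from a legitimate INT_MAX.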
    lindex = strtol(cur_key->val.string.val, &str, 10);

-   if (strcmp(str, ""))
+   if (strcmp(str, "") != 0)
    {
        PG_RETURN_NULL();
    }
The new condition strcmp(str, "") != 0 is logically equivalent to the previous if (strcmp(str, "")) and still does not reject the empty-string case described in the PR. When the input is "", strtol() leaves str pointing at the start of the string, so strcmp(str, "") is 0 and this check is bypassed, accepting index 0. To reject empty strings (and, more generally, to ensure at least one digit was consumed), return NULL when str == cur_key->val.string.val (endptr == start), in addition to requiring that endptr be at the string terminator.
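A minimal sketch of the suggested end-pointer check, assuming a standalone helper rather than the actual get_agtype_path_all() code (is_complete_integer is a hypothetical name):

```c
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 and stores the value in *out only when
 * the whole string was consumed as a base-10 integer; returns 0 otherwise.
 */
static int is_complete_integer(const char *s, long *out)
{
    char *end = NULL;
    long value = strtol(s, &end, 10);

    /* endptr == start: no digits were consumed ("" or "abc"). */
    if (end == s)
        return 0;

    /* endptr not at the terminator: trailing characters ("7x"). */
    if (*end != '\0')
        return 0;

    *out = value;
    return 1;
}
```

With both conditions in place, "" and "abc" are rejected because no conversion happened, and "7x" is rejected because of the trailing character. Note that strtol() skips leading whitespace, so a key such as " 5" would still be accepted by this sketch.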
    if (strcmp(str, "") != 0)
    {
        PG_RETURN_NULL();
    }
This change is fixing path/index parsing for the #> / #>> operators, and the repo already has regression coverage for these operators (e.g., regress/sql/jsonb_operators.sql section "Agtype path extraction operators"). Please add a regression test case that asserts ... #> '[""]' returns NULL (and ideally that a large out-of-range numeric string is rejected), to prevent this from regressing again.
@sfc-gh-dachristensen Please fix the above Copilot issues.