[CALCITE-7607] DML type coercion inserts implicit narrowing casts for target-column assignments#5041
[CALCITE-7607] DML type coercion inserts implicit narrowing casts for target-column assignments#5041zzwqqq wants to merge 1 commit into
Conversation
… target-column assignments
|
|
|
||
| # Create a source table | ||
| create table dept (deptno int not null, name varchar(10)); | ||
| create table dept (deptno int not null, name varchar(20)); |
There was a problem hiding this comment.
Could you provide a specific SQL test and compare the results? You can add relevant comments here.
There was a problem hiding this comment.
Thanks for taking a look. The old output depended on the implicit assignment cast to varchar(10), which truncated Engineering to Engineerin.
That truncation is not what this test is intended to cover, so I widened the column to keep the test focused on CREATE TABLE AS/MV behavior.
There was a problem hiding this comment.
But was the behavior incorrect?
What is the new behavior if you don't change the type?
There was a problem hiding this comment.
With either varchar(10) or varchar(20), the new output is Engineering, not Engineerin, because the implicit narrowing cast is no longer added.
I widened the column because I wanted to avoid implying that over-length inserts into varchar(10) are intended here
There was a problem hiding this comment.
What is the correct behavior if the type is varchar(10)?
There must be a spec or a reference that has to match in behavior.
There was a problem hiding this comment.
I tested this in PostgreSQL, MySQL and Oracle, and they all reject inserting an over-length value.
https://onecompiler.com/postgresql/44sf3dz68
https://onecompiler.com/mysql/44sf3efz4
https://onecompiler.com/oracle/44sf3ewmh
There was a problem hiding this comment.
That is good. But is there code in the handling of assignments in Calcite which will perform the rejection?
Is this in scope for this PR or does it need another issue?
This also suggests that casts should not be inserted, since they would transform illegal values into legal ones.
There was a problem hiding this comment.
I couldn’t find any Enumerable-side code that rejects over-length values. My current view is that adding that enforcement would be a separate issue from this PR.
There was a problem hiding this comment.
Can you please file an issue about this and perhaps even refer to it in your code if the results are currently wrong?
There was a problem hiding this comment.
Sure, I filed CALCITE-7627 to track the Enumerable-side case, and I can look into it separately after this pr.
The results are currently correct, because I widened the column.
| HepProgram program = new HepProgramBuilder() | ||
|
|
||
| // Simulate the way INSERT will insert casts to the target types | ||
| // Simulate the way INSERT can insert casts to the target types. |
There was a problem hiding this comment.
I don't understand why this comment needs to be changed.
There was a problem hiding this comment.
I changed it because CoerceInputsRule may now decide not to add a cast. I can revert this comment if you think it is better to avoid the unrelated diff.
| + "empid=[$t3], deptno=[$t4], name=[$t5], salary=[$t6], " | ||
| + "commission=[$t7])\n" | ||
| + " EnumerableValues(tuples=[[{ 'Fred', 56, 123.4000015258789E0 }]])\n"; | ||
| + " EnumerableValues(tuples=[[{ 'Fred', 56, 123.4 }]])\n"; |
There was a problem hiding this comment.
These variations in precision may also require an explanation.
There was a problem hiding this comment.
The original SQL uses 123.4. The old plan cast the value through a narrower floating-point type, which produced 123.4000015258789E0; without that cast the value stays 123.4.
| !update | ||
|
|
||
| # Check contents; "Engineering" is too long for varchar(10) | ||
| # Check contents |
There was a problem hiding this comment.
So you say this test should produce an error rather than truncating the string?
There was a problem hiding this comment.
Yes, I think it should fail rather than truncate.
mihaibudiu
left a comment
There was a problem hiding this comment.
I think these changes make sense, but they do uncover a separate issue: updating with a value out of range should produce a runtime error rather than a silent cast which may lose data.



Jira Link
CALCITE-7607
Changes Proposed
Avoid adding implicit narrowing casts for DML assignments when the source type can be assigned to the target type.
The validator no longer adds these casts during DML source coercion, and
TableModify#getExpectedInputRowTypeis adjusted soCoerceInputsRuledoes not add them back later.Added tests for
INSERT,UPDATE, andMERGE, and updated existing plan output where these assignment casts used to appear.