Skip to content

[CALCITE-7607] DML type coercion inserts implicit narrowing casts for target-column assignments#5041

Open
zzwqqq wants to merge 1 commit into
apache:mainfrom
zzwqqq:fix_dml_checked_assignment_simple
Open

[CALCITE-7607] DML type coercion inserts implicit narrowing casts for target-column assignments#5041
zzwqqq wants to merge 1 commit into
apache:mainfrom
zzwqqq:fix_dml_checked_assignment_simple

Conversation

@zzwqqq

@zzwqqq zzwqqq commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Jira Link

CALCITE-7607

Changes Proposed

Avoid adding implicit narrowing casts for DML assignments when the source type can be assigned to the target type.

The validator no longer adds these casts during DML source coercion, and TableModify#getExpectedInputRowType is adjusted so CoerceInputsRule does not add them back later.

Added tests for INSERT, UPDATE, and MERGE, and updated existing plan output where these assignment casts used to appear.

@sonarqubecloud

Copy link
Copy Markdown


# Create a source table
create table dept (deptno int not null, name varchar(10));
create table dept (deptno int not null, name varchar(20));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide a specific SQL test and compare the results? You can add relevant comments here.

@zzwqqq zzwqqq Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look. The old output depended on the implicit assignment cast to varchar(10), which truncated Engineering to Engineerin.

That truncation is not what this test is intended to cover, so I widened the column to keep the test focused on CREATE TABLE AS/MV behavior.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But was the behavior incorrect?
What is the new behavior if you don't change the type?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With either varchar(10) or varchar(20), the new output is Engineering, not Engineerin, because the implicit narrowing cast is no longer added.

I widened the column because I wanted to avoid implying that over-length inserts into varchar(10) are intended here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the correct behavior if the type is varchar(10)?
There must be a spec or a reference that has to match in behavior.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this in PostgreSQL, MySQL and Oracle, and they all reject inserting an over-length value.
https://onecompiler.com/postgresql/44sf3dz68
https://onecompiler.com/mysql/44sf3efz4
https://onecompiler.com/oracle/44sf3ewmh

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is good. But is there code in the handling of assignments in Calcite which will perform the rejection?
Is this in scope for this PR or does it need another issue?
This also suggests that casts should not be inserted, since they would transform illegal values into legal ones.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn’t find any Enumerable-side code that rejects over-length values. My current view is that adding that enforcement would be a separate issue from this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please file an issue about this and perhaps even refer to it in your code if the results are currently wrong?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I filed CALCITE-7627 to track the Enumerable-side case, and I can look into it separately after this pr.
The results are currently correct, because I widened the column.

HepProgram program = new HepProgramBuilder()

// Simulate the way INSERT will insert casts to the target types
// Simulate the way INSERT can insert casts to the target types.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this comment needs to be changed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it because CoerceInputsRule may now decide not to add a cast. I can revert this comment if you think it is better to avoid the unrelated diff.

+ "empid=[$t3], deptno=[$t4], name=[$t5], salary=[$t6], "
+ "commission=[$t7])\n"
+ " EnumerableValues(tuples=[[{ 'Fred', 56, 123.4000015258789E0 }]])\n";
+ " EnumerableValues(tuples=[[{ 'Fred', 56, 123.4 }]])\n";

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These variations in precision may also require an explanation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original SQL uses 123.4. The old plan cast the value through a narrower floating-point type, which produced 123.4000015258789E0; without that cast the value stays 123.4.

!update

# Check contents; "Engineering" is too long for varchar(10)
# Check contents

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you say this test should produce an error rather than truncating the string?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it should fail rather than truncate.

@mihaibudiu mihaibudiu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes make sense, but they do uncover a separate issue: updating with a value out of range should produce a runtime error rather than a silent cast which may lose data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants