
feat: safe type expansion, NVARCHAR/NCHAR catalog fix, seed empty cell fix #702

Open
axellpadilla wants to merge 2 commits into master from fix/missing-type-handling

Conversation

@axellpadilla

Collaborator

Summary

This PR addresses several long-standing issues with SQL Server native type handling during column expansion, catalog generation, and seed ingestion. The issues are closely related, so they were reproduced and fixed together in one pass.

Closes: #701, #637, #425, #446
Supersedes: #606, thanks @Cogito

What Changed

1. SQL Server Native String Type Recognition (sqlserver_column.py)

  • is_string() now includes nvarchar and nchar in addition to varchar and char
  • string_type_instance() — new instance method that preserves the original type family:
    • nvarchar(n) emits nvarchar(n) (not varchar(n))
    • nchar(n) emits nchar(n) (not char(n))
    • Falls back to varchar(n) / char(n) for non-Unicode types
  • data_type property now uses string_type_instance() instead of string_type()
  • is_number() now includes is_fixed_numeric() so money/smallmoney participate in numeric checks without being classified as is_numeric()
  • is_fixed_numeric() — new method for money/smallmoney
  • is_numeric() now excludes money/smallmoney (breaking change — see migration note below)
  • is_integer() now includes tinyint and bit
  • can_expand_to() is now stricter: it only allows same-family string size increases (e.g., varchar(10) → varchar(25))
  • can_expand_safe() — new method for flag-gated safe expansions:
| Source | Target | Allowed? |
|---|---|---|
| varchar(n) | nvarchar(m) where m >= n | With flag |
| char(n) | nchar(m) where m >= n | With flag |
| bit → tinyint → smallint → int → bigint | Higher in family | With flag |
| int | numeric(p,s) where p >= 10 | With flag |
| numeric(p,s) | numeric(p2,s2) where p2 >= p and s2 >= s | With flag |
| smallmoney | money | With flag |
| money | numeric(p,s) where p >= 19 | With flag |
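
The rules in the table above can be sketched in Python as follows. This is a minimal illustration that mirrors the table literally, not the adapter's actual can_expand_safe() implementation: parse_type() and can_expand_safe_sketch() are hypothetical names, types are passed as plain strings, and varchar(max)-style sizes are out of scope.

```python
import re
from typing import Tuple

# Integer family, ordered from narrowest to widest (as listed in the table).
INTEGER_ORDER = ["bit", "tinyint", "smallint", "int", "bigint"]


def parse_type(type_str: str) -> Tuple[str, Tuple[int, ...]]:
    """Split 'numeric(18,2)' into ('numeric', (18, 2)) and 'int' into ('int', ())."""
    match = re.fullmatch(r"\s*(\w+)\s*(?:\(([\d\s,]+)\))?\s*", type_str.lower())
    if match is None:
        raise ValueError(f"unrecognised type: {type_str!r}")
    name, args = match.group(1), match.group(2)
    params = tuple(int(a) for a in args.split(",")) if args else ()
    return name, params


def can_expand_safe_sketch(source: str, target: str) -> bool:
    """Hypothetical helper: True when source -> target matches a row of the table."""
    src, src_args = parse_type(source)
    tgt, tgt_args = parse_type(target)

    # varchar(n) -> nvarchar(m) and char(n) -> nchar(m), where m >= n
    if (src, tgt) in {("varchar", "nvarchar"), ("char", "nchar")}:
        return bool(src_args and tgt_args) and tgt_args[0] >= src_args[0]

    # Integer family: only promotions to a strictly wider member
    if src in INTEGER_ORDER and tgt in INTEGER_ORDER:
        return INTEGER_ORDER.index(tgt) > INTEGER_ORDER.index(src)

    # int -> numeric(p,s) where p >= 10
    if src == "int" and tgt == "numeric":
        return len(tgt_args) == 2 and tgt_args[0] >= 10

    # numeric(p,s) -> numeric(p2,s2) where p2 >= p and s2 >= s
    if src == "numeric" and tgt == "numeric":
        return (
            len(src_args) == 2
            and len(tgt_args) == 2
            and tgt_args[0] >= src_args[0]
            and tgt_args[1] >= src_args[1]
        )

    # smallmoney -> money
    if src == "smallmoney" and tgt == "money":
        return True

    # money -> numeric(p,s) where p >= 19
    if src == "money" and tgt == "numeric":
        return len(tgt_args) == 2 and tgt_args[0] >= 19

    return False
```

For example, can_expand_safe_sketch("varchar(10)", "nvarchar(25)") is accepted, while the reverse direction nvarchar(10) to varchar(25) is rejected because it matches no row.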

2. Safe Type Expansion Feature Flag (sqlserver_adapter.py)

New dbt_sqlserver_enable_safe_type_expansion behaviour flag (default: false):

# dbt_project.yml
flags:
  dbt_sqlserver_enable_safe_type_expansion: true

When enabled, the adapter's expand_column_types() override performs:

  1. Same-family string resizes — always proceed (e.g., varchar(10) → varchar(25))
  2. Safe type expansions — only when flag is enabled AND column_type_expansion_max_rows is not exceeded:
    • Cross-family string: varchar/char → nvarchar/nchar
    • Integer family promotions
    • Integer → numeric with sufficient precision
    • numeric/decimal precision/scale upgrades
    • Fixed-money promotions (smallmoney → money → numeric)

expand_target_column_types() — new public API that forwards the max_rows parameter, called from incremental and snapshot materializations.

alter_column_type() — new method that dispatches to the sqlserver__alter_column_type macro, replacing the base adapter's implementation.

3. Row-Count Guardrail (column_type_expansion_max_rows)

New per-model config (default: 1,000,000):

{{ config(materialized='incremental', unique_key='id',
           column_type_expansion_max_rows=500000) }}
  • Safe type expansion is skipped when the table exceeds this row count
  • Set to -1 to disable the check
  • Set to 0 to always skip safe expansion (only same-family string resizes proceed)
  • Skipped expansions emit a warning log with the row count and limit
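
The guardrail semantics listed above can be sketched as a small decision function. This is an illustration only, assuming the 1,000,000 default applies when the config is unset; safe_expansion_allowed(), DEFAULT_MAX_ROWS, and the logger name are hypothetical and not taken from the adapter.

```python
import logging
from typing import Optional

logger = logging.getLogger("safe_type_expansion_sketch")  # hypothetical logger name

DEFAULT_MAX_ROWS = 1_000_000  # default stated in the PR description


def safe_expansion_allowed(
    flag_enabled: bool, row_count: int, max_rows: Optional[int] = None
) -> bool:
    """Decide whether a safe (cross-family) type expansion may proceed."""
    if not flag_enabled:
        # Without the behaviour flag, only same-family string resizes happen.
        return False

    limit = DEFAULT_MAX_ROWS if max_rows is None else max_rows

    if limit == -1:
        # -1 disables the row-count check entirely.
        return True

    if limit == 0 or row_count > limit:
        # 0 always skips; otherwise skip when the table exceeds the limit.
        logger.warning(
            "Skipping safe type expansion: table has %d rows, limit is %d",
            row_count,
            limit,
        )
        return False

    return True
```

Same-family string resizes are not covered by this function, since the description says they always proceed regardless of the flag or the row count.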

4. Single ALTER COLUMN Mode (prefer_single_alter_column)

New per-model config (default: false):

{{ config(materialized='incremental', unique_key='id',
           prefer_single_alter_column=true) }}

When true, the sqlserver__alter_column_type macro uses a single ALTER COLUMN statement instead of the safer add+update+drop+rename pattern. This is faster for small/medium tables and instant for safe type expansions, but may fail for types that cannot be implicitly converted.
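
To make the difference between the two modes concrete, the sketch below builds the T-SQL statements each pattern would typically involve. It is an assumption about the general shape of the statements, not a copy of what the sqlserver__alter_column_type macro emits; the function names and the temporary column suffix are hypothetical, and identifier quoting is omitted for brevity.

```python
from typing import List


def single_alter_statements(table: str, column: str, new_type: str) -> List[str]:
    """One in-place statement; relies on SQL Server converting existing values."""
    return [f"ALTER TABLE {table} ALTER COLUMN {column} {new_type};"]


def add_update_drop_rename_statements(
    table: str, column: str, new_type: str, tmp_suffix: str = "__tmp"
) -> List[str]:
    """Four-step pattern: add a new column, copy the data, drop the old column, rename."""
    tmp = f"{column}{tmp_suffix}"  # hypothetical temporary column name
    return [
        f"ALTER TABLE {table} ADD {tmp} {new_type};",
        f"UPDATE {table} SET {tmp} = {column};",
        f"ALTER TABLE {table} DROP COLUMN {column};",
        f"EXEC sp_rename '{table}.{tmp}', '{column}', 'COLUMN';",
    ]
```

The four-step variant rewrites every row through the UPDATE, which is why the single-statement mode can be noticeably cheaper when the conversion is a pure widening.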

5. Catalog Fix (catalog.sql)

Changed sys.types join from system_type_id to user_type_id in both catalog queries. This prevents NVARCHAR/NCHAR columns from appearing as SYSNAME in dbt docs generate output. Fixes #637.
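
The effect of the join key can be simulated in plain Python. The rows below are a hand-written subset of sys.types using the IDs SQL Server ships with as far as I know (nvarchar and sysname share system_type_id 231, while sysname has its own user_type_id 256); the column row and the join_type_names() helper are hypothetical.

```python
from typing import Dict, List

# Hand-written subset of sys.types (assumed values, see lead-in).
SYS_TYPES: List[Dict[str, object]] = [
    {"name": "varchar", "system_type_id": 167, "user_type_id": 167},
    {"name": "nvarchar", "system_type_id": 231, "user_type_id": 231},
    {"name": "sysname", "system_type_id": 231, "user_type_id": 256},
]

# A hypothetical nvarchar column as it would appear in sys.columns.
COLUMN: Dict[str, object] = {
    "name": "customer_name",
    "system_type_id": 231,
    "user_type_id": 231,
}


def join_type_names(column: Dict[str, object], key: str) -> List[str]:
    """Emulate joining a column row to sys.types on the given key."""
    return [str(t["name"]) for t in SYS_TYPES if t[key] == column[key]]


# Joining on system_type_id matches two rows, so sysname can leak into the catalog.
ambiguous = join_type_names(COLUMN, "system_type_id")

# Joining on user_type_id matches exactly one row.
unambiguous = join_type_names(COLUMN, "user_type_id")
```

With the first key the column resolves to both nvarchar and sysname; with the second it resolves only to nvarchar.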

6. Seed Empty Cell Fix (helpers.sql)

Changed seed CSV ingestion to inline NULL literals instead of binding empty cells as SQL parameters. Previously, an empty cell in a numeric(18,0) column would be bound as an empty string parameter, causing arithmetic overflow error 8115. Now empty cells emit null directly in the VALUES clause. Fixes #425.
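
The idea can be sketched as follows: empty cells become a literal null in the VALUES clause and only non-empty cells are bound as parameters. The real change lives in a Jinja macro in helpers.sql; build_values_clause() and the "?" placeholder are hypothetical stand-ins used to illustrate the approach.

```python
from typing import List, Optional, Sequence, Tuple


def build_values_clause(
    rows: Sequence[Sequence[Optional[str]]], placeholder: str = "?"
) -> Tuple[str, List[str]]:
    """Return the VALUES body and the bindings, inlining null for empty cells."""
    tuples: List[str] = []
    bindings: List[str] = []
    for row in rows:
        cells: List[str] = []
        for cell in row:
            if cell is None or cell == "":
                # Empty cell: emit a SQL null literal, bind nothing.
                cells.append("null")
            else:
                # Non-empty cell: emit a placeholder and bind the value.
                cells.append(placeholder)
                bindings.append(cell)
        tuples.append("(" + ", ".join(cells) + ")")
    return ", ".join(tuples), bindings


sql, params = build_values_clause([["1", ""], ["2", "3.5"]])
# sql    -> "(?, null), (?, ?)"
# params -> ["1", "2", "3.5"]
```

Because the empty string never reaches the driver as a parameter, the server is never asked to convert '' to numeric(18,0).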

7. Adapter Configs (sqlserver_configs.py)

Added two new optional config fields:

  • prefer_single_alter_column: Optional[bool] = False
  • column_type_expansion_max_rows: Optional[int] = None

8. Unit Tests

  • test_sqlserver_column.py — Tests for is_string(), string_type_instance(), data_type, is_fixed_numeric(), is_numeric(), string_size() across all string/numeric type families
  • test_can_expand_to.py — Parameterized tests for can_expand_to() and can_expand_safe() covering same-family resizes, cross-family promotions, integer family promotions, numeric precision/scale upgrades, fixed-money promotions, and prevented shrinking conversions
  • test_expand_column_types.py — Tests for the adapter's expand_column_types() method: row-count skip, max-rows=0 blocking, warning emission, max_rows forwarding through expand_target_column_types()

9. Functional Tests


Breaking Changes / Migration Notes

  • money and smallmoney columns are no longer classified as is_numeric(). If you have custom code or macros that depend on money being numeric:
    • Use is_number() (covers all numeric types including money)
    • Use is_fixed_numeric() for money types specifically
    • Use is_numeric() only for numeric/decimal types
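
A minimal sketch of the resulting classification, for anyone updating custom macros. These are free functions over a type name rather than the adapter's column methods, and the float/real set is an assumption added so that is_number() is complete.

```python
INTEGER_TYPES = {"bit", "tinyint", "smallint", "int", "bigint"}
NUMERIC_TYPES = {"numeric", "decimal"}
FIXED_NUMERIC_TYPES = {"money", "smallmoney"}
FLOAT_TYPES = {"float", "real"}  # assumption: not listed in the PR description


def is_integer(dtype: str) -> bool:
    return dtype.lower() in INTEGER_TYPES


def is_numeric(dtype: str) -> bool:
    # After this PR: numeric/decimal only, money types excluded.
    return dtype.lower() in NUMERIC_TYPES


def is_fixed_numeric(dtype: str) -> bool:
    # New in this PR: money and smallmoney.
    return dtype.lower() in FIXED_NUMERIC_TYPES


def is_float(dtype: str) -> bool:
    return dtype.lower() in FLOAT_TYPES


def is_number(dtype: str) -> bool:
    # Broadest check: covers every numeric kind, including money.
    return (
        is_integer(dtype)
        or is_numeric(dtype)
        or is_float(dtype)
        or is_fixed_numeric(dtype)
    )
```

Code that previously relied on is_numeric("money") being true would switch to is_number() or is_fixed_numeric(), depending on whether it needs all numeric kinds or money types specifically.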

Related PRs & History

…l fix

- Add dbt_sqlserver_enable_safe_type_expansion flag for safe column type widening
  (varchar->nvarchar, integer family promotions, numeric precision/scale upgrades)
- Add column_type_expansion_max_rows config (default 1,000,000 rows)
- Add prefer_single_alter_column config for single ALTER COLUMN statement
- Add string_type_instance() to preserve NVARCHAR/NCHAR type family
- Fix catalog generation (user_type_id) so NVARCHAR/NCHAR no longer appear as SYSNAME
- Fix is_numeric() to exclude money/smallmoney (now is_fixed_numeric())
- Fix seed table ingestion of empty numeric cells
- Add tinyint/bit to is_integer() type list
@Benjamin-Knight

Collaborator

Reviewing now, big change with lots of potential for issues so may take a bit longer.

@Benjamin-Knight

Collaborator

When safe type expansion is enabled, it allows some column changes that don't actually fit the existing values. For example, it will widen an int to numeric(10,5), or numeric(10,2) to numeric(12,5), but in those cases the integer portion can overflow. We can't just check the overall precision and scale; we need to check that the integer part (precision minus scale) is wide enough.
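
One way to express the check described above is to compare integer digits (precision minus scale) rather than precision alone. This is a sketch of the idea, not code from the PR; the function names and the INT_DIGITS table are hypothetical, with digit counts derived from each type's maximum value (for example, int tops out at 2,147,483,647, which has 10 digits).

```python
# Decimal digits needed to hold the largest value of each integer type.
INT_DIGITS = {"bit": 1, "tinyint": 3, "smallint": 5, "int": 10, "bigint": 19}


def integer_digits(precision: int, scale: int) -> int:
    """Digits available to the left of the decimal point in numeric(p, s)."""
    return precision - scale


def int_fits_numeric(int_type: str, precision: int, scale: int) -> bool:
    """True when numeric(precision, scale) can hold every value of int_type."""
    return integer_digits(precision, scale) >= INT_DIGITS[int_type]


def numeric_fits_numeric(p: int, s: int, p2: int, s2: int) -> bool:
    """True when numeric(p2, s2) loses neither integer nor fractional digits."""
    return s2 >= s and integer_digits(p2, s2) >= integer_digits(p, s)
```

With this check, int to numeric(10,5) is rejected (5 integer digits available, 10 needed) and numeric(10,2) to numeric(12,5) is rejected (8 integer digits shrink to 7), while numeric(10,2) to numeric(13,5) passes.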

@Benjamin-Knight left a comment

Collaborator


I think there is an issue with how the expansion of integers handles precision and scale. Otherwise, my only question is about the one shadowed function, whose purpose I don't understand.

self.alter_column_type(current, column_name, new_type)

@available.parse_none
def expand_target_column_types(

Collaborator


Is this not just a copy of the base SQLAdapter code? What is the override doing?


2 participants