Correctness fixes: LABEL escaping, dead columnstore guard, init port, misc by joshmarkovic · Pull Request #706 · dbt-msft/dbt-sqlserver

joshmarkovic · 2026-06-10T17:15:58Z

Batch of small, independent correctness fixes. Every change was verified against a live SQL Server 2022 container (devops/server.Dockerfile image): the relevant functional suites (test_query_options, test_index, test_data_types) pass, unit tests pass 110/110, and each fix has a targeted reproduction described below.

Note: this batch originally also fixed the sys.types join in the catalog queries (joining on system_type_id, which is not unique, fans out and produces duplicate rows / wrong type names for UDT and sysname columns). That commit was dropped while merging master: #289 (persist-docs) independently fixed the same join, so the fix is already on master and is no longer part of this PR.

Note: this batch originally also added SET XACT_ABORT ON to the dml-refresh swap (preventing a failed swap from committing an empty target). That commit has been removed from this PR: the transaction/DML behavior is being handled in #710 (dbt_sqlserver_use_dbt_transactions), which modifies the same swap macro. To keep these correctness fixes independent of that work — and avoid a conflict on the same file — the change is dropped here and deferred to #710.

1. `dbt init` suggested the Postgres port

Issue: profile_template.yml shipped port: default: 5432 — the Postgres port — so every dbt init user accepting defaults got a non-working profile.

Solution: Default changed to 1433. Verified through dbt's own InitTask.generate_target_from_input code path: accepting the default yields port: 1433 (int).

2. Python `float` mapped to SQL Server `bigint`

Issue: The datatypes mapping in sqlserver_constants.py (consumed by SQLServerConnectionManager.data_type_code_to_name to report column type names for query metadata) translated Python float to "bigint".

Solution: Map it to "float". Verified live: a cast(1.5 as float) result column produces cursor type_code = <class 'float'>, which now resolves to float.
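For readers who have not looked at the mapping, here is a minimal sketch of the lookup shape involved. The dictionary contents and the helper name `type_code_to_name` are hypothetical stand-ins, not the adapter's actual table in sqlserver_constants.py; only the `float` entry reflects the change described above, and the fallback behavior is an assumption.

```python
# Hypothetical sketch of a type-code lookup; NOT the adapter's real table.
# The driver reports each result column's type_code as a Python type.
DATATYPES = {
    str: "varchar",  # illustrative entry (assumption)
    int: "int",      # illustrative entry (assumption)
    float: "float",  # the fix: this entry previously read "bigint"
}


def type_code_to_name(type_code):
    """Return a SQL Server type name for a cursor type_code."""
    # Assumed fallback: use the Python type's own name when unmapped.
    return DATATYPES.get(type_code, getattr(type_code, "__name__", str(type_code)))


print(type_code_to_name(float))
```

With the old entry, a `cast(1.5 as float)` column would have been reported as `bigint` in query metadata even though the driver returned a Python `float`.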

3. A single quote in `query_tag` broke every emitted query (and allowed OPTION-clause injection)

Issue: get_query_options() (and the deprecated apply_label()) interpolated the user-supplied query_tag config into OPTION (LABEL = '...') without escaping. A tag containing ' produced a syntax error in every statement the adapter emits for that model; a crafted tag could inject arbitrary text into the OPTION clause.

Solution: Escape via dbt's cross-adapter escape_single_quotes() macro (quote doubling ' → '' on this adapter; the same helper the EXEC('...') wrappers in this repo already use) at both build sites in metadata.sql. Verified: a model with query_tag: "rob's o'clock tag" builds cleanly on both code paths, including statements wrapped in EXEC('...') (where the pre-escaped label composes correctly with the wrapper's own quote doubling), and the full test_query_options functional suite passes (26/26) against a live SQL Server 2022.
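The composition of the two escaping layers can be sketched in plain Python. This is an illustration of quote doubling only, not the Jinja in metadata.sql: `build_label_option` and `wrap_in_exec` are hypothetical helper names, and `escape_single_quotes` here merely mimics the doubling behavior described above.

```python
def escape_single_quotes(value: str) -> str:
    """Double every single quote, the T-SQL string-literal escape."""
    return value.replace("'", "''")


def build_label_option(query_tag: str) -> str:
    """Render an OPTION clause with the tag as an escaped string literal."""
    return f"OPTION (LABEL = '{escape_single_quotes(query_tag)}')"


def wrap_in_exec(statement: str) -> str:
    """Wrap a whole statement in EXEC('...'), doubling its quotes once more."""
    return f"EXEC('{escape_single_quotes(statement)}')"


label = build_label_option("rob's o'clock tag")
print(label)
print(wrap_in_exec("select 1 " + label))
```

Because the label is escaped before the statement is wrapped, the wrapper's own doubling turns each `''` into `''''`, which the server unescapes back to `''` when it parses the EXEC string and then to `'` when it parses the label literal.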

4. Columnstore-index existence guard was dead code; misleading comment on the default incremental strategy

Issue: sqlserver__create_clustered_columnstore_index guarded its DROP INDEX with object_id('<schema>_<table>') — an underscore-joined name that never resolves (verified live: always NULL). The IF EXISTS branch could therefore never fire, and re-creating a CCI on a table that already has one fails with "You cannot create more than one clustered index". Separately, the comment in incremental_strategies.sql claimed the default strategy with a unique_key performs delete+insert, when it actually emits a MERGE via get_incremental_merge_sql.

Solution: Use the relation's own quoted rendering — relation.include(database=False), which emits object_id('"schema"."table"') — in indexes.sql; the guard now finds the existing index and the macro drops + recreates it. Verified live: OBJECT_ID resolves the double-quoted form identically to brackets (including under QUOTED_IDENTIFIER OFF), and re-running the rendered batch against a table with an existing CCI drops and recreates it instead of failing. Comment corrected to say MERGE (separate commit).
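To make the difference between the two guard arguments concrete, here is a small Python sketch of the strings involved. The helper names are hypothetical, the schema and table values are examples, and identifiers are assumed to contain no quote characters; the real rendering is done by the relation object in the Jinja macro.

```python
def old_guard_argument(schema: str, table: str) -> str:
    """Underscore-joined name the old guard passed to object_id()."""
    return f"{schema}_{table}"


def quoted_relation(schema: str, table: str) -> str:
    """Schema-qualified, double-quoted rendering (assumes no embedded quotes)."""
    return f'"{schema}"."{table}"'


def object_id_call(name: str) -> str:
    """Render the object_id(...) expression used in the existence guard."""
    return f"object_id('{name}')"


print(object_id_call(old_guard_argument("dbo", "orders")))
print(object_id_call(quoted_relation("dbo", "orders")))
```

The first form names an object called `dbo_orders` in the default schema, which does not exist, so the guard was always false; the second names the table itself.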

Known pre-existing limitation, unchanged here: tables built via a __dbt_tmp intermediate + rename keep the index name <schema>_<table>__dbt_tmp_cci, which the macro's computed name does not match, so the guard cannot protect that case.

5. Docs/tooling: duplicated README section, black target py39, broken `make clean`

README documented dbt_sqlserver_use_default_schema_concat twice, with conflicting flags:-vs-vars: guidance. Merged into a single section matching the actual implementation (schema.sql:61-63: behavior flag first, vars as backwards-compat fallback).
pre-commit: the auto black hook still used --target-version=py39, while requires-python is >= 3.10 and the manual black-check hook already targeted py310. Bumped the auto hook to py310 to match. (The original commit also bumped the isort target, but #707 has since consolidated isort, flake8, pycln and absolufy-imports into ruff on master; after rebasing onto that, only the black target bump remains.) Verified black at py310 changes zero files, so no reformatting churn for contributors.
Makefile: the clean target had a .PHONY declaration and recipe lines but the clean: rule line itself was missing, so make clean did nothing. Restored (and it now shows in make help).

(Each of these three is its own commit.)

axellpadilla

The fixes look correct, especially SET XACT_ABORT ON for the DML refresh swap and escaping query_tag before emitting OPTION (LABEL = ...).

I would still ask for regression tests before merging this part, because these are correctness-sensitive paths: DML refresh rollback on insert failure, query tags containing single quotes, the columnstore index guard against a schema-qualified table, and float type inference (this one probably does not need a test of its own; maybe it can be included in an existing one so we have better correctness coverage. For example, it also improves this part of #702).

One concern with SET XACT_ABORT ON is that it is a session-level setting. If the dbt adapter reuses the same SQL Server connection for subsequent statements, leaving it enabled could change the behavior of later SQL executed on that connection. For example, a later statement that previously tolerated a recoverable runtime error inside a transaction could instead cause the whole transaction to abort and roll back. That may be desirable, but it should be evaluated as an intentional behavior change rather than left as an implicit side effect.

I would reset XACT_ABORT after this macro, document why leaving it enabled is safe for this adapter's connection/session lifecycle, and/or split this into a separate commit so the transaction-behavior change is isolated from the other fixes.

Edit note:

One place where I think XACT_ABORT is undesirable is post-hooks: the current behavior is that data remains updated even after post_hooks fail. It seems this would be a wholesale behavior change there?

axellpadilla · 2026-06-15T06:41:19Z

@joshmarkovic the DML code is touched in #710, so I suggest separating those edits from the other corrections.

A query_tag containing a single quote broke every query the adapter emitted, and allowed injection into the OPTION clause. Escape via dbt's cross-adapter escape_single_quotes() macro (quote doubling on this adapter), the same helper the EXEC('...') wrappers here already use.


joshmarkovic · 2026-06-15T23:54:57Z

@axellpadilla, thanks for the review!

I removed the SET XACT_ABORT ON / dml-refresh commit from the PR. We can put it in #710, since it edits the same macro and is where transactions are handled.

I rebased onto the latest master to fix pre-commit.ci, and testing seems to be in place: the query-tag, columnstore, and float fixes are already covered. Happy to add anything else, let me know!

axellpadilla · 2026-06-16T01:43:08Z

@joshmarkovic can you send the XACT_ABORT changes as a PR to the #710 branch, please? I think this is safer, at least without carefully inspecting the implications. You can also justify just keeping it on after checking the whole flow after this DML + post-hook (in-transaction, post-commit), considering the flag enabled in #710.

-- 1. Declare a variable to store the original state
DECLARE @OriginalXactAbort INT;

-- 2. Check if the 16384 bit is set in @@OPTIONS
-- If the bitwise AND returns > 0, it was ON (1), otherwise OFF (0)
SET @OriginalXactAbort = CASE WHEN (@@OPTIONS & 16384) > 0 THEN 1 ELSE 0 END;

-- 3. Set XACT_ABORT to your desired state for the critical workload
SET XACT_ABORT ON;

BEGIN TRANSACTION;
    -- Your transactional logic / queries go here
    -- e.g., INSERT INTO MyTable ...
COMMIT TRANSACTION;

-- 4. Restore the original state dynamically
IF @OriginalXactAbort = 1
    SET XACT_ABORT ON;
ELSE
    SET XACT_ABORT OFF;
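The T-SQL variable above only lives for the duration of one batch, so if the save and restore need to straddle several statements sent separately, the same pattern could be driven from the client side. The following is a minimal sketch under stated assumptions, not code from this adapter or from #710: `xact_abort_on` is a hypothetical helper, it expects a DB-API style cursor, and `RecordingCursor` is a stand-in so the example runs without a server.

```python
from contextlib import contextmanager

XACT_ABORT_BIT = 16384  # bit in @@OPTIONS that reports XACT_ABORT


@contextmanager
def xact_abort_on(cursor):
    """Enable XACT_ABORT for a block, then restore the session's prior state."""
    cursor.execute("SELECT @@OPTIONS")
    was_on = (cursor.fetchone()[0] & XACT_ABORT_BIT) > 0
    cursor.execute("SET XACT_ABORT ON")
    try:
        yield
    finally:
        # Restore even if the block raised, so later statements on a reused
        # connection see the original setting.
        cursor.execute("SET XACT_ABORT ON" if was_on else "SET XACT_ABORT OFF")


class RecordingCursor:
    """Stand-in for a DB-API cursor; records statements instead of running them."""

    def __init__(self, options):
        self.options = options
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)

    def fetchone(self):
        return (self.options,)


cursor = RecordingCursor(options=0)  # session starts with XACT_ABORT OFF
with xact_abort_on(cursor):
    cursor.execute("BEGIN TRANSACTION; /* swap goes here */ COMMIT TRANSACTION;")
print(cursor.statements)
```

One caveat with this sketch: if the failure inside the block leaves the connection unusable, the restore statement itself may fail, so real code would need to decide how to handle that case.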

joshmarkovic marked this pull request as ready for review June 11, 2026 15:01

joshmarkovic force-pushed the fix/audit-quick-wins branch from c479a29 to 1b56f99 Compare June 12, 2026 13:14

joshmarkovic changed the title ~~Correctness fixes: dml-refresh data loss, catalog UDT types, LABEL escaping, dead columnstore guard, init port, misc~~ Correctness fixes: dml-refresh data loss, LABEL escaping, dead columnstore guard, init port, misc Jun 12, 2026

joshmarkovic force-pushed the fix/audit-quick-wins branch 2 times, most recently from 4801c2e to 8da3878 Compare June 12, 2026 13:29

joshmarkovic mentioned this pull request Jun 12, 2026

chore: consolidate isort, flake8, pycln and absolufy-imports into ruff #707

Merged

3 tasks

axellpadilla added this to the v1.10.1rc1 milestone Jun 14, 2026

axellpadilla requested changes Jun 14, 2026


joshmarkovic force-pushed the fix/audit-quick-wins branch from 8da3878 to 994a9d5 Compare June 15, 2026 23:29

joshmarkovic changed the title ~~Correctness fixes: dml-refresh data loss, LABEL escaping, dead columnstore guard, init port, misc~~ Correctness fixes: LABEL escaping, dead columnstore guard, init port, misc Jun 15, 2026

joshmarkovic added 8 commits June 15, 2026 23:34

fix: default port 1433 (not Postgres 5432) in dbt init profile template

1c6ddee

fix: map Python float to SQL Server float, not bigint

245726f

fix: columnstore IF EXISTS guard checked object_id('schema_table')

63298fb

The underscore-joined name never resolves, so the existence check was always false and the DROP never ran. Use the relation's own quoted rendering (relation.include(database=False)), which OBJECT_ID resolves.

docs: correct incremental default-strategy comment to MERGE

48d2aea

With a unique_key the default strategy emits a MERGE via get_incremental_merge_sql, not delete+insert as the comment claimed.

docs: de-dupe README schema-concat section

9749d30

README documented dbt_sqlserver_use_default_schema_concat twice with conflicting flags-vs-vars guidance; merged into one section matching the code (behavior flag primary, vars fallback).

chore: target py310 in black/isort pre-commit args

2b8195c

black/isort pre-commit hooks targeted py39 while requires-python is >=3.10 and the manual black-check already used py310; aligned both.

fix: restore missing Makefile clean rule line

b5e142b

The clean target had a .PHONY declaration and recipe lines but no 'clean:' rule line, so 'make clean' did nothing.

joshmarkovic force-pushed the fix/audit-quick-wins branch from 994a9d5 to b5e142b Compare June 15, 2026 23:36

joshmarkovic wants to merge 8 commits into dbt-msft:master from joshmarkovic:fix/audit-quick-wins
