Fix and polish seed warnings in tests#5822

Open
nickmuoh wants to merge 6 commits into
SQLMesh:mainfrom
nickmuoh:main

nickmuoh (Contributor) commented Jun 1, 2026

Description

This PR fixes two DataFrame handling issues that made the warning fixes incomplete and hard to verify.

The first issue was in seed batch reading. CsvSeedReader.read() yielded slices of the original DataFrame without copying them, so a consumer that mutated one returned batch could accidentally affect later reads from the same seed. For example, if the first batch changed a value after being returned, a later full read could reflect that mutation even though the seed content itself had not changed. Because the returned batches were not actually isolated, the behavior was fragile and warning-related regressions were difficult to reason about.

The solution is to return a copy of each batch instead of the slice itself. With that change, each batch is independent. Mutating one returned batch no longer changes subsequent batches or later reads from the same seed source.
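The difference between yielding a slice and yielding a copy can be sketched in a few lines of pandas. This is only an illustration, not the actual CsvSeedReader code: `read_batches`, `seed_df`, and the batch sizes are hypothetical names chosen for the example.

```python
import pandas as pd


def read_batches(df: pd.DataFrame, batch_size: int):
    """Hypothetical stand-in for a seed reader: yield independent batches."""
    for start in range(0, len(df), batch_size):
        # .copy() detaches the batch from the source frame, so callers
        # can mutate it without touching the underlying seed data.
        yield df.iloc[start:start + batch_size].copy()


seed_df = pd.DataFrame({"id": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]})

# A consumer mutates the first batch after it has been returned.
first_batch = next(read_batches(seed_df, batch_size=2))
first_batch.loc[0, "name"] = "mutated"

# A later full read still sees the original seed values.
full_read = pd.concat(read_batches(seed_df, batch_size=10))
```

Copying each batch costs one extra allocation per batch, which seems a reasonable trade for isolation given that seed files are typically small.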

The second issue was in the Snowflake DataFrame upload path. When SQLMesh prepared a pandas DataFrame for write_pandas, it preserved any existing non-default index from the source DataFrame. For example, a DataFrame with row labels [1, 2] would be passed through with that index intact. The upload logic only intends to load the declared columns, so carrying a custom index into the Snowflake write path is unnecessary and can trigger warnings or inconsistent behavior.

The fix is to normalize the DataFrame before upload by preserving the declared column order and resetting the index with drop=True. As a result, Snowflake now receives a clean DataFrame with the expected columns and a standard RangeIndex, regardless of how the input DataFrame was indexed.
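The normalization step described above might look roughly like this. It is a sketch under assumptions, not the code in this PR: `normalize_for_upload` and the sample column names are hypothetical, and only standard pandas calls (`reset_index(drop=True)` and column selection) are used.

```python
import pandas as pd


def normalize_for_upload(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Hypothetical helper: keep declared columns in order, drop the index."""
    # Selecting by the declared column list fixes the column order;
    # reset_index(drop=True) discards the old labels instead of
    # turning them into a new column.
    return df[list(columns)].reset_index(drop=True)


# Source frame with row labels [1, 2] and columns in a different order.
source = pd.DataFrame({"b": ["x", "y"], "a": [10, 20]}, index=[1, 2])
prepared = normalize_for_upload(source, ["a", "b"])
```

Here `prepared` has columns `a`, `b` and a default RangeIndex, while `source` keeps its original index because both operations return a new DataFrame.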

Test Plan

The accompanying tests demonstrate both cases directly. The seed test shows that modifying one returned batch no longer leaks into other batches or later reads. The Snowflake test shows that a DataFrame with a non-default index is normalized before write_pandas is called, while the row data and column order remain unchanged.
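One way to check what reaches the write call is to capture the argument with a mock. The sketch below assumes a hypothetical `upload` function and a `MagicMock` standing in for the real write function; it does not reproduce the tests in this PR or the Snowflake connector's actual signature.

```python
from unittest.mock import MagicMock

import pandas as pd


def upload(df: pd.DataFrame, columns: list, write_fn) -> None:
    """Hypothetical upload path: normalize, then hand off to write_fn."""
    write_fn(df[list(columns)].reset_index(drop=True))


fake_write = MagicMock()  # stands in for the real write call (assumption)
source = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]}, index=[1, 2])
upload(source, ["id", "name"], fake_write)

# Inspect the DataFrame that was actually passed to the writer.
uploaded = fake_write.call_args.args[0]
expected = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
pd.testing.assert_frame_equal(uploaded, expected)
```

`assert_frame_equal` compares values, column order, and index together, so one assertion covers all three properties the test plan mentions.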

Checklist

  • I have run make style and fixed any issues
  • I have added tests for my changes (if applicable)
  • All existing tests pass (make fast-test)
  • My commits are signed off (git commit -s) per the DCO

2 participants