Skip to content

refactor(sampler): collapse SamplerInterface to a single typed config object#28147

Open
edg956 wants to merge 3 commits into
mainfrom
refactor-samplers
Open

refactor(sampler): collapse SamplerInterface to a single typed config object#28147
edg956 wants to merge 3 commits into
mainfrom
refactor-samplers

Conversation

@edg956
Copy link
Copy Markdown
Contributor

@edg956 edg956 commented May 15, 2026

Summary

  • Introduce SamplerConfig dataclass hierarchy (SamplerConfigDatabaseSamplerConfig / StorageSamplerConfig) in a new sampler_config.py module
  • Replace the 9-parameter SamplerInterface.__init__ / create() signature with a single config: SamplerConfig parameter — eliminates the too-many-arguments pylint suppress and removes database-specific imports from the base class
  • Move config resolution (partition_details, sample_query, include/exclude columns, sample_config, sample_data_count) to callers (entity_adapters, profiler_source, base_test_suite_source) so the interface receives already-resolved values
  • Move SSL connection setup and column include/exclude filtering down to database-family subclasses (SQASampler, PandasSampler, NoSQLSampler) — storage samplers remain clean
  • Simplify BigQuerySampler, PostgresSampler, SnowflakeSampler to *args/**kwargs init (no longer need to re-declare 8 parameters)
  • Remove StorageSampler.create() override; the base create() is now sufficient
  • Rename storage_configupload_sample_storage_config on the interface to clarify it's the destination for sample data upload, not the service connection

Test plan

  • All existing unit tests updated and passing (make unit_ingestion_dev_env)
  • Sampling tests: SQLAlchemy (BigQuery, Snowflake, Postgres, AzureSQL, MSSQL), Pandas/Datalake, storage container
  • test_sampler_interface.pystorage_configupload_sample_storage_config rename covered
  • test_container_sampler_processor.py, test_burstiq_sampler.py — adapted to new config object

🤖 Generated with Claude Code

edg956 and others added 3 commits May 15, 2026 15:14
Replace the 9-parameter constructor/create() signature with a typed
SamplerConfig hierarchy (SamplerConfig / DatabaseSamplerConfig /
StorageSamplerConfig). Config resolution — partition_details,
sample_query, include/exclude columns, sample_config, sample_data_count
— now happens in callers (entity_adapters, profiler_source,
base_test_suite_source) before construction, so the interface only
receives already-resolved values.

- Add sampler_config.py with SamplerConfig dataclass hierarchy
- Remove database-specific imports from SamplerInterface base class
- Move SSL connection setup and column include/exclude filtering to
  database-family subclasses (SQASampler, PandasSampler, NoSQLSampler)
- Simplify BigQuery/Postgres/Snowflake samplers to *args/**kwargs init
- Remove StorageSampler.create() override; base create() is sufficient
- Update profiler_source and base_test_suite_source to build
  DatabaseSamplerConfig before calling sampler_class.create()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The non-database adapter example was showing the old flat kwargs pattern
(sample_config, sample_data_count) that SamplerInterface now silently
ignores via **__. Replace with the correct "config": SamplerConfig(...)
pattern that matches the actual ContainerAdapter implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…check

ClassifiableEntityType includes Container which has no tableType.
The *args/**kwargs init simplified the constructor but lost the
explicit Table type annotation, triggering a basedpyright error.
Guard with isinstance(self.entity, Table) so the type checker
knows tableType is only accessed on Table entities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@edg956 edg956 force-pushed the refactor-samplers branch from 2722484 to 5a119a8 Compare May 15, 2026 13:14
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 15, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Collapses the SamplerInterface into a consolidated SamplerConfig object to simplify initialization and resolve the documentation error regarding incorrect kwargs for non-database adapters. No issues found.

✅ 1 resolved
Bug: Documentation shows incorrect kwargs for non-database adapters

📄 docs/auto-classification/add-support-for-another-entity.md:479-485
The documentation example in add-support-for-another-entity.md shows build_sampler_kwargs returning raw "sample_config" and "sample_data_count" keys, but the actual new interface expects a "config" key containing a SamplerConfig (or subclass) object. Developers following this guide would produce kwargs that are silently ignored by SamplerInterface.__init__ (which accepts **__ for unknown kwargs), resulting in default config values being used instead of the intended ones.

The actual ContainerAdapter.build_sampler_kwargs correctly uses:

"config": StorageSamplerConfig(sample_data_count=source_config.sampleDataCount)

But the docs show the old pattern that no longer works.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

🔴 Playwright Results — 1 failure(s), 13 flaky

✅ 4066 passed · ❌ 1 failed · 🟡 13 flaky · ⏭️ 92 skipped

Shard Passed Failed Flaky Skipped
✅ Shard 1 299 0 0 4
🔴 Shard 2 754 1 7 14
🟡 Shard 3 779 0 2 7
🟡 Shard 4 789 0 1 18
✅ Shard 5 709 0 0 41
🟡 Shard 6 736 0 3 8

Genuine Failures (failed on all attempts)

Features/Glossary/GlossaryWorkflow.spec.ts › should start term as Draft when glossary has reviewers (shard 2)
Error: �[2mexpect(�[22m�[31mlocator�[39m�[2m).�[22mtoHaveText�[2m(�[22m�[32mexpected�[39m�[2m)�[22m failed

Locator:  locator('[data-row-key*="DraftTerm1778857619684"]').locator('.status-badge')
Expected: �[32m"Draft"�[39m
Received: �[31m"In Review"�[39m
Timeout:  15000ms

Call log:
�[2m  - Expect "toHaveText" with timeout 15000ms�[22m
�[2m  - waiting for locator('[data-row-key*="DraftTerm1778857619684"]').locator('.status-badge')�[22m
�[2m    19 × locator resolved to <div class="status-badge inReview" data-testid=""PW%'a7bc00ca.Calmb2ffcef1".DraftTerm1778857619684-status">…</div>�[22m
�[2m       - unexpected value "In Review"�[22m

🟡 13 flaky test(s) (passed on retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/DataQuality/BundleSuiteBulkOperations.spec.ts › Add test case to existing Bundle Suite (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 1 retry)
  • Features/KnowledgeCenterList.spec.ts › Knowledge Center List - Test infinite scroll/pagination (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Table pagination with sorting should works (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Should display custom properties for apiCollection in right panel (shard 4, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Tier Add, Update and Remove (shard 6, 1 retry)
  • Pages/UserDetails.spec.ts › Admin user can edit teams from the user profile (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant