[flink] add support for flink 2.3 #3521
Open
sd4324530 wants to merge 3 commits into
Signed-off-by: Pei Yu <125331682@qq.com>
…Case
Flink 2.3 introduces ExecutionConfigOptions.TABLE_EXEC_SINK_REQUIRE_ON_CONFLICT
(table.exec.sink.require-on-conflict), defaulting to true. In
FlinkChangelogModeInferenceProgram, this triggers a ValidationException
("upsert key differs from primary key") before the
StreamPhysicalDeltaJoinForceValidator runs, so the Delta Join ITCases can no
longer reach the original "doesn't support to do delta join optimization"
error path.
Disable the option in Flink23DeltaJoinITCase#beforeEach so the existing
assertions remain valid. Production-side impact (real Fluss users hitting
this on multi-table joins / group-by + insert) is left to community
discussion.
Signed-off-by: Pei Yu <125331682@qq.com>
…Case
Flink 2.3 introduces ExecutionConfigOptions.TABLE_EXEC_SINK_REQUIRE_ON_CONFLICT
(table.exec.sink.require-on-conflict), defaulting to true. In
FlinkChangelogModeInferenceProgram, this triggers a ValidationException
("upsert key differs from primary key") before the partial-update handling
in FlinkTableSink#getSinkRuntimeProvider runs, so the partial upsert ITCases
(testPartialUpsert and testPartialUpsertDuringAddColumn) in
FlinkTableSinkITCase can no longer reach the Fluss sink layer.
Disable the option in Flink23TableSinkITCase#beforeEach so the existing
partial-upsert assertions remain valid.
Signed-off-by: Pei Yu <125331682@qq.com>
Purpose
Linked issue: close #3520
This PR adds Flink 2.3 engine support to Apache Fluss by introducing a new fluss-flink-2.3 module. To accommodate the API changes in CatalogMaterializedTable / IntervalFreshness introduced in Flink 2.3, corresponding version adapters are introduced in the common module so that cross-version connector code can compile against multiple Flink versions.
Brief change log
1. New fluss-flink-2.3 module
- fluss-flink-2.3 is added as a sub-module in fluss-flink/pom.xml to provide connector support for Flink 2.3-based deployments.
- The module is modeled on fluss-flink-2.2 and ships with complete source and test directories. Version-specific adapters (e.g., MultipleParameterToolAdapter, SchemaAdapter, SinkAdapter, TypeInformationAdapter) follow the existing pattern and are provided alongside the module.
2. New version adapters in the common module
To handle the API changes of CatalogMaterializedTable / IntervalFreshness in Flink 2.3, the following adapters are introduced under fluss-flink-common/src/main/java/org/apache/fluss/flink/adapter/:
- CatalogMaterializedTableAdapter: wraps CatalogMaterializedTable.Builder to abstract away differences introduced in Flink 2.3 (such as the new originalQuery / expandedQuery fields), so the shared common code does not depend on a specific Flink version.
- IntervalFreshnessAdapter: a new adapter for IntervalFreshness and its inner TimeUnit enum. In Flink 2.3 the IntervalFreshness.TimeUnit type has been reworked (moved/repackaged), so this adapter provides a unified way to parse and serialize time units, converting between String names and the version-specific TimeUnit enum (via the TimeUnitAdapter wrapper). The common module no longer needs to depend on a Flink-version-specific IntervalFreshness.TimeUnit class, and the existing MATERIALIZED_TABLE_INTERVAL_FRESHNESS_TIME_UNIT config can keep its stringType() form unchanged.
3. Configuration and serialization compatibility
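The String-to-enum conversion attributed to IntervalFreshnessAdapter could look roughly like the sketch below. This is not the adapter's actual code: the class name TimeUnitNameSketch, the enum VersionSpecificTimeUnit and its constants, and the case-insensitive parsing rule are all assumptions made for illustration. It only shows why a string-typed option removes the compile-time dependency on a version-specific enum.

```java
import java.util.Locale;

/** Illustrative sketch only; NOT the actual Fluss IntervalFreshnessAdapter. */
final class TimeUnitNameSketch {

    /** Hypothetical stand-in for a Flink-version-specific TimeUnit enum. */
    enum VersionSpecificTimeUnit { SECOND, MINUTE, HOUR, DAY }

    /**
     * Parses a string-typed config value into the enum.
     * Tolerating case and surrounding whitespace is an assumption of this sketch.
     */
    static VersionSpecificTimeUnit parse(String name) {
        if (name == null) {
            throw new IllegalArgumentException("time unit must not be null");
        }
        try {
            return VersionSpecificTimeUnit.valueOf(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown time unit: '" + name + "'", e);
        }
    }

    /** Serializes the enum back to the plain name stored in the option. */
    static String serialize(VersionSpecificTimeUnit unit) {
        return unit.name();
    }

    public static void main(String[] args) {
        System.out.println(parse(" minute "));
        System.out.println(serialize(VersionSpecificTimeUnit.DAY));
    }
}
```

Because shared code only ever sees the String form, each version module can map it onto whichever enum class its Flink version provides.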
- FlinkConnectorOptions.MATERIALIZED_TABLE_INTERVAL_FRESHNESS_TIME_UNIT is changed from enumType(IntervalFreshness.TimeUnit.class) to stringType(), avoiding a hard dependency on a Flink-version-specific enum class in the common module; concrete enum parsing is delegated to IntervalFreshnessAdapter.
- FlinkConversions is updated to use the new adapters for materialized-table serialization/deserialization, ensuring consistent semantics across versions.
4. Back-port to fluss-flink-2.2
CatalogMaterializedTableAdapter is also added to fluss-flink-2.2, and Flink22CatalogTest is updated accordingly, so that the 2.2 module continues to share the same common code after the new adapter is introduced.
5. Test refactor: template-method pattern for version-specific constructors
The old ResolvedCatalogMaterializedTableAdapter.create() helper masked the Flink-version-specific ResolvedCatalogMaterializedTable constructor signature with a 2-arg fake. To support Flink 2.3's new 5-arg constructor (which adds StartMode), the catalog test hierarchy is refactored to a template-method pattern:
- FlinkCatalogTest (in fluss-flink-common) now exposes a protected createResolvedCatalogMaterializedTable(...) method with a default 2-arg-constructor implementation.
- Flink22CatalogTest overrides it to use the 4-arg constructor (origin, resolvedSchema, refreshMode, intervalFreshness).
- Flink23CatalogTest overrides it to use the 5-arg constructor (additionally passing StartMode.of(StartMode.StartModeKind.FROM_BEGINNING)).
This lets each Flink version exercise its native constructor signature, removes the need for the static helper, and makes the test code self-documenting about which Flink version it targets. Note that the parent FlinkCatalogTest no longer imports ResolvedCatalogMaterializedTableAdapter.
6. Test coverage
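As background for the catalog tests listed in this section, the template-method refactor from item 5 can be illustrated with a dependency-free sketch. Every name here (TemplateMethodSketch, FakeTable, BaseCatalogTest, V22CatalogTest, V23CatalogTest, createTable) is hypothetical and stands in for the real test classes. The string arguments only mimic the 2-, 4- and 5-argument constructor shapes described above.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; the real tests construct Flink catalog objects. */
final class TemplateMethodSketch {

    /** Hypothetical stand-in that records which constructor arguments were passed. */
    static final class FakeTable {
        final List<String> constructorArgs;

        FakeTable(String... args) {
            this.constructorArgs = Arrays.asList(args);
        }
    }

    /** Shared base test: owns the test flow and delegates construction to a hook. */
    static class BaseCatalogTest {
        /** Hook with a default 2-argument implementation; subclasses may override it. */
        protected FakeTable createTable(String origin, String schema) {
            return new FakeTable(origin, schema);
        }

        /** Shared logic that never needs to know the version-specific signature. */
        final int constructorArity() {
            return createTable("origin", "resolvedSchema").constructorArgs.size();
        }
    }

    /** Version module that uses a 4-argument shape. */
    static class V22CatalogTest extends BaseCatalogTest {
        @Override
        protected FakeTable createTable(String origin, String schema) {
            return new FakeTable(origin, schema, "refreshMode", "intervalFreshness");
        }
    }

    /** Version module that adds a fifth argument (a start mode in the text above). */
    static class V23CatalogTest extends BaseCatalogTest {
        @Override
        protected FakeTable createTable(String origin, String schema) {
            return new FakeTable(origin, schema, "refreshMode", "intervalFreshness", "startMode");
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseCatalogTest().constructorArity());
        System.out.println(new V22CatalogTest().constructorArity());
        System.out.println(new V23CatalogTest().constructorArity());
    }
}
```

The shared test body calls only the hook, so adding another Flink version means adding one override and leaving the base class alone.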
- An ITCase suite (Flink23*ITCase) is added under the fluss-flink-2.3 module, covering catalog, metrics, procedure, authorization, sink, source (including binlog/changelog virtual tables, delta join, failover), and tiering.
- Flink23MultipleParameterToolTest is added to validate MultipleParameterToolAdapter behavior.
- FlinkCatalogTest and Flink22CatalogTest are updated for the adapter-related cases.
7. Test compatibility fix: Flink 2.3 ON CONFLICT validation vs. Delta Join tests
While porting the Delta Join ITCases to Flink 2.3, an unexpected upstream planner behavior change was discovered. This PR works around it in the test code only; the broader impact on Fluss end-users upgrading to Flink 2.3 is left for community discussion.
What changed in Flink 2.3. Flink 2.3 introduces a new planner option ExecutionConfigOptions.TABLE_EXEC_SINK_REQUIRE_ON_CONFLICT (table.exec.sink.require-on-conflict, default true). Inside FlinkChangelogModeInferenceProgram.SatisfyUpdateKindTraitVisitor.analyzeUpsertMaterializeStrategy, Flink now throws ValidationException("The query has an upsert key that differs from the primary key of the sink table ...") whenever:
- TABLE_EXEC_SINK_UPSERT_MATERIALIZE = AUTO (default), and
- no ON CONFLICT clause is supplied.
In Flink 2.2 this validation did not exist.
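The condition described above can be modeled with a small, dependency-free sketch. It is a toy model and not Flink's planner code: the class and method names are hypothetical, the inputs are simplified to booleans and column-name lists, and combining the key mismatch (taken from the exception message) with the other conditions in a single check is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

/** Toy model of the described check; NOT Flink's actual implementation. */
final class RequireOnConflictSketch {

    /**
     * Throws when every condition of the (assumed) rule holds: the option is
     * enabled, upsert materialization is AUTO, the upsert key differs from the
     * sink primary key, and the statement carries no ON CONFLICT clause.
     */
    static void validateSink(
            boolean requireOnConflict,
            boolean upsertMaterializeAuto,
            List<String> upsertKey,
            List<String> sinkPrimaryKey,
            boolean hasOnConflictClause) {
        // Compare keys as sets so that column order does not matter.
        boolean keysDiffer =
                !new HashSet<String>(upsertKey).equals(new HashSet<String>(sinkPrimaryKey));
        if (requireOnConflict && upsertMaterializeAuto && keysDiffer && !hasOnConflictClause) {
            throw new IllegalStateException(
                    "upsert key " + upsertKey + " differs from sink primary key " + sinkPrimaryKey);
        }
    }

    /** Convenience wrapper reporting whether validation rejects the query. */
    static boolean rejects(
            boolean requireOnConflict,
            List<String> upsertKey,
            List<String> sinkPrimaryKey,
            boolean hasOnConflictClause) {
        try {
            validateSink(requireOnConflict, true, upsertKey, sinkPrimaryKey, hasOnConflictClause);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> upsertKey = Arrays.asList("c1", "d1", "e1");
        List<String> primaryKey = Arrays.asList("c1", "d1");
        System.out.println(rejects(true, upsertKey, primaryKey, false));  // option on
        System.out.println(rejects(false, upsertKey, primaryKey, false)); // option off
    }
}
```

In this model, switching the option off is the only change needed for the mismatched-key query to pass the check, which is what lets any later validation stage run.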
Why this affects Fluss's Delta Join tests. Delta Join semantics define the upsert key from the join condition, not from the sink's primary key. In several Flink23DeltaJoinITCase scenarios (e.g., testDeltaJoinWithJoinKeyExceedsPrimaryKey with join condition c1=c2 AND d1=d2 AND e1=e2 into a sink whose PK is (c1, d1)), the upsert key (c1, d1, e1) legitimately exceeds the sink PK. Under Flink 2.2 these tests asserted that StreamPhysicalDeltaJoinForceValidator throws "The current sql doesn't support to do delta join optimization". Under Flink 2.3 the new ON CONFLICT validator fires before the Delta Join validator is reached, so the original error message is never produced and the assertions fail.
Why this PR only touches the tests. The new Flink validation is the correct behavior in the general case: silently allowing mismatched upsert keys leads to non-deterministic results at the sink, which is exactly what Flink 2.3 is trying to prevent. Disabling it in the connector would silently regress that protection for all Fluss users. Whether/how Fluss should expose this knob (e.g., as a Fluss-level connector option, or by injecting a ConflictStrategy from FlinkTableSink when the underlying Fluss table has a last_row merge engine) is a product-level decision that should be discussed with the community and is out of scope for this PR.
Disabling the option in Flink23DeltaJoinITCase#beforeEach is the minimal, scoped workaround.
Flink22DeltaJoinITCase is untouched (the option did not exist in 2.2).
8. Test compatibility fix: Flink 2.3 ON CONFLICT validation vs. Table Sink partial-upsert tests
A second, distinct impact of the same Flink 2.3 validation was uncovered while running Flink23TableSinkITCase on CI: the Fluss partial-upsert test path is also blocked. Same planner option, same root cause, but a different surface, addressed with the same minimal-scoped pattern.
The fix. Flink23TableSinkITCase was previously an empty subclass of FlinkTableSinkITCase. This PR turns it into a proper subclass with a single @BeforeEach that disables TABLE_EXEC_SINK_REQUIRE_ON_CONFLICT on the streaming TableConfig, mirroring the precedent set in Section 7.
Why this PR only touches the tests. Same reasoning as Section 7: Flink 2.3's validation is the correct general-case behavior, and silently turning it off at the connector level would regress the protection for real users. The proper follow-up is for Fluss to participate in the new model, most naturally by having FlinkTableSink.applyOperations(...) (or getSinkRuntimeProvider) inject a ConflictStrategy (e.g., DEDUPLICATE, semantically equivalent to Fluss's current first-write-wins plus target-column overwrite) when the sink is a Fluss PK table, so partial upserts remain expressible from SQL. That is a product/API change and is explicitly out of scope for this engine-port PR.
Flink22TableSinkITCase is untouched (the option did not exist in 2.2).
Tests
- Flink23MultipleParameterToolTest
- Flink23CatalogTest
- Flink23TieringCommitOperatorTest
- FlinkCatalogTest, Flink22CatalogTest (adapter-related cases updated; Flink22/23CatalogTest now exercise their native ResolvedCatalogMaterializedTable constructors via the new template-method override)
- ITCases (fluss-flink-2.3 module):
  - Flink23CatalogITCase, Flink23MaterializedTableITCase
  - Flink23MetricsITCase
  - Flink23ProcedureITCase
  - Flink23AuthorizationITCase
  - Flink23ComplexTypeITCase, Flink23TableSinkITCase (now works around Flink 2.3's new ON CONFLICT planner validation for partial upserts, see Section 8), Flink23UndoRecoveryITCase
  - Flink23BinlogVirtualTableITCase, Flink23ChangelogVirtualTableITCase, Flink23DeltaJoinITCase (now works around Flink 2.3's new ON CONFLICT planner validation, see Section 7), Flink23TableSourceBatchITCase, Flink23TableSourceFailOverITCase, Flink23TableSourceITCase
  - Flink23TieringITCase
API and Format
No breaking changes to the public API.
Documentation