sink: add before field for avro protocol #5154
Walkthrough: This PR adds optional before-value inclusion to TiCDC's Avro codec.
Code Review
This pull request introduces the AvroIncludeBeforeValue configuration option, allowing Avro-encoded update and delete events to include their pre-row ('before') values under the _ticdc_before field. The changes span configuration definitions, the Avro encoder/decoder implementations, and integration tests. Feedback on the changes highlights two key issues in the decoder: first, a suggestion to dynamically check the _tidb_op field in the decoded map to determine delete events rather than relying on the decoder's static configuration; second, a potential out-of-bounds panic when splitting the schema namespace if it does not contain a dot.
if d.config.AvroIncludeBeforeValue {
	isDelete = valueMap[tidbOp] == deleteOperation
}
Instead of relying on the decoder's local configuration d.config.AvroIncludeBeforeValue to determine if the event is a delete, it is much more robust to dynamically check for the presence of the _tidb_op field in the decoded valueMap. This prevents decoding failures or mismatches if the decoder's configuration does not perfectly align with the producer's configuration.
Suggested change:
- if d.config.AvroIncludeBeforeValue {
- 	isDelete = valueMap[tidbOp] == deleteOperation
- }
+ if op, ok := valueMap[tidbOp].(string); ok {
+ 	isDelete = op == deleteOperation
+ }
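The suggestion above can be tried in isolation with a small sketch. It assumes the decoded Avro value is a `map[string]interface{}`; the helper name `isDeleteEvent` is hypothetical, and the literal `"d"` for `deleteOperation` is a placeholder, since the actual constant is not shown in this conversation.

```go
package main

import "fmt"

// Stand-ins for the decoder's constants. "_tidb_op" is the field name
// mentioned in the review; the value "d" is an assumed placeholder.
const (
	tidbOp          = "_tidb_op"
	deleteOperation = "d"
)

// isDeleteEvent (hypothetical helper) reports whether a decoded value map
// describes a delete. The comma-ok type assertion yields false when the
// field is missing or is not a string, so the result depends only on the
// message itself, not on any decoder configuration.
func isDeleteEvent(valueMap map[string]interface{}) bool {
	op, ok := valueMap[tidbOp].(string)
	return ok && op == deleteOperation
}

func main() {
	fmt.Println(isDeleteEvent(map[string]interface{}{tidbOp: "d"})) // true
	fmt.Println(isDeleteEvent(map[string]interface{}{tidbOp: "u"})) // false
	fmt.Println(isDeleteEvent(map[string]interface{}{"id": 1}))     // false
}
```

A message produced without the `_tidb_op` extension field simply decodes as a non-delete here, which is the behavior the review comment is asking for.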
namespace := schema["namespace"].(string)
schemaName := strings.Split(namespace, ".")[1]
If the namespace string does not contain a dot (e.g., if the schema or keyspace is empty), strings.Split(namespace, ".")[1] will panic with an out-of-bounds index. It is safer to check the length of the split parts before accessing the index.
namespace := schema["namespace"].(string)
parts := strings.Split(namespace, ".")
var schemaName string
if len(parts) > 1 {
	schemaName = parts[1]
} else {
	schemaName = namespace
}
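As a self-contained illustration of the length check above, the following sketch wraps it in a helper. The function name `schemaNameFromNamespace` and the sample namespace strings are hypothetical. It also guards the `schema["namespace"].(string)` assertion, because a single-value type assertion panics in Go when the key is missing or holds a non-string; whether the PR needs that extra guard is an assumption, not something the review states.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// schemaNameFromNamespace (hypothetical helper) extracts the second
// dot-separated component of the Avro schema namespace. If the namespace
// has no dot, it falls back to the whole namespace, mirroring the
// reviewer's suggestion.
func schemaNameFromNamespace(schema map[string]interface{}) (string, error) {
	// Comma-ok assertion: avoids a panic on a missing or non-string value.
	namespace, ok := schema["namespace"].(string)
	if !ok {
		return "", errors.New("avro schema has no string namespace")
	}
	parts := strings.Split(namespace, ".")
	if len(parts) > 1 {
		return parts[1], nil
	}
	return namespace, nil
}

func main() {
	// Sample namespaces are made up for illustration.
	for _, ns := range []string{"default.test", "test"} {
		name, err := schemaNameFromNamespace(map[string]interface{}{"namespace": ns})
		fmt.Println(name, err)
	}
	_, err := schemaNameFromNamespace(map[string]interface{}{})
	fmt.Println(err)
}
```

Returning an error for a malformed schema lets the decoder surface the problem to the caller instead of crashing the process.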
What problem does this PR solve?
Issue Number: close #5153
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note