Draft: feat(format): schema evolution for the Java row codec #3714
base: main
Changes from all commits
5eb2433
b221d0e
9bef671
cee2fe2
f70fc1f
5f877af
82fc5d1
82acf94
685a46a
c462410
f59e96d
c57abc0
eabbce4
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -113,6 +113,78 @@ Row format is ideal for: | |
| - **Data pipelines**: Processing data without full object reconstruction | ||
| - **Cross-language data sharing**: When data needs to be accessed from multiple languages | ||
|
|
||
| ## Schema evolution | ||
|
|
||
| Enable `.withSchemaEvolution()` on a row, array, or map codec builder to read payloads written | ||
| by older versions of the same bean. Writing always uses the current version; reading detects | ||
| the payload's version from a strict hash at the head of the payload. Java only. | ||
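
As a rough illustration of "reading detects the payload's version from a strict hash at the head of the payload", here is a minimal sketch that uses only the JDK. The class name `VersionDispatch`, the hash-to-version map, and the big-endian byte order are assumptions made for the example; they are not taken from Fory's implementation.

```java
import java.nio.ByteBuffer;
import java.util.Map;

// Illustrative only: these names are hypothetical and are not Fory's API.
final class VersionDispatch {
  /** Reads the 8-byte hash at the head of the payload and maps it to a schema version. */
  static int detectVersion(byte[] payload, Map<Long, Integer> versionByHash) {
    if (payload.length < Long.BYTES) {
      throw new IllegalArgumentException("payload shorter than the 8-byte hash slot");
    }
    // ByteBuffer defaults to big-endian; the real byte order is an assumption here.
    long hash = ByteBuffer.wrap(payload).getLong();
    Integer version = versionByHash.get(hash);
    if (version == null) {
      // No known version of the bean produces this hash, so refuse to decode.
      throw new IllegalStateException("unknown schema hash: " + Long.toHexString(hash));
    }
    return version;
  }
}
```

A reader built this way would precompute one hash per historical version of the bean and pick the matching projection before decoding any field.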
|
|
||
| Annotate fields added after v1 with `@ForyVersion(since = N)`: | ||
|
|
||
| ```java | ||
| @Data | ||
| public class Person { | ||
| private String name; | ||
| private int age; | ||
|
|
||
| @ForyVersion(since = 2) | ||
| private String email; | ||
| } | ||
| ``` | ||
|
|
||
| A v1 payload (with `name` and `age` only) decodes to a `Person` whose `email` is `null`. | ||
| Primitive fields added later default to `0`, `0.0`, or `false`. If a class adopts versioning | ||
| after its v1 is already in the wild, set `@ForySchema(baseVersion = N)` so unannotated fields | ||
|
Contributor
Author
this is nonsense?
||
| are treated as present since version `N`. | ||
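
The `since`/`until` window and the `baseVersion` fallback can be modeled in a few lines. This is a sketch under stated assumptions: `FieldWindow` and its methods are hypothetical names, the window is inclusive on `since` and exclusive on `until` as the `ForyVersion` Javadoc in this PR describes, and field order is simply list order rather than whatever order the codec uses.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the version window; not Fory's implementation.
final class FieldWindow {
  final String name;
  final int since; // inclusive
  final int until; // exclusive

  FieldWindow(String name, int since, int until) {
    this.name = name;
    this.since = since;
    this.until = until;
  }

  /** A field with no annotation: treated as present from the class's baseVersion onward. */
  static FieldWindow unannotated(String name, int baseVersion) {
    return new FieldWindow(name, baseVersion, Integer.MAX_VALUE);
  }

  boolean presentIn(int version) {
    return since <= version && version < until;
  }

  /** Names of the fields that a payload of the given version contains, in list order. */
  static List<String> fieldsAt(List<FieldWindow> fields, int version) {
    List<String> names = new ArrayList<>();
    for (FieldWindow f : fields) {
      if (f.presentIn(version)) {
        names.add(f.name);
      }
    }
    return names;
  }
}
```

For the `Person` example, with `name` and `age` unannotated at base version 1 and `email` at `since = 2`, `fieldsAt` yields `name` and `age` for version 1 and all three fields for version 2. Fields absent from the payload's version are the ones that fall back to `null`, `0`, `0.0`, or `false` on read.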
|
|
||
| Remove a field by deleting the Java member and declaring it on a nested history interface as a | ||
| method annotated with `@ForyVersion(until = N)`. The method's return type carries any parameterized | ||
| type information from the original field. | ||
|
|
||
| ```java | ||
| @Data | ||
| @ForySchema(removedFields = Person.History.class) | ||
| public class Person { | ||
| private String name; | ||
|
Contributor
Author
remove private
||
|
|
||
| @ForyVersion(since = 2) | ||
| private String email; | ||
|
|
||
| interface History { | ||
| @ForyVersion(until = 3) | ||
| int age(); | ||
|
|
||
| @ForyVersion(until = 5) | ||
| List<String> tags(); | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| The history method name matches the original live descriptor name: the field name for Lombok | ||
| `@Data` or records (`age`, `tags`), or the full accessor name for JavaBeans-style classes and | ||
| interfaces (`getAge`). | ||
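
To show why a never-instantiated interface is enough to describe removed fields, the sketch below reads method names, generic return types, and annotation values through plain reflection. `Until` is a local stand-in for `@ForyVersion(until = N)` so the example compiles without Fory on the classpath, and `HistoryScan` is a hypothetical name; how the real codec walks the history class is not shown in this PR excerpt.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class HistoryScan {
  // Local stand-in for @ForyVersion(until = N); an assumption for this sketch.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Until {
    int value();
  }

  // Same shape as the History interface in the example above.
  interface History {
    @Until(3)
    int age();

    @Until(5)
    List<String> tags();
  }

  /** Maps each removed field name to "genericType until N", sorted by name. */
  static Map<String, String> removedFields(Class<?> history) {
    Map<String, String> out = new TreeMap<>();
    for (Method m : history.getDeclaredMethods()) {
      Until until = m.getAnnotation(Until.class);
      if (until != null) {
        // getGenericReturnType keeps type arguments such as List<String>.
        out.put(m.getName(), m.getGenericReturnType().getTypeName() + " until " + until.value());
      }
    }
    return out;
  }
}
```

Calling `HistoryScan.removedFields(HistoryScan.History.class)` returns one entry per annotated method, keyed by the method name, which is why that name has to match the original descriptor name.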
|
|
||
| ### Wire format and limitations | ||
|
|
||
| Producers and consumers must agree on the `withSchemaEvolution()` flag; otherwise they are not | ||
| wire-compatible. Row payloads always carry an 8-byte hash slot; under evolution its | ||
|
Contributor
Author
what about the upgrade path?
||
| value is the strict hash (which includes field name and nullability), so a flag-mismatched | ||
| peer fails loudly with `ClassNotCompatibleException`. Arrays and maps of bean elements prepend | ||
| an 8-byte strict-hash prefix under evolution and no prefix otherwise; an evolution-on consumer | ||
| reading evolution-off bytes also fails with `ClassNotCompatibleException`, but the reverse | ||
| direction (evolution-off consumer, evolution-on bytes) is undefined. | ||
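
The point that the strict hash "includes field name and nullability" can be made concrete with a toy hash. The sketch below uses 64-bit FNV-1a over descriptors of the form `name:type:nullable`; both the hash function and the descriptor format are assumptions chosen for illustration and are not the hash Fory computes. `IllegalStateException` stands in for `ClassNotCompatibleException`.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical hash for illustration: FNV-1a 64-bit, not the hash Fory actually uses.
final class StrictHashSketch {
  /** Hashes field descriptors of the form "name:type:nullable" in list order. */
  static long strictHash(List<String> descriptors) {
    long h = 0xcbf29ce484222325L; // FNV-1a 64-bit offset basis
    for (String d : descriptors) {
      for (byte b : (d + ";").getBytes(StandardCharsets.UTF_8)) {
        h ^= (b & 0xff);
        h *= 0x100000001b3L; // FNV-1a 64-bit prime
      }
    }
    return h;
  }

  /** Stand-in for the reader's check; the docs above name ClassNotCompatibleException here. */
  static void checkHash(long expected, long actual) {
    if (expected != actual) {
      throw new IllegalStateException(
          "schema hash mismatch: expected " + Long.toHexString(expected)
              + ", got " + Long.toHexString(actual));
    }
  }
}
```

Because the name and the nullability flag are part of the hashed bytes, renaming `age` or flipping its nullability produces a different 8-byte value, and the comparison fails before any field is decoded.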
|
|
||
| Cross-language consumers (Python, C++) cannot read evolution-enabled payloads. | ||
|
|
||
| Map keys do not carry a per-payload hash; a versioned bean used as a map key is read with the | ||
| current schema only, not dispatched to a projection codec. | ||
|
|
||
| When a versioned bean contains other versioned beans, the reader generates one projection codec | ||
| class per combination of versions across the composition. The count grows as the product of the | ||
| per-bean version counts. If that becomes a concern, drop entries from each bean's `History` | ||
| interface once you no longer need to read payloads from that range. Retiring a history entry is | ||
| purely a read-side decision; the writer always uses the current schema. | ||
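
A quick way to estimate the growth described above is to multiply the per-bean version counts. The helper below is a hypothetical back-of-the-envelope calculator, not part of Fory; it gives the worst-case number of combinations and says nothing about whether the reader generates them eagerly or on demand.

```java
// Hypothetical estimator for the number of version combinations; not a Fory API.
final class ProjectionCount {
  /** Product of per-bean version counts; throws ArithmeticException on overflow. */
  static long combinations(int... versionsPerBean) {
    long total = 1;
    for (int v : versionsPerBean) {
      total = Math.multiplyExact(total, (long) v);
    }
    return total;
  }
}
```

For example, an outer bean with 4 readable versions that nests beans with 3 and 5 readable versions gives `combinations(4, 3, 5)`, that is 4 × 3 × 5 = 60. Retiring one history range from the 5-version bean brings that to 48.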
|
|
||
| ## Cross-Language Compatibility | ||
|
|
||
| Row format works seamlessly across languages. The same binary data can be accessed from: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.fory.format.annotation; | ||
|
|
||
| import java.lang.annotation.ElementType; | ||
| import java.lang.annotation.Retention; | ||
| import java.lang.annotation.RetentionPolicy; | ||
| import java.lang.annotation.Target; | ||
|
|
||
| /** | ||
| * Class-level row-codec schema metadata used when the codec builder enables schema evolution. | ||
| * | ||
| * <p>{@link #baseVersion()} sets the implicit {@code since} for live fields that lack a | ||
| * {@link ForyVersion} annotation; this lets a class adopt versioning mid-history without having | ||
| * to annotate every existing member. | ||
| * | ||
| * <p>{@link #removedFields()} points at a class (conventionally a nested {@code interface}) whose | ||
| * accessor methods describe fields that have been removed from this bean but still appear on the | ||
| * wire in older payloads. Each method's return type is the original Java type of the removed | ||
| * field; each method must carry a {@link ForyVersion} annotation with {@code until} set, since | ||
| * removed fields have a known end-of-life version. | ||
| * | ||
| * <p>Example: | ||
| * | ||
| * <pre>{@code | ||
| * @Data | ||
| * @ForySchema(removedFields = MyBean.History.class) | ||
| * public class MyBean { | ||
| * private String name; | ||
| * | ||
| * interface History { | ||
| * @ForyVersion(until = 3) | ||
| * List<String> tags(); | ||
| * | ||
| * @ForyVersion(since = 2, until = 5) | ||
| * Map<String, Long> counters(); | ||
| * } | ||
| * } | ||
| * }</pre> | ||
| */ | ||
| @Retention(RetentionPolicy.RUNTIME) | ||
| @Target(ElementType.TYPE) | ||
| public @interface ForySchema { | ||
| int baseVersion() default 1; | ||
|
|
||
| /** | ||
| * A class whose accessor methods describe historically-present-but-now-removed fields. Default | ||
| * {@code void.class} means there are no removed fields. The class is never instantiated; the | ||
| * codec reads its method signatures and annotations. | ||
| */ | ||
| Class<?> removedFields() default void.class; | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.fory.format.annotation; | ||
|
|
||
| import java.lang.annotation.ElementType; | ||
| import java.lang.annotation.Retention; | ||
| import java.lang.annotation.RetentionPolicy; | ||
| import java.lang.annotation.Target; | ||
|
|
||
| /** | ||
| * Declares the version window in which a row-codec field is logically present. The window is | ||
| * inclusive on the left and exclusive on the right, so {@code since=2, until=5} means versions 2, | ||
| * 3, and 4. | ||
| * | ||
| * <p>Only effective when the codec builder is configured with | ||
| * {@code withSchemaEvolution()}; otherwise the annotation is ignored and the field is treated as | ||
| * always present. | ||
| */ | ||
| @Retention(RetentionPolicy.RUNTIME) | ||
| @Target({ElementType.FIELD, ElementType.METHOD, ElementType.RECORD_COMPONENT}) | ||
| public @interface ForyVersion { | ||
| /** First version (inclusive) that contains this field. Defaults to the class base version. */ | ||
| int since() default 1; | ||
|
|
||
| /** Exclusive upper bound: the first version that no longer contains this field. */ | ||
| int until() default Integer.MAX_VALUE; | ||
| } |
This section should probably move down