Skip to content

feat(compiler): Add JavaScript/TypeScript IDL code generation#3394

Open
miantalha45 wants to merge 64 commits intoapache:mainfrom
miantalha45:generate-messageTypes-from-IDL
Open

feat(compiler): Add JavaScript/TypeScript IDL code generation#3394
miantalha45 wants to merge 64 commits intoapache:mainfrom
miantalha45:generate-messageTypes-from-IDL

Conversation

@miantalha45
Copy link
Copy Markdown
Contributor

@miantalha45 miantalha45 commented Feb 23, 2026

Summary

Implements TypeScript code generation for Fory IDL within the fory-compiler, converting FDL (Fory Definition Language) schema files into pure TypeScript type definitions. Zero runtime dependencies, with comprehensive test coverage (12/12 tests passing), supporting messages, enums, unions, and all primitive types.

Changes

Core Implementation

  • compiler/fory_compiler/generators/typescript.py - TypeScript code generator extending BaseGenerator (365 lines)

    • Generates type-safe TypeScript interfaces, enums, and discriminated unions
    • Supports nested types, collections, and optional fields
    • Proper type mapping for all 25 FDL primitive kinds
    • Field name conversion (snake_case → camelCase)
    • Registration helper function generation for Fory serialization integration
  • compiler/fory_compiler/generators/init.py - Registration of TypeScriptGenerator in the compiler ecosystem

  • compiler/fory_compiler/cli.py - Added --typescript_out CLI argument for TypeScript code generation

  • compiler/fory_compiler/tests/test_typescript_codegen.py - 12 golden codegen tests covering:

    • Enum and message generation
    • Nested types (messages and enums)
    • Discriminated unions
    • All primitive type mappings
    • Collection types (arrays, maps)
    • Field naming conventions
    • File structure and licensing
    • Zero runtime dependencies validation

Features

  • Message-to-Interface Generation: Auto-converts FDL messages to TypeScript interfaces
  • Enum Support: Generates TypeScript enums with proper value stripping and type IDs
  • Discriminated Unions: Creates union types with discriminator enums for type safety
  • Type Mappings: Full support for all FDL primitives (bool→boolean, int32→number, int64→bigint|number, float/double→number, etc.)
  • Nested Types: Supports nested messages and enums within parent types
  • Collection Types: Arrays (repeated fields) and maps with type-safe generics
  • Zero Runtime Dependencies: Pure TypeScript type definitions, no gRPC or external imports
  • Field Naming: Automatic conversion to camelCase per TypeScript conventions
  • Package/Module Handling: Uses last segment of package name for module name and registration functions
  • License Headers: All generated files include Apache 2.0 license headers
  • Registration Helpers: Generates registration functions for Fory serialization framework integration

AI Assistance Checklist

  • Substantial AI assistance was used in this PR (yes)
  • I included the standardized AI Usage Disclosure block below
  • I can explain and defend all important changes without AI help
  • I reviewed AI-assisted code changes line by line before submission
  • I ran adequate human verification and recorded evidence (local/CI checks, pass/fail summary, and review confirmation)
  • I added/updated tests and specs where required
  • I validated protocol/performance impacts with evidence when applicable
  • I verified licensing and provenance compliance

AI Usage Disclosure

  • substantial_ai_assistance: yes
  • scope: <design drafting | code drafting>
  • affected_files_or_subsystems: <high-level paths/modules>
  • human_verification: <checks run locally or in CI + pass/fail summary + contributor reviewed results>
  • performance_verification: <N/A>
  • provenance_license_confirmation: <Apache-2.0-compatible provenance confirmed; no incompatible third-party code introduced>

AI Review Evidence

General AI Review

Initial Review
• Identified issues in decimal type mapping, array type precedence, tag id drop, and traversalContainer doesn't traverse union cases
Fixes Applied
• Updated DECIMAL mapping to string to avoid precision loss
• fix array type precedence issue.
• tag id drop prevention for id == 0
• Aligned map type documentation with generated output
Final Review (Fresh Session)
• Re reviewed updated changes
• All issues resolved
• All tests pass. No actionable issues remain

Fory-Specific AI Review

Scope
• Reviewed serialization and deserialization logic
• Verified typeInfo handling, struct embedding, and reference tracking
• Checked alignment with Java reference implementation
• Validated cross language compatibility and code generation
Result
• No actionable issues found

• Implementation aligns with Fory wire format and runtime behavior

Fixes #3280

@chaokunyang
Copy link
Copy Markdown
Collaborator

This is wrong. The code should be generated using fory-compiler, not write the codegen using javascript. And it's not about GRPC codegen

@miantalha45
Copy link
Copy Markdown
Contributor Author

My bad. Let me fix it.

Copilot AI review requested due to automatic review settings February 25, 2026 11:24
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds TypeScript code generation support to the Fory IDL compiler, enabling conversion of FDL (Fory Definition Language) schema files into type-safe TypeScript definitions. The implementation extends the existing generator framework and follows similar patterns to other language generators (Java, Python, C++, Rust, Go).

Changes:

  • Added TypeScriptGenerator class that generates pure TypeScript interfaces, enums, and discriminated unions from FDL schemas
  • Integrated TypeScript generation into CLI with --typescript_out flag
  • Added comprehensive test suite with 12 tests covering messages, enums, unions, primitives, collections, and file structure

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
compiler/fory_compiler/generators/typescript.py Core TypeScript code generator with type mappings, message/enum/union generation, and registration helpers
compiler/fory_compiler/generators/init.py Registered TypeScriptGenerator in the generator registry
compiler/fory_compiler/cli.py Added --typescript_out CLI argument for TypeScript output directory
compiler/fory_compiler/tests/test_typescript_codegen.py Test suite covering enum/message/union generation, type mappings, nested types, and conventions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

assert "MOBILE = 0" in output
assert "HOME = 1" in output


Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test verifies that nested enums are exported at the top level, but it doesn't test whether the registration function correctly references them. Since the registration function uses qualified names like Person.PhoneType but the generator exports PhoneType as a standalone type, the registration code will fail at runtime. Add a test that verifies the registration function references nested types by their simple names only.

Suggested change
def test_typescript_nested_enum_registration_uses_simple_name():
"""Test that the registration function references nested enums by simple name."""
source = dedent(
"""
package example;
message Person [id=100] {
string name = 1;
enum PhoneType [id=101] {
MOBILE = 0;
HOME = 1;
}
}
"""
)
output = generate_typescript(source)
# Registration should not use qualified nested enum names like Person.PhoneType
assert "Person.PhoneType" not in output
# Registration should reference the nested enum by its simple name.
# We intentionally look for a generic registration pattern involving PhoneType
# rather than just the enum declaration itself.
assert "PhoneType" in output

Copilot uses AI. Check for mistakes.
Comment on lines +247 to +255
# Generate nested enums first
for nested_enum in message.nested_enums:
lines.extend(self.generate_enum(nested_enum, indent=indent))
lines.append("")

# Generate nested unions
for nested_union in message.nested_unions:
lines.extend(self.generate_union(nested_union, indent=indent))
lines.append("")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested enums are generated before their parent interface, which means they will appear in the output before the parent message. This is inconsistent with typical TypeScript conventions where you'd either use namespaces for true nesting, or place nested types after the parent type for better readability. Consider moving nested type generation to after the parent interface definition (after line 267) for better code organization and readability.

Copilot uses AI. Check for mistakes.
Comment on lines +64 to +70
PrimitiveKind.FLOAT16: "number",
PrimitiveKind.FLOAT32: "number",
PrimitiveKind.FLOAT64: "number",
PrimitiveKind.STRING: "string",
PrimitiveKind.BYTES: "Uint8Array",
PrimitiveKind.DATE: "Date",
PrimitiveKind.TIMESTAMP: "Date",
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PRIMITIVE_MAP is missing mappings for three primitive types that exist in PrimitiveKind: BFLOAT16, DURATION, and DECIMAL. All other generators (Java, Go, etc.) provide mappings for these types. For TypeScript, these should map to:

  • BFLOAT16: "number" (same as FLOAT16)
  • DURATION: likely a duration representation type or number for milliseconds
  • DECIMAL: "number" or a string representation for precision

Without these mappings, any FDL schema using these types will fall back to "any", which defeats the purpose of type-safe code generation.

Suggested change
PrimitiveKind.FLOAT16: "number",
PrimitiveKind.FLOAT32: "number",
PrimitiveKind.FLOAT64: "number",
PrimitiveKind.STRING: "string",
PrimitiveKind.BYTES: "Uint8Array",
PrimitiveKind.DATE: "Date",
PrimitiveKind.TIMESTAMP: "Date",
PrimitiveKind.FLOAT16: "number",
PrimitiveKind.BFLOAT16: "number",
PrimitiveKind.FLOAT32: "number",
PrimitiveKind.FLOAT64: "number",
PrimitiveKind.STRING: "string",
PrimitiveKind.BYTES: "Uint8Array",
PrimitiveKind.DATE: "Date",
PrimitiveKind.TIMESTAMP: "Date",
PrimitiveKind.DURATION: "number",
PrimitiveKind.DECIMAL: "number",

Copilot uses AI. Check for mistakes.
imported.append(item)
else:
local.append(item)
return local, imported # Return (local, imported) tuple
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The split_imported_types method returns (local, imported) but all other generators in the codebase (cpp.py, go.py, java.py, python.py, rust.py) return (imported, local) in that order. This inconsistency could lead to bugs where the wrong list is used. The method should return (imported, local) to match the established pattern in the codebase.

Suggested change
return local, imported # Return (local, imported) tuple
return imported, local # Return (imported, local) tuple

Copilot uses AI. Check for mistakes.
Comment on lines +282 to +284
assert "first_name:" in output or "firstName:" in output
assert "last_name:" in output or "lastName:" in output
assert "phone_number:" in output or "phoneNumber:" in output
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test assertions use 'or' to accept either snake_case or camelCase, which defeats the purpose of testing camelCase conversion. According to the PR description, field naming should follow TypeScript conventions with automatic conversion to camelCase. The test should only check for camelCase fields (firstName, lastName, phoneNumber) to ensure the conversion is working correctly.

Suggested change
assert "first_name:" in output or "firstName:" in output
assert "last_name:" in output or "lastName:" in output
assert "phone_number:" in output or "phoneNumber:" in output
assert "firstName:" in output
assert "lastName:" in output
assert "phoneNumber:" in output

Copilot uses AI. Check for mistakes.
Comment on lines +353 to +382
def _generate_message_registration(
self, message: Message, lines: List[str], parent: Optional[str] = None
):
"""Generate registration for a message and its nested types."""
qual_name = f"{parent}.{message.name}" if parent else message.name

# Register nested enums
for nested_enum in message.nested_enums:
if self.should_register_by_id(nested_enum):
type_id = nested_enum.type_id
lines.append(
f" fory.register({qual_name}.{nested_enum.name}, {type_id});"
)

# Register nested unions
for nested_union in message.nested_unions:
if self.should_register_by_id(nested_union):
type_id = nested_union.type_id
lines.append(
f" fory.registerUnion({qual_name}.{nested_union.name}, {type_id});"
)

# Register nested messages
for nested_msg in message.nested_messages:
self._generate_message_registration(nested_msg, lines, qual_name)

# Register the message itself
if self.should_register_by_id(message):
type_id = message.type_id
lines.append(f" fory.register({qual_name}, {type_id});")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registration function generates code that attempts to access nested types using qualified names like Parent.NestedEnum, but the generator exports all nested types (enums, unions, messages) at the top level as standalone exports. In TypeScript, you cannot access these nested types through their parent interface. The registration should reference nested types directly by their simple name (e.g., NestedEnum instead of Parent.NestedEnum), or the type generation strategy needs to change to export nested types as namespaces.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +328 to +333
# Should not reference gRPC
assert "@grpc" not in output
assert "grpc-js" not in output
assert "require('grpc" not in output
assert "import.*grpc" not in output

Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test intends to ensure no gRPC imports, but assert "import.*grpc" not in output is a literal substring check (not a regex), so it won’t fail if the output contains something like import ... grpc. Use re.search(r"import.*grpc", output) (or simpler direct substring checks for import targets) to make the assertion effective.

Copilot uses AI. Check for mistakes.
Comment on lines +140 to +149
else:
# Check if it matches any primitive kind directly
for primitive_kind, ts_type in self.PRIMITIVE_MAP.items():
if primitive_kind.value == primitive_name:
type_str = ts_type
break
if not type_str:
# If not a primitive, treat as a message/enum type
type_str = self.to_pascal_case(field_type.name)
elif isinstance(field_type, ListType):
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generate_type() doesn’t handle qualified type references like Parent.Child. The FDL parser allows dotted names, but to_pascal_case() will leave the dot in place, producing an invalid TypeScript identifier/reference. Consider resolving nested/qualified names (e.g., by flattening or mapping Parent.Child to the generated symbol name) consistently with how nested types are emitted.

Copilot uses AI. Check for mistakes.
# If not a primitive, treat as a message/enum type
type_str = self.to_pascal_case(field_type.name)
elif isinstance(field_type, ListType):
element_type = self.generate_type(field_type.element_type)
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ListType, element_optional isn’t reflected in the generated TypeScript type. Currently list<optional T> / repeated optional T will still emit T[] instead of something like (T | undefined)[], which is a type mismatch.

Suggested change
element_type = self.generate_type(field_type.element_type)
# Respect optionality of list elements, if available on the IR node.
element_nullable = getattr(field_type, "element_optional", False)
element_type = self.generate_type(field_type.element_type, nullable=element_nullable)
# If the element type is a union (e.g., "T | undefined"), wrap it so
# the array applies to the whole union: (T | undefined)[]
if " | " in element_type:
element_type = f"({element_type})"

Copilot uses AI. Check for mistakes.
elif isinstance(field_type, MapType):
key_type = self.generate_type(field_type.key_type)
value_type = self.generate_type(field_type.value_type)
type_str = f"Record<{key_type}, {value_type}>"
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MapType is emitted as Record<keyType, valueType>, but Record<K, V> requires K extends string | number | symbol. Some valid FDL schemas (e.g., map<int64, ...>) will map to Record<bigint | number, ...> which won’t type-check. Consider normalizing map keys to string (or emitting Map<K, V> / constraining supported key kinds).

Suggested change
type_str = f"Record<{key_type}, {value_type}>"
# Use Record only for key types allowed by TypeScript's Record<K, V>
if key_type in ("string", "number", "symbol"):
type_str = f"Record<{key_type}, {value_type}>"
else:
# Fallback to Map<K, V> for other key types (e.g., bigint)
type_str = f"Map<{key_type}, {value_type}>"

Copilot uses AI. Check for mistakes.
Comment on lines +330 to +350
# Register enums
for enum in self.schema.enums:
if self.is_imported_type(enum):
continue
if self.should_register_by_id(enum):
type_id = enum.type_id
lines.append(f" fory.register({enum.name}, {type_id});")

# Register messages
for message in self.schema.messages:
if self.is_imported_type(message):
continue
self._generate_message_registration(message, lines)

# Register unions
for union in self.schema.unions:
if self.is_imported_type(union):
continue
if self.should_register_by_id(union):
type_id = union.type_id
lines.append(f" fory.registerUnion({union.name}, {type_id});")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generate_registration() only registers types when should_register_by_id() is true; types without an explicit/generated type_id will be silently skipped, unlike other generators which fall back to namespace/type-name registration. If TS registration is intended to support non-ID schemas, add the named-registration path (or make the omission explicit).

Copilot uses AI. Check for mistakes.
@chaokunyang chaokunyang changed the title feat(compiler): Add JavaScript/TypeScript IDL code generation for gRPC feat(compiler): Add JavaScript/TypeScript IDL code generation Feb 25, 2026
@chaokunyang
Copy link
Copy Markdown
Collaborator

The implementation should be something like #3406

@miantalha45
Copy link
Copy Markdown
Contributor Author

sure @chaokunyang
Let me check it and make changes accordingly.

Generate TypeScript type definitions and interfaces from FDL IDL for usage with serialization. Features:

- Type-safe message interfaces, enums, and discriminated unions
- Compatible with both TypeScript and JavaScript
- Type ID registration helpers
- No runtime dependencies (gRPC-free type definitions)
- Proper package name handling and conversion to module names
- Support via --typescript_out CLI flag
@miantalha45 miantalha45 force-pushed the generate-messageTypes-from-IDL branch from c6f379c to 92cd483 Compare February 26, 2026 07:23
@miantalha45
Copy link
Copy Markdown
Contributor Author

@chaokunyang I have tried to make implementation according to #3406 . Please have a look on it now and tell me if anything require any change.

used_field_names: Set[str] = set()
for field in message.fields:
field_name = self._field_member_name(field, message, used_field_names)
field_type = self.generate_type(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This emits field?: T | undefined for optional fields, but the JS struct reader initializes every property to null and leaves absent nullable fields as null after deserialization. That means a round-tripped object can immediately violate the generated interface even when the runtime behaves as designed. The generated type should include null so it matches the values the JS runtime actually materializes on read.

nullable=False,
parent_stack=parent_stack,
)
if key_type in ("string", "number"):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For map fields keyed by string or number, the generated TypeScript type becomes Record<..., ...>, but the emitted registration helper still uses Type.map(...) and the JS runtime serializer only accepts real Map instances. MapAnySerializer.write() calls value.size and iterates value.entries(), so a value that matches the generated Record<string, V> type will fail at runtime with value.entries is not a function. The generated field type needs to match the runtime contract here and be emitted as Map<K, V>.

@miantalha45
Copy link
Copy Markdown
Contributor Author

@chaokunyang I have addressed all your comments.

@chaokunyang
Copy link
Copy Markdown
Collaborator

chaokunyang commented Apr 8, 2026

@miantalha45 Please use claude/codex/gemini AI to review your pr and address the comments first, and repeat the loop until ai don't give further comments. This pr still have lots of issues. Each time I review it, I just find new issues.

const cases = typeInfo.options?.cases;
if (cases) {
const caseEntries: string[] = [];
for (const [caseIdx, caseTypeInfo] of Object.entries(cases)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case table only keeps typeId and userTypeId, but the new union API also accepts named case types like Type.struct({ namespace, typeName }) and Type.union({ ... }). In that path write() later calls getSerializerById(NAMED_STRUCT | NAMED_UNION, -1), even though named serializers are registered under namespace$typeName, so serialization crashes as soon as a named case is encountered.


field_parent_stack = (parent_stack or []) + [type_def]
props_parts: List[str] = []
for field in type_def.fields:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface generator already de-duplicates JS member names with _field_member_name(), but this helper falls back to safe_member_name(). If two IDL fields normalize to the same JS name, like foo_bar and fooBar, the interface emits distinct members such as fooBar and fooBar1 while the generated Type.struct(...) literal still contains fooBar twice and silently drops one field. Registration needs to reuse the same deduplicated field-name mapping as interface emission.

return f"Type.struct({resolved.type_id})"
else:
return f"Type.struct({{ typeId: {resolved.type_id}, evolving: false }})"
ns = self.schema.package or "default"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this field points at an imported message without a numeric type ID, it uses the current schema package instead of the imported message's package. Generating JavaScript for integration_tests/idl_tests/idl/root.idl emits Type.struct("root.TreeNode") here, but tree.fdl registers tree.TreeNode, so registerRootTypes resolves a serializer that was never registered and any TreeNode value fails at runtime.

@chaokunyang
Copy link
Copy Markdown
Collaborator

@miantalha45 Please provide ai review results based on our ai policy: https://github.com/apache/fory/blob/main/AI_POLICY.md#5-verification-requirements

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +122 to +132
getFixedSize(): number {
return 12;
}

getTypeId() {
return TypeId.TYPED_UNION;
}
}

CodegenRegistry.register(TypeId.TYPED_UNION, UnionSerializerGenerator);
CodegenRegistry.register(TypeId.NAMED_UNION, UnionSerializerGenerator);
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnionSerializerGenerator is registered for both TypeId.TYPED_UNION and TypeId.NAMED_UNION, but getTypeId() always returns TypeId.TYPED_UNION. This will cause named-union serializers to emit the wrong typeId in type info (and potentially try to write/read a userTypeId), breaking AnyHelper.detectSerializer for TypeId.NAMED_UNION. getTypeId() should reflect the underlying typeInfo.typeId (and NAMED_UNION likely also needs custom writeTypeInfo/readTypeInfo to write/skip namespace+typeName or TypeMeta like other named types).

Copilot uses AI. Check for mistakes.
Comment on lines +687 to +710
union(idOrCases?: number | { namespace?: string; typeName?: string } | { [caseIndex: number]: TypeInfo }, cases?: { [caseIndex: number]: TypeInfo }) {
let typeInfo: TypeInfo;
if (typeof idOrCases === "number") {
typeInfo = new TypeInfo<typeof TypeId.TYPED_UNION>(TypeId.TYPED_UNION);
typeInfo.userTypeId = idOrCases;
if (cases) {
typeInfo.options = { cases };
}
} else if (idOrCases && ("namespace" in idOrCases || "typeName" in idOrCases)) {
const nameInfo = idOrCases as { namespace?: string; typeName?: string };
typeInfo = new TypeInfo<typeof TypeId.NAMED_UNION>(TypeId.NAMED_UNION);
typeInfo.namespace = nameInfo.namespace || "";
typeInfo.typeName = nameInfo.typeName || "";
typeInfo.named = `${typeInfo.namespace}$${typeInfo.typeName}`;
if (cases) {
typeInfo.options = { cases };
}
} else {
typeInfo = new TypeInfo<typeof TypeId.TYPED_UNION>(TypeId.TYPED_UNION);
if (idOrCases) {
typeInfo.options = { cases: idOrCases as { [caseIndex: number]: TypeInfo } };
}
}
return typeInfo;
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type.union() creates a TypeId.TYPED_UNION TypeInfo even when no numeric union id is provided (cases-only overload). That leaves userTypeId at -1, but TYPED_UNION is in TypeId.needsUserTypeId, so any path that writes type info for this TypeInfo will encode an invalid userTypeId (effectively 0xFFFFFFFF). Consider using TypeId.UNION for the cases-only overload (and registering a serializer for TypeId.UNION), or require an explicit numeric id for TYPED_UNION so userTypeId is always valid.

Copilot uses AI. Check for mistakes.
Comment on lines 298 to 303
private isAny() {
return this.typeInfo.options?.key!.typeId === TypeId.UNKNOWN || this.typeInfo.options?.value!.typeId === TypeId.UNKNOWN || !this.typeInfo.options?.key?.isMonomorphic() || !this.typeInfo.options?.value?.isMonomorphic();
const keyTypeId = this.typeInfo.options?.key!.typeId;
const valueTypeId = this.typeInfo.options?.value!.typeId;
return keyTypeId === TypeId.UNKNOWN || valueTypeId === TypeId.UNKNOWN
|| !TypeId.isBuiltin(keyTypeId!) || !TypeId.isBuiltin(valueTypeId!);
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MapSerializerGenerator.isAny() now treats any non-builtin key/value type as “any”, forcing the MapAnySerializer path even when the key/value TypeInfo is monomorphic and declared (e.g., struct/enum). This bypasses the specialized codegen path in writeSpecificType/readSpecificType and can be a significant performance regression for common maps of user-defined types. If correctness doesn’t require the fallback, consider restoring the previous monomorphism-based check (or otherwise allowing declared monomorphic user-defined types to use the specialized path).

Copilot uses AI. Check for mistakes.
Comment on lines 24 to 38
@@ -33,6 +34,7 @@
"rust": RustGenerator,
"go": GoGenerator,
"csharp": CSharpGenerator,
"javascript": JavaScriptGenerator,
"swift": SwiftGenerator,
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description references a new compiler/fory_compiler/generators/typescript.py and a TypeScriptGenerator, but the implementation added in this PR is compiler/fory_compiler/generators/javascript.py with JavaScriptGenerator registered under the javascript language. Please update the PR description (or naming) to match the actual generator/module names to avoid confusion for reviewers and users.

Copilot uses AI. Check for mistakes.
Comment on lines 330 to 417
readEmbed() {
const serializerExpr = this.serializerExpr;
const scope = this.scope;
const builder = this.builder;
return new Proxy({}, {
get: (target, prop: string) => {
if (prop === "readNoRef") {
return (accessor: (expr: string) => string, refState: string) => {
const result = scope.uniqueName("result");
return `
${serializerExpr}.readTypeInfo();
fory.incReadDepth();
let ${result} = ${serializerExpr}.read(${refState});
fory.decReadDepth();
${accessor(result)};
`;
};
}
if (prop === "readRef") {
return (accessor: (expr: string) => string) => {
const refFlag = scope.uniqueName("refFlag");
const result = scope.uniqueName("result");
return `
const ${refFlag} = ${builder.reader.readInt8()};
let ${result};
if (${refFlag} === ${RefFlags.NullFlag}) {
${result} = null;
} else if (${refFlag} === ${RefFlags.RefFlag}) {
${result} = ${builder.referenceResolver.getReadObject(builder.reader.readVarUInt32())};
} else {
${serializerExpr}.readTypeInfo();
fory.incReadDepth();
${result} = ${serializerExpr}.read(${refFlag} === ${RefFlags.RefValueFlag});
fory.decReadDepth();
}
${accessor(result)};
`;
};
}
if (prop === "readWithDepth") {
return (accessor: (expr: string) => string, refState: string) => {
const result = scope.uniqueName("result");
return `
fory.incReadDepth();
let ${result} = ${serializerExpr}.read(${refState});
fory.decReadDepth();
${accessor(result)};
`;
};
}
return (accessor: (expr: string) => string, ...args: string[]) => {
const name = this.scope.declare(
"tag_ser",
TypeId.isNamedType(this.typeInfo.typeId)
? this.builder.typeResolver.getSerializerByName(CodecBuilder.replaceBackslashAndQuote(this.typeInfo.named!))
: this.builder.typeResolver.getSerializerById(this.typeInfo.typeId, this.typeInfo.userTypeId)
);
return accessor(`${name}.${prop}(${args.join(",")})`);
return accessor(`${serializerExpr}.${prop}(${args.join(",")})`);
};
},
});
}

writeEmbed() {
const serializerExpr = this.serializerExpr;
const scope = this.scope;
return new Proxy({}, {
get: (target, prop: string) => {
if (prop === "writeNoRef") {
return (accessor: string) => {
return `
${serializerExpr}.writeTypeInfo(${accessor});
${serializerExpr}.write(${accessor});
`;
};
}
if (prop === "writeRef") {
return (accessor: string) => {
const noneedWrite = scope.uniqueName("noneedWrite");
return `
let ${noneedWrite} = ${serializerExpr}.writeRefOrNull(${accessor});
if (!${noneedWrite}) {
${serializerExpr}.writeTypeInfo(${accessor});
${serializerExpr}.write(${accessor});
}
`;
};
}
return (accessor: string, ...args: any) => {
const name = this.scope.declare(
"tag_ser",
TypeId.isNamedType(this.typeInfo.typeId)
? this.builder.typeResolver.getSerializerByName(CodecBuilder.replaceBackslashAndQuote(this.typeInfo.named!))
: this.builder.typeResolver.getSerializerById(this.typeInfo.typeId, this.typeInfo.userTypeId)
);
if (prop === "writeRefOrNull") {
return args[0](`${name}.${prop}(${accessor})`);
return args[0](`${serializerExpr}.${prop}(${accessor})`);
}
return `${name}.${prop}(${accessor})`;
return `${serializerExpr}.${prop}(${accessor})`;
};
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readEmbed()/writeEmbed() now inline fory.typeResolver.getSerializerById/getSerializerByName via serializerExpr each time a nested struct is read/written (including inside loops for collections). Previously this was cached via scope.declare(...) so the serializer resolution happened once per generated serializer. Consider caching serializerExpr into a declared variable to avoid repeated typeResolver lookups in hot paths while still using the correct (own) TypeInfo/serializer for nested structs.

Copilot uses AI. Check for mistakes.
@miantalha45
Copy link
Copy Markdown
Contributor Author

@miantalha45 Please provide ai review results based on our ai policy: https://github.com/apache/fory/blob/main/AI_POLICY.md#5-verification-requirements

@chaokunyang I have added AI Review in PR description.

}

getTypeId() {
return TypeId.TYPED_UNION;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaScriptGenerator emits Type.union({ namespace, typeName }, ...) for unions without numeric IDs, but this serializer generator always reports TypeId.TYPED_UNION. Root serialization of such a union therefore writes type id 34 plus a bogus 0xffffffff user type id, and deserialization then fails in AnyHelper.detectSerializer() instead of resolving the named union by name. getTypeId() needs to reflect this.typeInfo.typeId here.

PrimitiveKind.DATE: "Type.date()",
PrimitiveKind.TIMESTAMP: "Type.timestamp()",
PrimitiveKind.DURATION: "Type.duration()",
# DECIMAL is not yet supported by the JS runtime; omitted intentionally.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated TypeScript surface says decimal becomes bigint, but the runtime map intentionally omits DECIMAL, so _field_type_expr() falls back to Type.any(). Generating message Money { decimal amount = 1; } currently emits amount: bigint; plus amount: Type.any().setId(1), which means runtime serialization uses data-dependent any semantics instead of the schema's decimal type. This generator should either reject decimal for JavaScript until the runtime supports it, or add real decimal support end to end.

@chaokunyang
Copy link
Copy Markdown
Collaborator

chaokunyang commented Apr 10, 2026

Due to continious issue in this pr. I will close it directly. Please do not open a pull request for this feature agian.

@miantalha45
Copy link
Copy Markdown
Contributor Author

@chaokunyang I understand there were continuous issues and I respect your decision to close the PR.
Would you be open to giving me one final chance to address everything properly? I will do a thorough rework using Claude Code and stricter validation against the Java reference to ensure correctness.
If not, I completely understand and will respect your decision.

@chaokunyang
Copy link
Copy Markdown
Collaborator

Ok, let's give it one last try

@chaokunyang chaokunyang reopened this Apr 10, 2026
@miantalha45
Copy link
Copy Markdown
Contributor Author

Ok, let's give it one last try

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Compiler][JavaScript] Generate message/enum/union types from IDL

3 participants