feat(core): Introduce Attribute-Carrying Language-Agnostic Enums #554
junrushao merged 2 commits into apache:main
Code Review
This pull request adds FFI-compatible enum support via @py_enum and @c_enum decorators, allowing for singleton variants and per-variant attribute maps. Key changes involve implementing EnumObject and EnumAttrMap in Python, updating Cython FFI registration, and allowing attribute overwrites in C++. Review feedback identifies a potential reference counting bug in the Cython layer, suggests optimizing variant name lookups to O(1), and recommends alphabetical sorting for exports and more descriptive test naming.
772fe2f to 7c5ecdc
7c5ecdc to 1861928
1861928 to ee7d757
…cked entries
Architecture:
- Introduce a dedicated ``tvm_ffi.dataclasses.enum`` module plus two new
public C++ headers that jointly define a single cross-language enum
abstraction. The Python and C++ halves agree on a fixed pair of
TypeAttr columns and on the on-the-wire representation of each
registry, so distributed entry registration (one TU in C++, plus one
or more Python subclasses) converges on the same live containers.
- The C++ half lives in ``include/tvm/ffi/enum.h``: a concrete
``EnumObj`` (fields ``int64_t value`` + ``String name``, both
reflected via ``def_ro``) plus its nullable ObjectRef ``Enum``.
``EnumObj`` is registered under the type key ``ffi.Enum`` and is the
root of every user-defined enum class tree.
- Entry registration is driven by a new builder in
``include/tvm/ffi/reflection/enum_def.h``:
``refl::EnumDef<EnumClsObj>("Name").set_attr("key", value)...``. Each
call allocates a fresh dense ordinal (``= len(entries)``), constructs
the instance, and writes it into a per-type ``__ffi_enum_entries__``
TypeAttr column storing ``Dict<String, Enum>``.
- ``EnumDef`` follows a strict "register-once-then-mutate" protocol:
on the first call per type it registers the mutable ``Dict`` via
``TVMFFITypeRegisterAttr``; subsequent calls look up the existing
``Dict`` via ``TVMFFIGetTypeAttrColumn`` and mutate it in place. The
same protocol governs the per-class attribute store
(``__ffi_enum_attrs__``: ``Dict<String, List<Any>>``), where each
attribute is a list indexed by ordinal and padded with ``None`` up to
the target ordinal before each write.
- The Python half (``python/tvm_ffi/dataclasses/enum.py``) exposes an
``Enum`` base class registered at the same ``ffi.Enum`` key.
Subclasses declare their FFI type via parameterised inheritance:
``class Foo(Enum, type_key="..."):``. ``Enum.__init_subclass__``
auto-detects whether ``type_key`` is already in the FFI type system
and routes the class through ``@c_class`` (C++-backed) or
``@py_class`` (fresh Python-only) accordingly — no separate
``py_enum``/``c_enum`` opt-in exists.
- Variants are declared in the class body in exactly two shapes, both
of which go through ``__init_subclass__`` scanning and materialise
singletons cached as class attributes (guaranteeing
``Cls.FOO is Cls.FOO``):
1. ``name: ClassVar[Cls] = entry(field1=..., field2=...)`` —
registers a Python-side entry and forwards the captured kwargs
to the subclass's ``__init__`` as user-declared fields.
2. ``name: ClassVar[Cls]`` (no assignment) — binds to a pre-existing
entry with the same ``name`` from the ``__ffi_enum_entries__``
column (typically registered in C++ via ``refl::EnumDef``), or,
if none exists, registers a blank Python entry.
- Both ``value`` (dense ordinal in declaration order) and ``name``
(declaration key) are auto-populated on every entry; they are never
user-supplied. ``entry(value=...)`` and ``entry(name=...)`` raise
``TypeError`` at class-body time.
- The class-level reflection surface (``by_name``, ``by_value``,
``attr_dict``) is exposed through a new ``_ClassProperty`` descriptor
— a minimal getter descriptor that receives the owning class, letting
class-level attribute access work without a metaclass.
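The descriptor idea in the last bullet can be sketched in plain Python. This is a minimal illustration, not the ``_ClassProperty`` code from the PR: the names ``ClassProperty``, ``Color``, ``Shade`` and ``_entries`` are hypothetical stand-ins, and nothing here touches ``tvm_ffi``.

```python
from typing import Any, Callable, Dict, Optional


class ClassProperty:
    """Read-only descriptor whose getter receives the owning class."""

    def __init__(self, fget: Callable[[type], Any]) -> None:
        self.fget = fget

    def __get__(self, instance: Optional[object], owner: type) -> Any:
        # `owner` is the class the lookup started from, so every subclass
        # gets its own view without a metaclass being involved.
        return self.fget(owner)


class Color:
    # Hypothetical per-class registry standing in for the enum entries.
    _entries: Dict[str, int] = {"RED": 0, "GREEN": 1}

    @ClassProperty
    def by_name(cls) -> Dict[str, int]:
        # Return a copy so callers cannot mutate the registry by accident.
        return dict(cls._entries)


class Shade(Color):
    _entries = {"DARK": 0}
```

Accessing ``Color.by_name`` or ``Shade.by_name`` calls the getter with the respective class, which is the behaviour a plain ``property`` cannot provide at class level.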
Public Interfaces:
- New C++ public headers:
* ``include/tvm/ffi/enum.h`` — ``EnumObj``/``Enum`` + two
column-name string constants ``kEnumEntriesAttrName`` (=
``"__ffi_enum_entries__"``) and ``kEnumAttrsAttrName`` (=
``"__ffi_enum_attrs__"``). Both constants are the source of truth
for the column names; Python mirrors them as
``ENUM_ENTRIES_ATTR`` / ``ENUM_ATTRS_ATTR``.
* ``include/tvm/ffi/reflection/enum_def.h`` — ``refl::EnumDef<Obj>``
builder with ``.set_attr(name, value)`` chaining and getters for
``instance()`` / ``ordinal()``.
- ``include/tvm/ffi/tvm_ffi.h`` now transitively includes both new
headers, so consumers of the aggregate header get the enum API
without extra work.
- New Python symbols exported from ``tvm_ffi.dataclasses``:
``Enum``, ``EnumAttrMap``, ``entry``. Class surface on subclasses:
``Cls.get(name)`` / ``Cls.entries()`` / ``Cls.def_attr(name,
*, default=...)`` + three class-level property views ``Cls.by_name``
(``Dict[str, Enum]``), ``Cls.by_value`` (``List[Enum]`` indexed by
ordinal), ``Cls.attr_dict`` (``Dict[str, List[Any]]``).
- ``EnumAttrMap`` round-trips through the unified
``__ffi_enum_attrs__`` column instead of per-attribute columns; a
single Dict column is shared across all attribute names on a given
class, and each value is a List indexed by variant ordinal.
- ``@dataclass_transform(field_specifiers=(Field, field, entry))`` on
``Enum`` lets ``ClassVar[Cls] = entry(...)`` type-check as a proper
field-specifier pattern under typing-aware tools.
- C++ test-support type ``testing.TestEnumVariant`` (in
``src/ffi/testing/testing.cc``) now extends ``EnumObj`` rather than a
bare ``Object``; it registers two entries ``Alpha``/``Beta`` via
``refl::EnumDef`` with a ``code`` attribute. This is the canonical
end-to-end demonstration of the builder.
UI/UX:
- none (library-only change; no CLI, REPL, or user-visible UI surface).
Behavioral Changes:
- ``TVMFFITypeRegisterAttr`` again raises ``RuntimeError`` on duplicate
``(type_index, attr_name)`` writes. This is a reversal of the
previously-relaxed "silent overwrite" behaviour: the invariant that
a TypeAttr slot is registered once is restored and is now load-
bearing for the ``EnumDef`` register-once-then-mutate protocol. The
thrown message tells callers to register a mutable container
(``Dict``/``List``) once and mutate it in place on subsequent calls
— exactly the pattern the enum builders use internally.
- Distributed enum-entry registration — whether across TUs or across
Python subclasses re-binding the same type key — now converges on
shared ``Dict``/``List`` containers. There is no "last writer wins"
semantics; instead, each registrant atomically appends to the shared
live containers, and duplicate ``(type, instance_name)`` collisions
are explicit ``RuntimeError``s at ``EnumDef`` construction.
- Variant declaration forms were narrowed relative to the earlier draft
of this feature. Bare-int sugar (``ok = 0`` auto-promotion), the
simple-integer form ``entry(0)``, and the implicit
``value: int`` field synthesis are no longer supported. ``value``
is always auto-assigned as the dense ordinal and is now a concrete
read-only field on ``EnumObj`` itself. This is a design revision of
work that has not yet shipped to users (the PR is still open), not a
breaking change to a released API.
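The "append to a shared container, reject duplicates" behaviour described above can be modelled in a few lines of plain Python. This is a sketch under stated assumptions: ``_ENTRIES``, ``Variant`` and ``define_entry`` are hypothetical names, and the real registration goes through ``refl::EnumDef`` and the ``__ffi_enum_entries__`` column rather than a module-level dict.

```python
from typing import Dict, NamedTuple


class Variant(NamedTuple):
    value: int  # dense ordinal, assigned at registration time
    name: str


# Hypothetical stand-in for the per-type entries column, keyed by type key
# and then by variant name.
_ENTRIES: Dict[str, Dict[str, Variant]] = {}


def define_entry(type_key: str, name: str) -> Variant:
    """Append one variant to the shared container for ``type_key``."""
    entries = _ENTRIES.setdefault(type_key, {})
    if name in entries:
        # No "last writer wins": a repeated name is a hard error.
        raise RuntimeError(f"duplicate enum entry {name!r} for type {type_key!r}")
    # The next ordinal is simply the number of entries registered so far.
    variant = Variant(value=len(entries), name=name)
    entries[name] = variant
    return variant
```

Because every registrant mutates the same container, ordinals stay dense regardless of which caller registers a given variant.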
Docs:
- No user-facing RST/Markdown doc page updated in this change. The
``dataclass_reflection.rst`` toctree entries for ``py_class`` /
``c_class`` are still commented out, so no sibling ``Enum`` section
is authored yet. The two declaration forms, the auto-assigned
``value``/``name`` fields, and the ``by_name``/``by_value``/
``attr_dict`` class-level views are fully documented in the ``Enum``
docstring and the new C++ header Doxygen comments. Follow-up:
publish a unified dataclass + enum reference once the broader
dataclass doc lands.
Tests:
- Executed: ``uv run pytest tests/python/test_dataclass_enum.py`` —
24/24 passing. Covers both declaration forms (Python-side
``ClassVar = entry(...)`` and bare-``ClassVar[Cls]`` binding),
explicit rejection of ``entry(value=...)`` / ``entry(name=...)``,
auto-ordinal assignment, frozen-singleton identity, ``Cls.get`` /
``Cls.entries`` / ``Cls.by_name`` / ``Cls.by_value`` /
``Cls.attr_dict`` surface, ``def_attr`` round-trips through the
unified ``__ffi_enum_attrs__`` column (including missing-default
behaviour, ``__contains__``, cross-enum foreign-variant rejection,
and fresh-wrapper lookup via ``Cls.get``), direct TypeAttr column
verification for both ``__ffi_enum_entries__`` and
``__ffi_enum_attrs__``, and the C++-backed happy path against
``testing.TestEnumVariant``'s ``refl::EnumDef``-registered
``Alpha``/``Beta`` entries with their ``code`` attribute.
- Executed: ``pre-commit run --files <all staged>`` — all hooks pass
(ASF header, file types, end-of-file / trailing whitespace, ruff
check/format, ty, clang-format).
Untested Edge Cases:
- C++ GoogleTest suite (``tests/cpp/``) was not re-run. The C++ delta
touches: (a) restoring ``RegisterAttr``'s duplicate-throw — for which
no existing test asserts either the throw or the silent-overwrite
behaviour being reverted; (b) introducing the ``EnumObj``/``Enum``
root and ``EnumDef`` builder, covered end-to-end by the Python
tests against ``testing.TestEnumVariant``; (c) the test-only
refactor of ``TestEnumVariant`` from a bare ``Object`` to an
``EnumObj`` subclass. Regression risk is low but formally
unverified.
- Rust test suite (``cargo test`` under ``rust/``) was not executed.
No Rust bindings touched; risk is low but untested.
- Cross-module TypeAttr convergence (two independently-loaded Python
modules registering entries under the same type key from different
processes / plugin-host isolation contexts) is exercised only
within a single process; multi-process scenarios remain uncovered.
Refs: apache#554
ee7d757 to d880675
…cked entries
Architecture:
- Introduce a dedicated ``tvm_ffi.dataclasses.enum`` module plus two new
public C++ headers that jointly define a single cross-language enum
abstraction. The Python and C++ halves agree on a fixed pair of
TypeAttr columns and on the on-the-wire representation of each
registry, so distributed entry registration (one TU in C++, plus one
or more Python subclasses) converges on the same live containers.
- The C++ half lives in ``include/tvm/ffi/enum.h``: a concrete
``EnumObj`` (fields ``int64_t value`` + ``String name``, both
reflected via ``def_ro``) plus its nullable ObjectRef ``Enum``.
``EnumObj`` is registered under the type key ``ffi.Enum`` and is the
root of every user-defined enum class tree.
- Entry registration is driven by a new builder in
``include/tvm/ffi/reflection/enum_def.h``:
``refl::EnumDef<EnumClsObj>("Name").set_attr("key", value)...``. Each
call allocates a fresh dense ordinal (``= len(entries)``), constructs
the instance, and writes it into a per-type ``__ffi_enum_entries__``
TypeAttr column storing ``Dict<String, Enum>``.
- ``EnumDef`` follows a strict "register-once-then-mutate" protocol:
on the first call per type it registers the mutable ``Dict`` via
``TVMFFITypeRegisterAttr``; subsequent calls look up the existing
``Dict`` via ``TVMFFIGetTypeAttrColumn`` and mutate it in place. The
same protocol governs the per-class attribute store
(``__ffi_enum_attrs__``: ``Dict<String, List<Any>>``), where each
attribute is a list indexed by ordinal and padded with ``None`` up to
the target ordinal before each write.
- The Python half (``python/tvm_ffi/dataclasses/enum.py``) exposes an
``Enum`` base class registered at the same ``ffi.Enum`` key.
Subclasses declare their FFI type via parameterised inheritance:
``class Foo(Enum, type_key="..."):``. ``Enum.__init_subclass__``
auto-detects whether ``type_key`` is already in the FFI type system
and routes the class through ``@c_class`` (C++-backed) or
``@py_class`` (fresh Python-only) accordingly — no separate
``py_enum``/``c_enum`` opt-in exists.
- Variants are declared in the class body in three shapes, all of
which go through ``__init_subclass__`` scanning and materialise
singletons cached as class attributes (guaranteeing
``Cls.FOO is Cls.FOO``):
1. ``name: ClassVar[Cls]`` (no assignment) — binds to a pre-existing
entry with the same ``name`` from the ``__ffi_enum_entries__``
column (typically registered in C++ via ``refl::EnumDef``), or,
if none exists, registers a blank Python entry.
2. ``name: ClassVar[Cls] = entry(field1=..., field2=...)`` —
registers a Python-side entry and forwards the captured kwargs
to the subclass's ``__init__`` as user-declared fields.
3. ``name = auto()`` / ``name: ClassVar[Cls] = auto()`` —
registers a Python-side entry that carries no user-declared
fields beyond the auto-assigned ``value``/``name``. Semantically
equivalent to ``entry()`` with no arguments, but spelled with a
dedicated helper to keep simple enum bodies uncluttered and to
give users a discoverable alternative to stdlib-style int sugar
(which this module deliberately rejects — see Behavioral
Changes).
- Both ``value`` (dense ordinal in declaration order) and ``name``
(declaration key) are auto-populated on every entry; they are never
user-supplied. ``entry(value=...)`` and ``entry(name=...)`` raise
``TypeError`` at class-body time.
- The class-level reflection surface (``by_name``, ``by_value``,
``attr_dict``) is exposed through a new ``_ClassProperty`` descriptor
— a minimal getter descriptor that receives the owning class, letting
class-level attribute access work without a metaclass.
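To make the sentinel-based declaration forms concrete, here is a plain-Python model of ``__init_subclass__`` scanning. It is an illustrative sketch only: ``MiniEnum`` and ``Op`` are hypothetical, the bare-``ClassVar`` binding form is left out, and the real ``Enum`` routes through ``@c_class``/``@py_class`` instead of storing fields on a Python object.

```python
from typing import Any, List


class _EnumEntry:
    """Sentinel capturing the keyword arguments of one variant."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


def entry(**kwargs: Any) -> _EnumEntry:
    # `value` and `name` are auto-assigned, so supplying them is an error.
    for reserved in ("value", "name"):
        if reserved in kwargs:
            raise TypeError(f"entry() does not accept {reserved!r}")
    return _EnumEntry(**kwargs)


def auto() -> _EnumEntry:
    # Same sentinel as entry() with no arguments.
    return _EnumEntry()


class MiniEnum:
    _by_value: List["MiniEnum"] = []

    def __init__(self, value: int, name: str, **fields: Any) -> None:
        self.value = value
        self.name = name
        self.fields = fields

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        cls._by_value = []
        # Class bodies preserve definition order, so ordinals are dense and
        # follow declaration order.
        for key, obj in list(vars(cls).items()):
            if isinstance(obj, _EnumEntry):
                variant = cls(len(cls._by_value), key, **obj.kwargs)
                cls._by_value.append(variant)
                setattr(cls, key, variant)  # cache the singleton on the class


class Op(MiniEnum):
    ADD = entry(arity=2)
    NEG = entry(arity=1)
    NOP = auto()
```

Replacing each sentinel with the constructed instance is what gives the ``Cls.FOO is Cls.FOO`` identity guarantee in this model.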
Public Interfaces:
- New C++ public headers:
* ``include/tvm/ffi/enum.h`` — ``EnumObj``/``Enum`` + two
column-name string constants ``kEnumEntriesAttrName`` (=
``"__ffi_enum_entries__"``) and ``kEnumAttrsAttrName`` (=
``"__ffi_enum_attrs__"``). Both constants are the source of truth
for the column names; Python mirrors them as
``ENUM_ENTRIES_ATTR`` / ``ENUM_ATTRS_ATTR``.
* ``include/tvm/ffi/reflection/enum_def.h`` — ``refl::EnumDef<Obj>``
builder with ``.set_attr(name, value)`` chaining and getters for
``instance()`` / ``ordinal()``.
- ``include/tvm/ffi/tvm_ffi.h`` now transitively includes both new
headers, so consumers of the aggregate header get the enum API
without extra work.
- New Python symbols exported from ``tvm_ffi.dataclasses``:
``Enum``, ``EnumAttrMap``, ``entry``, ``auto``. ``auto`` is a
zero-arg helper returning the same ``_EnumEntry`` sentinel as
``entry()``; it is listed alongside ``entry``/``field``/``Field`` in
``@dataclass_transform(field_specifiers=...)`` so that
``name = auto()`` and ``name: ClassVar[Cls] = auto()`` both
type-check as field-specifier patterns. Class surface on subclasses:
``Cls.get(name)`` / ``Cls.entries()`` / ``Cls.def_attr(name,
*, default=...)`` + three class-level property views ``Cls.by_name``
(``Dict[str, Enum]``), ``Cls.by_value`` (``List[Enum]`` indexed by
ordinal), ``Cls.attr_dict`` (``Dict[str, List[Any]]``).
- ``EnumAttrMap`` round-trips through the unified
``__ffi_enum_attrs__`` column instead of per-attribute columns; a
single Dict column is shared across all attribute names on a given
class, and each value is a List indexed by variant ordinal.
- ``@dataclass_transform(field_specifiers=(Field, field, entry, auto))``
on ``Enum`` lets ``ClassVar[Cls] = entry(...)`` and
``name = auto()`` type-check as proper field-specifier patterns
under typing-aware tools.
- C++ test-support type ``testing.TestEnumVariant`` (in
``src/ffi/testing/testing.cc``) now extends ``EnumObj`` rather than a
bare ``Object``; it registers two entries ``Alpha``/``Beta`` via
``refl::EnumDef`` with a ``code`` attribute. This is the canonical
end-to-end demonstration of the builder.
UI/UX:
- none (library-only change; no CLI, REPL, or user-visible UI surface).
Behavioral Changes:
- ``TVMFFITypeRegisterAttr`` again raises ``RuntimeError`` on duplicate
``(type_index, attr_name)`` writes. This is a reversal of the
previously-relaxed "silent overwrite" behaviour: the invariant that
a TypeAttr slot is registered once is restored and is now load-
bearing for the ``EnumDef`` register-once-then-mutate protocol. The
thrown message tells callers to register a mutable container
(``Dict``/``List``) once and mutate it in place on subsequent calls
— exactly the pattern the enum builders use internally.
- Distributed enum-entry registration — whether across TUs or across
Python subclasses re-binding the same type key — now converges on
shared ``Dict``/``List`` containers. There is no "last writer wins"
semantics; instead, each registrant atomically appends to the shared
live containers, and duplicate ``(type, instance_name)`` collisions
are explicit ``RuntimeError``s at ``EnumDef`` construction.
- Variant declaration forms were narrowed relative to the earlier draft
of this feature. Bare-int sugar (``ok = 0`` auto-promotion), the
simple-integer form ``entry(0)``, and the implicit
``value: int`` field synthesis are no longer supported. ``value``
is always auto-assigned as the dense ordinal and is now a concrete
read-only field on ``EnumObj`` itself. This is a design revision of
work that has not yet shipped to users (the PR is still open), not a
breaking change to a released API.
- Integer-literal sugar (e.g. ``ok = 0``) is deliberately *not*
supported. The auto-ordinal policy owns the ``value`` slot, so a
user-supplied int would either silently duplicate that assignment
or conflict with it; both outcomes are worse than a hard
rejection. ``auto()`` is the intended replacement — it makes the
"no extra fields" intent explicit without fighting the auto-
ordinal contract. The ``entry()`` docstring cross-references
``auto()``, and the ``Enum`` docstring's "Declaration forms"
section calls out this rejection inline so discoverability does
not depend on following ``See Also`` links.
Docs:
- No user-facing RST/Markdown doc page updated in this change. The
``dataclass_reflection.rst`` toctree entries for ``py_class`` /
``c_class`` are still commented out, so no sibling ``Enum`` section
is authored yet. All three declaration forms (bare ``ClassVar``,
``entry(...)``, ``auto()``), the auto-assigned ``value``/``name``
fields, the explicit rejection of integer-literal sugar, and the
``by_name``/``by_value``/``attr_dict`` class-level views are fully
documented in the ``Enum``/``entry``/``auto`` docstrings and the
new C++ header Doxygen comments. Follow-up: publish a unified
dataclass + enum reference once the broader dataclass doc lands.
Tests:
- Executed: ``uv run pytest tests/python/test_dataclass_enum.py -q``
— 30/30 passing. Covers all three declaration forms (Python-side
``ClassVar = entry(...)``, bare-``ClassVar[Cls]`` binding, and the
new ``auto()`` helper in both annotated and bare-assignment shapes),
explicit rejection of ``entry(value=...)`` / ``entry(name=...)``,
auto-ordinal assignment, frozen-singleton identity, ``Cls.get`` /
``Cls.entries`` / ``Cls.by_name`` / ``Cls.by_value`` /
``Cls.attr_dict`` surface, ``def_attr`` round-trips through the
unified ``__ffi_enum_attrs__`` column (including missing-default
behaviour, ``__contains__``, cross-enum foreign-variant rejection,
and fresh-wrapper lookup via ``Cls.get``), direct TypeAttr column
verification for both ``__ffi_enum_entries__`` and
``__ffi_enum_attrs__``, the C++-backed happy path against
``testing.TestEnumVariant``'s ``refl::EnumDef``-registered
``Alpha``/``Beta`` entries with their ``code`` attribute, and — new
in this amend — six dedicated ``auto()`` tests:
``test_auto_basic_no_annotation``,
``test_auto_with_classvar_annotation``,
``test_auto_mixed_with_bare_classvar`` (verifies the bare-
``ClassVar`` binders come first in annotation order, then sentinel
entries in class-body order, with deterministic dense ordinals),
``test_auto_mixed_with_entry`` (composition with attribute-carrying
``entry(...)`` variants on the same class),
``test_auto_rejects_already_registered_name`` (asserts ``auto()``
is register-not-bind — colliding with a C++-registered entry name
raises), and ``test_auto_returns_fresh_sentinels`` (confirms each
call yields a distinct ``_EnumEntry`` with empty ``args``/``kwargs``).
- Executed: ``pre-commit run --files <all staged>`` — all hooks pass
(ASF header, file types, end-of-file / trailing whitespace, ruff
check/format, ty, clang-format).
Untested Edge Cases:
- C++ GoogleTest suite (``tests/cpp/``) was not re-run. The C++ delta
touches: (a) restoring ``RegisterAttr``'s duplicate-throw — for which
no existing test asserts either the throw or the silent-overwrite
behaviour being reverted; (b) introducing the ``EnumObj``/``Enum``
root and ``EnumDef`` builder, covered end-to-end by the Python
tests against ``testing.TestEnumVariant``; (c) the test-only
refactor of ``TestEnumVariant`` from a bare ``Object`` to an
``EnumObj`` subclass. Regression risk is low but formally
unverified.
- Rust test suite (``cargo test`` under ``rust/``) was not executed.
No Rust bindings touched; risk is low but untested.
- Cross-module TypeAttr convergence (two independently-loaded Python
modules registering entries under the same type key from different
processes / plugin-host isolation contexts) is exercised only
within a single process; multi-process scenarios remain uncovered.
Refs: apache#554
d880675 to 77b83ff
…cked entries
Architecture:
- Introduce a dedicated ``tvm_ffi.dataclasses.enum`` module plus two new
public C++ headers that jointly define a single cross-language enum
abstraction. The Python and C++ halves agree on a fixed pair of
TypeAttr columns and on the on-the-wire representation of each
registry, so distributed entry registration (one TU in C++, plus one
or more Python subclasses) converges on the same live containers.
- The C++ half lives in ``include/tvm/ffi/enum.h``: a concrete
``EnumObj`` (fields ``int64_t value`` + ``String name``, both
reflected via ``def_ro``) plus its nullable ObjectRef ``Enum``.
``EnumObj`` is registered under the type key ``ffi.Enum`` and is the
root of every user-defined enum class tree.
- Entry registration is driven by a new builder in
``include/tvm/ffi/reflection/enum_def.h``:
``refl::EnumDef<EnumClsObj>("Name").set_attr("key", value)...``. Each
call allocates a fresh dense ordinal (``= len(entries)``), constructs
the instance, and writes it into a per-type ``__ffi_enum_entries__``
TypeAttr column storing ``Dict<String, Enum>``.
- ``EnumDef`` follows a strict "register-once-then-mutate" protocol:
on the first call per type it registers the mutable ``Dict`` via
``TVMFFITypeRegisterAttr``; subsequent calls look up the existing
``Dict`` via ``TVMFFIGetTypeAttrColumn`` and mutate it in place. The
same protocol governs the per-class attribute store
(``__ffi_enum_attrs__``: ``Dict<String, List<Any>>``), where each
attribute is a list indexed by ordinal and padded with ``None`` up to
the target ordinal before each write.
- The Python half (``python/tvm_ffi/dataclasses/enum.py``) exposes an
``Enum`` base class registered at the same ``ffi.Enum`` key.
Subclasses declare their FFI type via parameterised inheritance:
``class Foo(Enum, type_key="..."):``. ``Enum.__init_subclass__``
auto-detects whether ``type_key`` is already in the FFI type system
and routes the class through ``@c_class`` (C++-backed) or
``@py_class`` (fresh Python-only) accordingly — no separate
``py_enum``/``c_enum`` opt-in exists.
- Variants are declared in the class body in three shapes, all of
which go through ``__init_subclass__`` scanning and materialise
singletons cached as class attributes (guaranteeing
``Cls.FOO is Cls.FOO``):
1. ``name: ClassVar[Cls]`` (no assignment) — binds to a pre-existing
entry with the same ``name`` from the ``__ffi_enum_entries__``
column (typically registered in C++ via ``refl::EnumDef``), or,
if none exists, registers a blank Python entry.
2. ``name: ClassVar[Cls] = entry(field1=..., field2=...)`` —
registers a Python-side entry and forwards the captured kwargs
to the subclass's ``__init__`` as user-declared fields.
3. ``name = auto()`` / ``name: ClassVar[Cls] = auto()`` —
registers a Python-side entry that carries no user-declared
fields beyond the auto-assigned ``value``/``name``. Semantically
equivalent to ``entry()`` with no arguments, but spelled with a
dedicated helper to keep simple enum bodies uncluttered and to
give users a discoverable alternative to stdlib-style int sugar
(which this module deliberately rejects — see Behavioral
Changes).
- Both ``value`` (dense ordinal in declaration order) and ``name``
(declaration key) are auto-populated on every entry; they are never
user-supplied. ``entry(value=...)`` and ``entry(name=...)`` raise
``TypeError`` at class-body time.
- The class-level reflection surface (``by_name``, ``by_value``,
``attr_dict``) is exposed through a new ``_ClassProperty`` descriptor
— a minimal getter descriptor that receives the owning class, letting
class-level attribute access work without a metaclass.
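The "register-once-then-mutate" protocol and the ``None`` padding of attribute lists can be modelled as follows. This is a plain-Python sketch, not the C++ builder: ``_TYPE_ATTRS``, ``register_attr``, ``attr_store`` and ``set_attr`` are hypothetical stand-ins for ``TVMFFITypeRegisterAttr``, ``TVMFFIGetTypeAttrColumn`` and ``EnumDef::set_attr``.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for the TypeAttr table: one slot per (type, attr name).
_TYPE_ATTRS: Dict[Tuple[str, str], Any] = {}


def register_attr(type_key: str, attr_name: str, value: Any) -> None:
    """Register a slot exactly once; a duplicate write is an error."""
    slot = (type_key, attr_name)
    if slot in _TYPE_ATTRS:
        raise RuntimeError(
            f"{slot} already registered; register a mutable container once "
            "and mutate it in place"
        )
    _TYPE_ATTRS[slot] = value


def attr_store(type_key: str) -> Dict[str, List[Any]]:
    """Return the shared attribute dict, registering it on first use."""
    slot = (type_key, "__ffi_enum_attrs__")
    if slot not in _TYPE_ATTRS:
        register_attr(type_key, "__ffi_enum_attrs__", {})
    return _TYPE_ATTRS[slot]


def set_attr(type_key: str, attr: str, ordinal: int, value: Any) -> None:
    column = attr_store(type_key).setdefault(attr, [])
    # Pad with None so that index `ordinal` exists before the write.
    column.extend([None] * (ordinal + 1 - len(column)))
    column[ordinal] = value
```

Later writers never re-register the slot; they look up the live dict and mutate it, so out-of-order writes still land at the right ordinal.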
Public Interfaces:
- New C++ public headers:
* ``include/tvm/ffi/enum.h`` — ``EnumObj``/``Enum`` + two
column-name string constants ``kEnumEntriesAttrName`` (=
``"__ffi_enum_entries__"``) and ``kEnumAttrsAttrName`` (=
``"__ffi_enum_attrs__"``). Both constants are the source of truth
for the column names; Python mirrors them as
``ENUM_ENTRIES_ATTR`` / ``ENUM_ATTRS_ATTR``.
* ``include/tvm/ffi/reflection/enum_def.h`` — ``refl::EnumDef<Obj>``
builder with ``.set_attr(name, value)`` chaining and getters for
``instance()`` / ``ordinal()``.
- ``include/tvm/ffi/tvm_ffi.h`` now transitively includes both new
headers, so consumers of the aggregate header get the enum API
without extra work.
- New Python symbols exported from ``tvm_ffi.dataclasses``:
``Enum``, ``EnumAttrMap``, ``entry``, ``auto``. ``auto`` is a
zero-arg helper returning the same ``_EnumEntry`` sentinel as
``entry()``; it is listed alongside ``entry``/``field``/``Field`` in
``@dataclass_transform(field_specifiers=...)`` so that
``name = auto()`` and ``name: ClassVar[Cls] = auto()`` both
type-check as field-specifier patterns. Class surface on subclasses:
``Cls.get(name)`` / ``Cls.entries()`` / ``Cls.def_attr(name,
*, default=...)`` + three class-level property views ``Cls.by_name``
(``Dict[str, Enum]``), ``Cls.by_value`` (``List[Enum]`` indexed by
ordinal), ``Cls.attr_dict`` (``Dict[str, List[Any]]``).
- ``EnumAttrMap`` round-trips through the unified
``__ffi_enum_attrs__`` column instead of per-attribute columns; a
single Dict column is shared across all attribute names on a given
class, and each value is a List indexed by variant ordinal.
- ``@dataclass_transform(field_specifiers=(Field, field, entry, auto))``
on ``Enum`` lets ``ClassVar[Cls] = entry(...)`` and
``name = auto()`` type-check as proper field-specifier patterns
under typing-aware tools.
- C++ test-support type ``testing.TestEnumVariant`` (in
``src/ffi/testing/testing.cc``) now extends ``EnumObj`` rather than a
bare ``Object``; it registers two entries ``Alpha``/``Beta`` via
``refl::EnumDef`` with a ``code`` attribute. This is the canonical
end-to-end demonstration of the builder.
UI/UX:
- none (library-only change; no CLI, REPL, or user-visible UI surface).
Behavioral Changes:
- ``TVMFFITypeRegisterAttr`` again raises ``RuntimeError`` on duplicate
``(type_index, attr_name)`` writes. This is a reversal of the
previously-relaxed "silent overwrite" behaviour: the invariant that
a TypeAttr slot is registered once is restored and is now load-
bearing for the ``EnumDef`` register-once-then-mutate protocol. The
thrown message tells callers to register a mutable container
(``Dict``/``List``) once and mutate it in place on subsequent calls
— exactly the pattern the enum builders use internally.
- Distributed enum-entry registration — whether across TUs or across
Python subclasses re-binding the same type key — now converges on
shared ``Dict``/``List`` containers. There is no "last writer wins"
semantics; instead, each registrant atomically appends to the shared
live containers, and duplicate ``(type, instance_name)`` collisions
are explicit ``RuntimeError``s at ``EnumDef`` construction.
- Variant declaration forms were narrowed relative to the earlier draft
of this feature. Bare-int sugar (``ok = 0`` auto-promotion), the
simple-integer form ``entry(0)``, and the implicit
``value: int`` field synthesis are no longer supported. ``value``
is always auto-assigned as the dense ordinal and is now a concrete
read-only field on ``EnumObj`` itself. This is a design revision of
work that has not yet shipped to users (the PR is still open), not a
breaking change to a released API.
- Integer-literal sugar (e.g. ``ok = 0``) is deliberately *not*
supported. The auto-ordinal policy owns the ``value`` slot, so a
user-supplied int would either silently duplicate that assignment
or conflict with it; both outcomes are worse than a hard
rejection. ``auto()`` is the intended replacement — it makes the
"no extra fields" intent explicit without fighting the auto-
ordinal contract. The ``entry()`` docstring cross-references
``auto()``, and the ``Enum`` docstring's "Declaration forms"
section calls out this rejection inline so discoverability does
not depend on following ``See Also`` links.
Docs:
- No user-facing RST/Markdown doc page updated in this change. The
``dataclass_reflection.rst`` toctree entries for ``py_class`` /
``c_class`` are still commented out, so no sibling ``Enum`` section
is authored yet. All three declaration forms (bare ``ClassVar``,
``entry(...)``, ``auto()``), the auto-assigned ``value``/``name``
fields, the explicit rejection of integer-literal sugar, and the
``by_name``/``by_value``/``attr_dict`` class-level views are fully
documented in the ``Enum``/``entry``/``auto`` docstrings and the
C++ header Doxygen comments — including the new Doxygen block on
the two-arg ``EnumObj(int64_t value, String name)`` constructor
that was missing before. Follow-up: publish a unified dataclass +
enum reference once the broader dataclass doc lands.
Tests:
- Executed: ``uv run pytest tests/python/test_dataclass_enum.py -q``
— 30/30 passing. Covers all three declaration forms (Python-side
``ClassVar = entry(...)``, bare-``ClassVar[Cls]`` binding, and the
new ``auto()`` helper in both annotated and bare-assignment shapes),
explicit rejection of ``entry(value=...)`` / ``entry(name=...)``,
auto-ordinal assignment, frozen-singleton identity, ``Cls.get`` /
``Cls.entries`` / ``Cls.by_name`` / ``Cls.by_value`` /
``Cls.attr_dict`` surface, ``def_attr`` round-trips through the
unified ``__ffi_enum_attrs__`` column (including missing-default
behaviour, ``__contains__``, cross-enum foreign-variant rejection,
and fresh-wrapper lookup via ``Cls.get``), direct TypeAttr column
verification for both ``__ffi_enum_entries__`` and
``__ffi_enum_attrs__``, the C++-backed happy path against
``testing.TestEnumVariant``'s ``refl::EnumDef``-registered
``Alpha``/``Beta`` entries with their ``code`` attribute, and six
dedicated ``auto()`` tests:
``test_auto_basic_no_annotation``,
``test_auto_with_classvar_annotation``,
``test_auto_mixed_with_bare_classvar`` (verifies the bare-
``ClassVar`` binders come first in annotation order, then sentinel
entries in class-body order, with deterministic dense ordinals),
``test_auto_mixed_with_entry`` (composition with attribute-carrying
``entry(...)`` variants on the same class),
``test_auto_rejects_already_registered_name`` (asserts ``auto()``
is register-not-bind — colliding with a C++-registered entry name
raises), and ``test_auto_returns_fresh_sentinels`` (confirms each
call yields a distinct ``_EnumEntry`` with empty ``args``/``kwargs``).
- Executed: ``pre-commit run --files <all staged>`` — all hooks pass
(ASF header, file types, end-of-file / trailing whitespace, ruff
check/format, ty, clang-format).
- CI-driven operational fixes applied in this amend (no behavior
delta, no test churn):
* Added a ``/*! \brief ... */`` Doxygen block on
``EnumObj(int64_t value, String name)`` in
``include/tvm/ffi/enum.h`` so the doc-build (Doxygen) job no
longer rejects an undocumented public constructor at
``enum.h:71``.
* Annotated the intentional RAII temporary
``refl::ObjectDef<TestEnumVariantObj>(refl::init(false));`` in
``src/ffi/testing/testing.cc`` with a
``// NOLINT(bugprone-unused-raii)`` suppression plus a short
preceding comment explaining that the destructor *is* the
payload (it registers the type), so the clang-tidy job no
longer fires on what is by construction a correct-use pattern.
* Centralised the Windows ``/bigobj`` flag inside the
``tvm_ffi_add_msvc_flags`` macro in
``cmake/Utils/Library.cmake`` so every MSVC target picks it up,
and removed the now-redundant per-target
``target_compile_options(tvm_ffi_objs PRIVATE /bigobj)`` in
``CMakeLists.txt``. This fixes the Windows MSVC
``error C1128: number of sections exceeded object file format
limit`` on ``src/ffi/testing/testing.cc`` (caused by heavy
reflection-template instantiations exceeding the default COFF
section limit) and guards against the same failure recurring on
any other TU that grows past the threshold.
Untested Edge Cases:
- C++ GoogleTest suite (``tests/cpp/``) was not re-run. The C++ delta
touches: (a) restoring ``RegisterAttr``'s duplicate-throw — for which
no existing test asserts either the throw or the silent-overwrite
behaviour being reverted; (b) introducing the ``EnumObj``/``Enum``
root and ``EnumDef`` builder, covered end-to-end by the Python
tests against ``testing.TestEnumVariant``; (c) the test-only
refactor of ``TestEnumVariant`` from a bare ``Object`` to an
``EnumObj`` subclass. Regression risk is low but formally
unverified.
- Rust test suite (``cargo test`` under ``rust/``) was not executed.
No Rust bindings touched; risk is low but untested.
- Cross-module TypeAttr convergence (two independently-loaded Python
modules registering entries under the same type key from different
processes / plugin-host isolation contexts) is exercised only
within a single process; multi-process scenarios remain uncovered.
Refs: apache#554
77b83ff to
18ea1b4
…cked entries
Architecture:
- Introduce a dedicated ``tvm_ffi.dataclasses.enum`` module plus two new
public C++ headers that jointly define a single cross-language enum
abstraction. The Python and C++ halves agree on a fixed pair of
TypeAttr columns and on the on-the-wire representation of each
registry, so distributed entry registration (one TU in C++, plus one
or more Python subclasses) converges on the same live containers.
- The C++ half lives in ``include/tvm/ffi/enum.h``: a concrete
``EnumObj`` (fields ``int64_t value`` + ``String name``, both
reflected via ``def_ro``) plus its nullable ObjectRef ``Enum``.
``EnumObj`` is registered under the type key ``ffi.Enum`` and is the
root of every user-defined enum class tree.
- Entry registration is driven by a new builder in
``include/tvm/ffi/reflection/enum_def.h``:
``refl::EnumDef<EnumClsObj>("Name").set_attr("key", value)...``. Each
call allocates a fresh dense ordinal (``= len(entries)``), constructs
the instance, and writes it into a per-type ``__ffi_enum_entries__``
TypeAttr column storing ``Dict<String, Enum>``.
- ``EnumDef`` follows a strict "register-once-then-mutate" protocol:
on the first call per type it registers the mutable ``Dict`` via
``TVMFFITypeRegisterAttr``; subsequent calls look up the existing
``Dict`` via ``TVMFFIGetTypeAttrColumn`` and mutate it in place. The
same protocol governs the per-class attribute store
(``__ffi_enum_attrs__``: ``Dict<String, List<Any>>``), where each
attribute maps to a list indexed by ordinal; a write first pads the
list with ``None`` up to the target ordinal.
- The Python half (``python/tvm_ffi/dataclasses/enum.py``) exposes an
``Enum`` base class registered at the same ``ffi.Enum`` key.
Subclasses declare their FFI type via parameterised inheritance:
``class Foo(Enum, type_key="..."):``. ``Enum.__init_subclass__``
auto-detects whether ``type_key`` is already in the FFI type system
and routes the class through ``@c_class`` (C++-backed) or
``@py_class`` (fresh Python-only) accordingly — no separate
``py_enum``/``c_enum`` opt-in exists.
- Variants are declared in the class body in three shapes, all of
which go through ``__init_subclass__`` scanning and materialise
singletons cached as class attributes (guaranteeing
``Cls.FOO is Cls.FOO``):
1. ``name: ClassVar[Cls]`` (no assignment) — binds to a pre-existing
entry with the same ``name`` from the ``__ffi_enum_entries__``
column (typically registered in C++ via ``refl::EnumDef``). If none
exists, a Python-only enum registers a blank Python entry, whereas a
C++-backed enum raises ``RuntimeError``.
2. ``name: ClassVar[Cls] = entry(field1=..., field2=...)`` —
registers a Python-side entry and forwards the captured kwargs
to the subclass's ``__init__`` as user-declared fields.
3. ``name = auto()`` / ``name: ClassVar[Cls] = auto()`` —
registers a Python-side entry that carries no user-declared
fields beyond the auto-assigned ``value``/``name``. Semantically
equivalent to ``entry()`` with no arguments, but spelled with a
dedicated helper to keep simple enum bodies uncluttered and to
give users a discoverable alternative to stdlib-style int sugar
(which this module deliberately rejects — see Behavioral
Changes).
- Mixed C++/Python entry registration: on a ``type_key`` whose C++
type was registered with ``refl::init(false)`` (i.e. has no
``__ffi_init__``), the Python side still supports ``entry(...)`` /
``auto()`` for fresh Python-side variants. New variants are
allocated via ``__ffi_new__`` (always registered by
``ObjectDef``'s default creator) and populated through the
reflected ``FFIProperty.set`` frozen-setter escape hatch, exactly
mirroring what ``reflection::EnumDef`` does in C++. A bare
``ClassVar[Cls]`` binder, by contrast, still means
"bind-to-existing" and raises a descriptive ``RuntimeError`` when
the named entry is not present in the C++ registry — ordinarily a
typo, but the error message also points the user at
``auto()``/``entry(...)`` if the intent was to add a new variant.
- Both ``value`` (dense ordinal in declaration order) and ``name``
(declaration key) are auto-populated on every entry; they are never
user-supplied. ``entry(value=...)`` and ``entry(name=...)`` raise
``TypeError`` at class-body time.
- The class-level reflection surface (``by_name``, ``by_value``,
``attr_dict``) is exposed through a new ``_ClassProperty`` descriptor:
a minimal getter descriptor that receives the owning class, so
class-level attribute access works without a metaclass.
- The default repr for every ``EnumObj`` subclass is rendered by
``ReprPrinter`` (in ``src/ffi/extra/dataclass.cc``) as
``<type_key>.<name>``, the conventional format for enum
variants. The dispatch happens after the user-registered
``__ffi_repr__`` hook check, so explicit per-subclass repr
overrides still take precedence. Complementarily, the built-in
sentinels ``MISSING`` and ``KWARGS`` render as ``<MISSING>`` /
``<KWARGS>`` via a pointer-identity fast path ahead of any
type-keyed lookup, replacing the generic ``ffi.Object`` framing.
Public Interfaces:
- New C++ public headers:
* ``include/tvm/ffi/enum.h`` — ``EnumObj``/``Enum`` + two
column-name string constants ``kEnumEntriesAttrName`` (=
``"__ffi_enum_entries__"``) and ``kEnumAttrsAttrName`` (=
``"__ffi_enum_attrs__"``). Both constants are the source of truth
for the column names; Python mirrors them as
``ENUM_ENTRIES_ATTR`` / ``ENUM_ATTRS_ATTR``.
* ``include/tvm/ffi/reflection/enum_def.h`` — ``refl::EnumDef<Obj>``
builder with ``.set_attr(name, value)`` chaining and getters for
``instance()`` / ``ordinal()``.
- ``include/tvm/ffi/tvm_ffi.h`` now transitively includes both new
headers, so consumers of the aggregate header get the enum API
without extra work.
- New Python symbols exported from ``tvm_ffi.dataclasses``:
``Enum``, ``EnumAttrMap``, ``entry``, ``auto``. ``auto`` is a
zero-arg helper returning the same kind of ``_EnumEntry`` sentinel
as ``entry()`` (a fresh instance per call); it is listed alongside
``entry``/``field``/``Field`` in
``@dataclass_transform(field_specifiers=...)`` so that
``name = auto()`` and ``name: ClassVar[Cls] = auto()`` both
type-check as field-specifier patterns. Class surface on subclasses:
``Cls.get(name)`` / ``Cls.entries()`` / ``Cls.def_attr(name,
*, default=...)`` + three class-level property views ``Cls.by_name``
(``Dict[str, Enum]``), ``Cls.by_value`` (``List[Enum]`` indexed by
ordinal), ``Cls.attr_dict`` (``Dict[str, List[Any]]``).
- ``EnumAttrMap`` round-trips through the unified
``__ffi_enum_attrs__`` column instead of per-attribute columns; a
single Dict column is shared across all attribute names on a given
class, and each value is a List indexed by variant ordinal.
- ``@dataclass_transform(field_specifiers=(Field, field, entry, auto))``
on ``Enum`` lets ``ClassVar[Cls] = entry(...)`` and
``name = auto()`` type-check as proper field-specifier patterns
under typing-aware tools.
- C++ test-support type ``testing.TestEnumVariant`` (in
``src/ffi/testing/testing.cc``) now extends ``EnumObj`` rather than a
bare ``Object``; it registers two entries ``Alpha``/``Beta`` via
``refl::EnumDef`` with a ``code`` attribute. This is the canonical
end-to-end demonstration of the builder.
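How ``entry(...)`` and ``auto()`` sentinels in a class body can become cached singletons with dense ordinals is sketched below in plain Python. ``MiniEnum``, ``_Entry`` and the keyword-only ``entry`` signature are hypothetical simplifications for illustration; the real ``Enum`` base additionally handles bare ``ClassVar`` binders and FFI registration, which this toy omits.

```python
from typing import Any, Dict


class _Entry:
    """Hypothetical sentinel capturing the user-declared fields of one variant."""

    def __init__(self, **fields: Any) -> None:
        for reserved in ("value", "name"):
            if reserved in fields:
                raise TypeError(f"{reserved!r} is auto-assigned; do not pass it")
        self.fields = fields


def entry(**fields: Any) -> _Entry:
    return _Entry(**fields)


def auto() -> _Entry:
    return _Entry()  # fresh sentinel with no user-declared fields


class MiniEnum:
    """Toy base class: turns sentinels in the class body into singletons."""

    _by_name: Dict[str, "MiniEnum"] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        cls._by_name = {}
        # Snapshot the class namespace; it preserves class-body order.
        for attr, sentinel in list(vars(cls).items()):
            if not isinstance(sentinel, _Entry):
                continue
            variant = object.__new__(cls)
            variant.value = len(cls._by_name)  # dense ordinal, declaration order
            variant.name = attr
            vars(variant).update(sentinel.fields)
            cls._by_name[attr] = variant
            setattr(cls, attr, variant)  # cache so Cls.FOO is Cls.FOO

    @classmethod
    def get(cls, name: str) -> "MiniEnum":
        return cls._by_name[name]


class Op(MiniEnum):
    ADD = entry(arity=2)
    NEG = entry(arity=1)
    NOP = auto()
```

Because the sentinel is replaced by the materialised variant on the class itself, ``Op.get("ADD")`` and ``Op.ADD`` refer to the same object in this sketch.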
UI/UX:
- none (library-only change; no CLI, REPL, or user-visible UI surface).
Behavioral Changes:
- ``TVMFFITypeRegisterAttr`` again raises ``RuntimeError`` on duplicate
``(type_index, attr_name)`` writes. This is a reversal of the
previously-relaxed "silent overwrite" behaviour: the invariant that
a TypeAttr slot is registered once is restored and is now load-
bearing for the ``EnumDef`` register-once-then-mutate protocol. The
thrown message tells callers to register a mutable container
(``Dict``/``List``) once and mutate it in place on subsequent calls
— exactly the pattern the enum builders use internally.
- Distributed enum-entry registration — whether across TUs or across
Python subclasses re-binding the same type key — now converges on
shared ``Dict``/``List`` containers. There is no "last writer wins"
semantics; instead, each registrant atomically appends to the shared
live containers, and duplicate ``(type, instance_name)`` collisions
are explicit ``RuntimeError``s at ``EnumDef`` construction.
- Variant declaration forms were narrowed relative to the earlier draft
of this feature. Bare-int sugar (``ok = 0`` auto-promotion), the
simple-integer form ``entry(0)``, and the implicit
``value: int`` field synthesis are no longer supported. ``value``
is always auto-assigned as the dense ordinal and is now a concrete
read-only field on ``EnumObj`` itself. This is a design revision of
work that has not yet shipped to users (the PR is still open), not a
breaking change to a released API.
- Integer-literal sugar (e.g. ``ok = 0``) is deliberately *not*
supported. The auto-ordinal policy owns the ``value`` slot, so a
user-supplied int would either silently duplicate that assignment
or conflict with it; both outcomes are worse than a hard
rejection. ``auto()`` is the intended replacement — it makes the
"no extra fields" intent explicit without fighting the auto-
ordinal contract. The ``entry()`` docstring cross-references
``auto()``, and the ``Enum`` docstring's "Declaration forms"
section calls out this rejection inline so discoverability does
not depend on following ``See Also`` links.
- ``repr(variant)`` on any ``EnumObj`` subclass now produces
``<type_key>.<name>`` instead of the generic
``type_key(field1=..., field2=...)`` form that ``ReprPrinter``
would otherwise derive from reflected fields. Attribute-carrying
variants still surface their fields via attribute access; they
just no longer leak into the default repr, which is much easier
to read in REPL output and in nested containers like
``by_name``/``by_value``.
- ``repr(core.MISSING)`` and ``repr(core.KWARGS)`` now render as
``<MISSING>`` / ``<KWARGS>`` (angle-bracket sentinel convention)
rather than the generic ``ffi.Object`` fallback. Pointer-identity
dispatch keeps the change free of type-lookup overhead.
- Bare ``ClassVar[Cls]`` binders on a C++-backed enum that name no
existing C++ entry now raise a descriptive ``RuntimeError`` listing
the known entries and the ``ClassVar`` / ``auto()`` / ``entry(...)``
syntax, instead of falling through MRO to the ``Enum`` base's
``init=False`` ``TypeError`` guard with its misleading "cannot be
constructed directly" message.
- A C++-backed enum (``type_key`` already registered in C++) now
accepts mixed registration: bare ``ClassVar[Cls]`` binders bind to
existing C++ entries, while ``entry(...)`` / ``auto()`` sentinels
register *new* Python-side entries whose ordinals continue the
dense sequence past the C++ count. Previously only bare-binding
was allowed; new Python entries were rejected.
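The mixed-registration rules above (bare binders bind to existing entries, sentinels register new ones past the existing count) can be illustrated with a small plain-Python model. ``resolve_variants`` and the ``registry`` dict are hypothetical names invented for this sketch, and the error wording is illustrative rather than the actual message.

```python
from typing import Dict, Iterable


def resolve_variants(
    entries: Dict[str, int], binders: Iterable[str], new_names: Iterable[str]
) -> Dict[str, int]:
    """Bind bare binders to existing entries, then register new names.

    `entries` models the shared name -> ordinal registry and is mutated in place.
    """
    resolved: Dict[str, int] = {}
    for name in binders:  # bare ClassVar binders: bind-to-existing only
        if name not in entries:
            known = ", ".join(sorted(entries))
            raise RuntimeError(
                f"unknown entry {name!r}; known entries: {known}. "
                "Use auto() or entry(...) to add a new variant"
            )
        resolved[name] = entries[name]
    for name in new_names:  # auto()/entry(...) sentinels: register-not-bind
        if name in entries:
            raise RuntimeError(f"entry {name!r} is already registered")
        entries[name] = len(entries)  # continue the dense sequence
        resolved[name] = entries[name]
    return resolved


# Stand-in for the two entries the C++ side registered before Python runs.
registry = {"Alpha": 0, "Beta": 1}
resolved = resolve_variants(registry, binders=["Alpha", "Beta"], new_names=["Gamma"])
```

Processing binders before sentinels mirrors the ordering that ``test_auto_mixed_with_bare_classvar`` is described as checking.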
Docs:
- No user-facing RST/Markdown doc page is updated in this change. The
``dataclass_reflection.rst`` toctree entries for ``py_class`` /
``c_class`` are still commented out, so no sibling ``Enum`` section
has been written yet. All three declaration forms (bare ``ClassVar``,
``entry(...)``, ``auto()``), the auto-assigned ``value``/``name``
fields, the explicit rejection of integer-literal sugar, and the
``by_name``/``by_value``/``attr_dict`` class-level views are fully
documented in the ``Enum``/``entry``/``auto`` docstrings and the
C++ header Doxygen comments — including the new Doxygen block on
the two-arg ``EnumObj(int64_t value, String name)`` constructor
that was missing before. Follow-up: publish a unified dataclass +
enum reference once the broader dataclass doc lands.
Tests:
- Executed: ``uv run pytest tests/python/test_dataclass_enum.py -q``
— 38/38 passing. Covers all three declaration forms (Python-side
``ClassVar = entry(...)``, bare-``ClassVar[Cls]`` binding, and the
``auto()`` helper in both annotated and bare-assignment shapes),
explicit rejection of ``entry(value=...)`` / ``entry(name=...)``,
auto-ordinal assignment, frozen-singleton identity, ``Cls.get`` /
``Cls.entries`` / ``Cls.by_name`` / ``Cls.by_value`` /
``Cls.attr_dict`` surface, ``def_attr`` round-trips through the
unified ``__ffi_enum_attrs__`` column (including missing-default
behaviour, ``__contains__``, cross-enum foreign-variant rejection,
and fresh-wrapper lookup via ``Cls.get``), direct TypeAttr column
verification for both ``__ffi_enum_entries__`` and
``__ffi_enum_attrs__``, the C++-backed happy path against
``testing.TestEnumVariant``'s ``refl::EnumDef``-registered
``Alpha``/``Beta`` entries with their ``code`` attribute, six
dedicated ``auto()`` tests
(``test_auto_basic_no_annotation``,
``test_auto_with_classvar_annotation``,
``test_auto_mixed_with_bare_classvar`` — bare ``ClassVar`` binders
come first in annotation order, then sentinel entries in class-body
order, with deterministic dense ordinals;
``test_auto_mixed_with_entry`` — composition with attribute-carrying
``entry(...)`` variants on the same class;
``test_auto_rejects_already_registered_name`` — asserts ``auto()``
is register-not-bind and colliding with a C++-registered entry
name raises; ``test_auto_returns_fresh_sentinels`` — each call
yields a distinct ``_EnumEntry`` with empty ``args``/``kwargs``),
and seven new tests covering the repr and mixed-registration
behaviour: ``test_default_repr_python_backed``,
``test_default_repr_cxx_backed``,
``test_default_repr_in_nested_container`` (Dict/List recursion
through ``by_name`` / ``by_value``),
``test_default_repr_with_attribute_carrying_variant``,
``test_missing_and_kwargs_sentinel_repr``,
``test_cxx_backed_binder_typo_raises_descriptive_error`` (asserts
the error names the unknown entry, the type key, the known C++
entries, and the ``ClassVar`` syntax),
``test_cxx_backed_mixed_entries_via_auto`` (bare ``ClassVar``
binders coexist with ``auto()`` entries on
``testing.TestEnumVariant``; new ordinals extend past the C++
count), and ``test_cxx_backed_python_entry_accepts_def_attr``
(``def_attr`` writes widen the attrs column to cover new
Python-side ordinals while C++ entries retain defaults).
- Executed: ``pre-commit run --files <all staged>`` — all hooks pass
(ASF header, file types, end-of-file / trailing whitespace, ruff
check/format, ty, clang-format).
- Executed: full Python suite ``uv run pytest tests/python -q`` —
2246 passed, 16 skipped, 3 xfailed. No regressions from the repr
change or the mixed-registration path.
- CI-driven operational fixes applied in this amend (no behavior
delta, no test churn):
* Added a ``/*! \brief ... */`` Doxygen block on
``EnumObj(int64_t value, String name)`` in
``include/tvm/ffi/enum.h`` so the doc-build (Doxygen) job no
longer rejects an undocumented public constructor at
``enum.h:71``.
* Annotated the intentional RAII temporary
``refl::ObjectDef<TestEnumVariantObj>(refl::init(false));`` in
``src/ffi/testing/testing.cc`` with a
``// NOLINT(bugprone-unused-raii)`` suppression plus a short
preceding comment explaining that the destructor *is* the
payload (it registers the type), so the clang-tidy job no
longer fires on what is by construction a correct-use pattern.
* Centralised the Windows ``/bigobj`` flag inside the
``tvm_ffi_add_msvc_flags`` macro in
``cmake/Utils/Library.cmake`` so every MSVC target picks it up,
and removed the now-redundant per-target
``target_compile_options(tvm_ffi_objs PRIVATE /bigobj)`` in
``CMakeLists.txt``. This fixes the Windows MSVC
``error C1128: number of sections exceeded object file format
limit`` on ``src/ffi/testing/testing.cc`` (caused by heavy
reflection-template instantiations exceeding the default COFF
section limit) and guards against the same failure recurring on
any other TU that grows past the threshold.
Untested Edge Cases:
- C++ GoogleTest suite (``tests/cpp/``) was not re-run. The C++ delta
touches: (a) restoring ``RegisterAttr``'s duplicate-throw — for which
no existing test asserts either the throw or the silent-overwrite
behaviour being reverted; (b) introducing the ``EnumObj``/``Enum``
root and ``EnumDef`` builder, covered end-to-end by the Python
tests against ``testing.TestEnumVariant``; (c) the test-only
refactor of ``TestEnumVariant`` from a bare ``Object`` to an
``EnumObj`` subclass; (d) the ``ReprPrinter`` additions for
``EnumObj`` subclasses and MISSING/KWARGS — verified via the
Python suite but not from a C++ GoogleTest. Regression risk is
low but formally unverified.
- Rust test suite (``cargo test`` under ``rust/``) was not executed.
No Rust bindings touched; risk is low but untested.
- Cross-module TypeAttr convergence (two independently-loaded Python
modules registering entries under the same type key from different
processes / plugin-host isolation contexts) is exercised only
within a single process; multi-process scenarios remain uncovered.
Refs: apache#554
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
18ea1b4 to
7d15159
…cked entries
Architecture:
- Introduce a dedicated ``tvm_ffi.dataclasses.enum`` module plus two new
public C++ headers that jointly define a single cross-language enum
abstraction. The Python and C++ halves agree on a fixed pair of
TypeAttr columns and on the on-the-wire representation of each
registry, so distributed entry registration (one TU in C++, plus one
or more Python subclasses) converges on the same live containers.
- The C++ half lives in ``include/tvm/ffi/enum.h``: a concrete
``EnumObj`` (fields ``int64_t value`` + ``String name``, both
reflected via ``def_ro``) plus its nullable ObjectRef ``Enum``.
``EnumObj`` is registered under the type key ``ffi.Enum`` and is the
root of every user-defined enum class tree.
- Entry registration is driven by a new builder in
``include/tvm/ffi/reflection/enum_def.h``:
``refl::EnumDef<EnumClsObj>("Name").set_attr("key", value)...``. Each
call allocates a fresh dense ordinal (``= len(entries)``), constructs
the instance, and writes it into a per-type ``__ffi_enum_entries__``
TypeAttr column storing ``Dict<String, Enum>``.
- ``EnumDef`` follows a strict "register-once-then-mutate" protocol:
on the first call per type it registers the mutable ``Dict`` via
``TVMFFITypeRegisterAttr``; subsequent calls look up the existing
``Dict`` via ``TVMFFIGetTypeAttrColumn`` and mutate it in place. The
same protocol governs the per-class attribute store
(``__ffi_enum_attrs__``: ``Dict<String, List<Any>>``), where each
attribute maps to a list indexed by ordinal; a write first pads the
list with ``None`` up to the target ordinal.
- The Python half (``python/tvm_ffi/dataclasses/enum.py``) exposes an
``Enum`` base class registered at the same ``ffi.Enum`` key.
Subclasses declare their FFI type via parameterised inheritance:
``class Foo(Enum, type_key="..."):``. ``Enum.__init_subclass__``
auto-detects whether ``type_key`` is already in the FFI type system
and routes the class through ``@c_class`` (C++-backed) or
``@py_class`` (fresh Python-only) accordingly — no separate
``py_enum``/``c_enum`` opt-in exists.
- Variants are declared in the class body in three shapes, all of
which go through ``__init_subclass__`` scanning and materialise
singletons cached as class attributes (guaranteeing
``Cls.FOO is Cls.FOO``):
1. ``name: ClassVar[Cls]`` (no assignment) — binds to a pre-existing
entry with the same ``name`` from the ``__ffi_enum_entries__``
column (typically registered in C++ via ``refl::EnumDef``). If none
exists, a Python-only enum registers a blank Python entry, whereas a
C++-backed enum raises ``RuntimeError``.
2. ``name: ClassVar[Cls] = entry(field1=..., field2=...)`` —
registers a Python-side entry and forwards the captured kwargs
to the subclass's ``__init__`` as user-declared fields.
3. ``name = auto()`` / ``name: ClassVar[Cls] = auto()`` —
registers a Python-side entry that carries no user-declared
fields beyond the auto-assigned ``value``/``name``. Semantically
equivalent to ``entry()`` with no arguments, but spelled with a
dedicated helper to keep simple enum bodies uncluttered and to
give users a discoverable alternative to stdlib-style int sugar
(which this module deliberately rejects — see Behavioral
Changes).
- Mixed C++/Python entry registration: on a ``type_key`` whose C++
type was registered with ``refl::init(false)`` (i.e. has no
``__ffi_init__``), the Python side still supports ``entry(...)`` /
``auto()`` for fresh Python-side variants. New variants are
allocated via ``__ffi_new__`` (always registered by
``ObjectDef``'s default creator) and populated through the
reflected ``FFIProperty.set`` frozen-setter escape hatch, exactly
mirroring what ``reflection::EnumDef`` does in C++. A bare
``ClassVar[Cls]`` binder, by contrast, still means
"bind-to-existing" and raises a descriptive ``RuntimeError`` when
the named entry is not present in the C++ registry — ordinarily a
typo, but the error message also points the user at
``auto()``/``entry(...)`` if the intent was to add a new variant.
- Both ``value`` (dense ordinal in declaration order) and ``name``
(declaration key) are auto-populated on every entry; they are never
user-supplied. ``entry(value=...)`` and ``entry(name=...)`` raise
``TypeError`` at class-body time.
- The class-level reflection surface (``by_name``, ``by_value``,
``attr_dict``) is exposed through a new ``_ClassProperty`` descriptor:
a minimal getter descriptor that receives the owning class, so
class-level attribute access works without a metaclass.
- The default repr for every ``EnumObj`` subclass is rendered by
``ReprPrinter`` (in ``src/ffi/extra/dataclass.cc``) as
``<type_key>.<name>``, the conventional format for enum
variants. The dispatch happens after the user-registered
``__ffi_repr__`` hook check, so explicit per-subclass repr
overrides still take precedence. Complementarily, the built-in
sentinels ``MISSING`` and ``KWARGS`` render as ``<MISSING>`` /
``<KWARGS>`` via a pointer-identity fast path ahead of any
type-keyed lookup, replacing the generic ``ffi.Object`` framing.
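The idea behind a getter descriptor that receives the owning class can be shown with a few lines of plain Python. ``ClassProperty``, ``Color`` and ``Shade`` below are hypothetical illustrations of the technique, not the actual ``_ClassProperty`` implementation, and the registry is a plain dict of ordinals.

```python
from typing import Any, Callable, Dict, Optional


class ClassProperty:
    """Getter-only descriptor whose getter receives the owning class."""

    def __init__(self, getter: Callable[[type], Any]) -> None:
        self._getter = getter

    def __get__(self, instance: Optional[object], owner: Optional[type] = None) -> Any:
        if owner is None:
            owner = type(instance)
        return self._getter(owner)


class Color:
    _entries: Dict[str, int] = {"RED": 0, "GREEN": 1}

    @ClassProperty
    def by_name(cls) -> Dict[str, int]:
        return cls._entries


class Shade(Color):
    _entries = {"DARK": 0}  # each subclass sees its own registry
```

Because ``__get__`` is invoked for class-level access as well as instance access, ``Color.by_name`` works without defining a property on a metaclass.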
Public Interfaces:
- New C++ public headers:
* ``include/tvm/ffi/enum.h`` — ``EnumObj``/``Enum`` + two
column-name string constants ``kEnumEntriesAttrName`` (=
``"__ffi_enum_entries__"``) and ``kEnumAttrsAttrName`` (=
``"__ffi_enum_attrs__"``). Both constants are the source of truth
for the column names; Python mirrors them as
``ENUM_ENTRIES_ATTR`` / ``ENUM_ATTRS_ATTR``.
* ``include/tvm/ffi/reflection/enum_def.h`` — ``refl::EnumDef<Obj>``
builder with ``.set_attr(name, value)`` chaining and getters for
``instance()`` / ``ordinal()``.
- ``include/tvm/ffi/tvm_ffi.h`` now transitively includes both new
headers, so consumers of the aggregate header get the enum API
without extra work.
- New Python symbols exported from ``tvm_ffi.dataclasses``:
``Enum``, ``EnumAttrMap``, ``entry``, ``auto``. ``auto`` is a
zero-arg helper returning the same kind of ``_EnumEntry`` sentinel
as ``entry()`` (a fresh instance per call); it is listed alongside
``entry``/``field``/``Field`` in
``@dataclass_transform(field_specifiers=...)`` so that
``name = auto()`` and ``name: ClassVar[Cls] = auto()`` both
type-check as field-specifier patterns. Class surface on subclasses:
``Cls.get(name)`` / ``Cls.entries()`` / ``Cls.def_attr(name,
*, default=...)`` + three class-level property views ``Cls.by_name``
(``Dict[str, Enum]``), ``Cls.by_value`` (``List[Enum]`` indexed by
ordinal), ``Cls.attr_dict`` (``Dict[str, List[Any]]``).
- ``EnumAttrMap`` round-trips through the unified
``__ffi_enum_attrs__`` column instead of per-attribute columns; a
single Dict column is shared across all attribute names on a given
class, and each value is a List indexed by variant ordinal.
- ``@dataclass_transform(field_specifiers=(Field, field, entry, auto))``
on ``Enum`` lets ``ClassVar[Cls] = entry(...)`` and
``name = auto()`` type-check as proper field-specifier patterns
under typing-aware tools.
- C++ test-support type ``testing.TestEnumVariant`` (in
``src/ffi/testing/testing.cc``) now extends ``EnumObj`` rather than a
bare ``Object``; it registers two entries ``Alpha``/``Beta`` via
``refl::EnumDef`` with a ``code`` attribute. This is the canonical
end-to-end demonstration of the builder.
UI/UX:
- none (library-only change; no CLI, REPL, or user-visible UI surface).
Behavioral Changes:
- ``TVMFFITypeRegisterAttr`` again raises ``RuntimeError`` on duplicate
``(type_index, attr_name)`` writes. This is a reversal of the
previously-relaxed "silent overwrite" behaviour: the invariant that
a TypeAttr slot is registered once is restored and is now load-
bearing for the ``EnumDef`` register-once-then-mutate protocol. The
thrown message tells callers to register a mutable container
(``Dict``/``List``) once and mutate it in place on subsequent calls
— exactly the pattern the enum builders use internally.
- Distributed enum-entry registration — whether across TUs or across
Python subclasses re-binding the same type key — now converges on
shared ``Dict``/``List`` containers. There is no "last writer wins"
semantics; instead, each registrant atomically appends to the shared
live containers, and duplicate ``(type, instance_name)`` collisions
are explicit ``RuntimeError``s at ``EnumDef`` construction.
- Variant declaration forms were narrowed relative to the earlier draft
of this feature. Bare-int sugar (``ok = 0`` auto-promotion), the
simple-integer form ``entry(0)``, and the implicit
``value: int`` field synthesis are no longer supported. ``value``
is always auto-assigned as the dense ordinal and is now a concrete
read-only field on ``EnumObj`` itself. This is a design revision of
work that has not yet shipped to users (the PR is still open), not a
breaking change to a released API.
- Integer-literal sugar (e.g. ``ok = 0``) is deliberately *not*
supported. The auto-ordinal policy owns the ``value`` slot, so a
user-supplied int would either silently duplicate that assignment
or conflict with it; both outcomes are worse than a hard
rejection. ``auto()`` is the intended replacement — it makes the
"no extra fields" intent explicit without fighting the auto-
ordinal contract. The ``entry()`` docstring cross-references
``auto()``, and the ``Enum`` docstring's "Declaration forms"
section calls out this rejection inline so discoverability does
not depend on following ``See Also`` links.
- ``repr(variant)`` on any ``EnumObj`` subclass now produces
``<type_key>.<name>`` instead of the generic
``type_key(field1=..., field2=...)`` form that ``ReprPrinter``
would otherwise derive from reflected fields. Attribute-carrying
variants still surface their fields via attribute access; they
just no longer leak into the default repr, which is much easier
to read in REPL output and in nested containers like
``by_name``/``by_value``.
- ``repr(core.MISSING)`` and ``repr(core.KWARGS)`` now render as
``<MISSING>`` / ``<KWARGS>`` (angle-bracket sentinel convention)
rather than the generic ``ffi.Object`` fallback. Pointer-identity
dispatch keeps the change free of type-lookup overhead.
- Bare ``ClassVar[Cls]`` binders on a C++-backed enum that name no
existing C++ entry now raise a descriptive ``RuntimeError`` listing
the known entries and the ``ClassVar`` / ``auto()`` / ``entry(...)``
syntax, instead of falling through MRO to the ``Enum`` base's
``init=False`` ``TypeError`` guard with its misleading "cannot be
constructed directly" message.
- A C++-backed enum (``type_key`` already registered in C++) now
accepts mixed registration: bare ``ClassVar[Cls]`` binders bind to
existing C++ entries, while ``entry(...)`` / ``auto()`` sentinels
register *new* Python-side entries whose ordinals continue the
dense sequence past the C++ count. Previously only bare-binding
was allowed; new Python entries were rejected.
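The repr dispatch order described above (sentinel identity check first, then a user hook, then the ``<type_key>.<name>`` default) can be sketched as a plain-Python model. ``render``, ``Variant``, the ``hooks`` mapping and the local ``MISSING``/``KWARGS`` objects are hypothetical stand-ins for illustration; the real logic lives in the C++ ``ReprPrinter``.

```python
from typing import Callable, Dict

MISSING = object()  # stand-ins for the built-in sentinels
KWARGS = object()


class Variant:
    """Toy enum variant carrying a type key, a name, and extra fields."""

    def __init__(self, type_key: str, name: str, **fields: object) -> None:
        self.type_key = type_key
        self.name = name
        self.fields = fields


def render(obj: object, hooks: Dict[str, Callable[[Variant], str]]) -> str:
    # 1. Identity fast path for sentinels, before any type-keyed lookup.
    if obj is MISSING:
        return "<MISSING>"
    if obj is KWARGS:
        return "<KWARGS>"
    if not isinstance(obj, Variant):
        return repr(obj)
    # 2. A user-registered hook for this type key takes precedence.
    hook = hooks.get(obj.type_key)
    if hook is not None:
        return hook(obj)
    # 3. Default enum repr: reflected fields are deliberately left out.
    return f"{obj.type_key}.{obj.name}"


red = Variant("demo.Color", "Red", hex="#ff0000")
```

In this model the ``hex`` field stays reachable through ``red.fields`` but never appears in the default rendering, matching the behaviour described for attribute-carrying variants.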
Docs:
- No user-facing RST/Markdown doc page is updated in this change. The
``dataclass_reflection.rst`` toctree entries for ``py_class`` /
``c_class`` are still commented out, so no sibling ``Enum`` section
has been written yet. All three declaration forms (bare ``ClassVar``,
``entry(...)``, ``auto()``), the auto-assigned ``value``/``name``
fields, the explicit rejection of integer-literal sugar, and the
``by_name``/``by_value``/``attr_dict`` class-level views are fully
documented in the ``Enum``/``entry``/``auto`` docstrings and the
C++ header Doxygen comments — including the new Doxygen block on
the two-arg ``EnumObj(int64_t value, String name)`` constructor
that was missing before. Follow-up: publish a unified dataclass +
enum reference once the broader dataclass doc lands.
Tests:
- Executed: ``uv run pytest tests/python/test_dataclass_enum.py -q``
— 38/38 passing. Covers all three declaration forms (Python-side
``ClassVar = entry(...)``, bare-``ClassVar[Cls]`` binding, and the
``auto()`` helper in both annotated and bare-assignment shapes),
explicit rejection of ``entry(value=...)`` / ``entry(name=...)``,
auto-ordinal assignment, frozen-singleton identity, ``Cls.get`` /
``Cls.entries`` / ``Cls.by_name`` / ``Cls.by_value`` /
``Cls.attr_dict`` surface, ``def_attr`` round-trips through the
unified ``__ffi_enum_attrs__`` column (including missing-default
behaviour, ``__contains__``, cross-enum foreign-variant rejection,
and fresh-wrapper lookup via ``Cls.get``), direct TypeAttr column
verification for both ``__ffi_enum_entries__`` and
``__ffi_enum_attrs__``, the C++-backed happy path against
``testing.TestEnumVariant``'s ``refl::EnumDef``-registered
``Alpha``/``Beta`` entries with their ``code`` attribute, six
dedicated ``auto()`` tests
(``test_auto_basic_no_annotation``,
``test_auto_with_classvar_annotation``,
``test_auto_mixed_with_bare_classvar`` — bare ``ClassVar`` binders
come first in annotation order, then sentinel entries in class-body
order, with deterministic dense ordinals;
``test_auto_mixed_with_entry`` — composition with attribute-carrying
``entry(...)`` variants on the same class;
``test_auto_rejects_already_registered_name`` — asserts ``auto()``
is register-not-bind and colliding with a C++-registered entry
name raises; ``test_auto_returns_fresh_sentinels`` — each call
yields a distinct ``_EnumEntry`` with empty ``args``/``kwargs``),
and seven new tests covering the repr and mixed-registration
behaviour: ``test_default_repr_python_backed``,
``test_default_repr_cxx_backed``,
``test_default_repr_in_nested_container`` (Dict/List recursion
through ``by_name`` / ``by_value``),
``test_default_repr_with_attribute_carrying_variant``,
``test_missing_and_kwargs_sentinel_repr``,
``test_cxx_backed_binder_typo_raises_descriptive_error`` (asserts
the error names the unknown entry, the type key, the known C++
entries, and the ``ClassVar`` syntax),
``test_cxx_backed_mixed_entries_via_auto`` (bare ``ClassVar``
binders coexist with ``auto()`` entries on
``testing.TestEnumVariant``; new ordinals extend past the C++
count), and ``test_cxx_backed_python_entry_accepts_def_attr``
(``def_attr`` writes widen the attrs column to cover new
Python-side ordinals while C++ entries retain defaults).
- Executed: ``pre-commit run --files <all staged>`` — all hooks pass
(ASF header, file types, end-of-file / trailing whitespace, ruff
check/format, ty, clang-format).
- Executed: full Python suite ``uv run pytest tests/python -q`` —
2246 passed, 16 skipped, 3 xfailed. No regressions from the repr
change or the mixed-registration path.
- CI-driven operational fixes applied in this amend (no behavior
delta, no test churn):
* Added a ``/*! \brief ... */`` Doxygen block on
``EnumObj(int64_t value, String name)`` in
``include/tvm/ffi/enum.h`` so the doc-build (Doxygen) job no
longer rejects an undocumented public constructor at
``enum.h:71``.
* Annotated the intentional RAII temporary
``refl::ObjectDef<TestEnumVariantObj>(refl::init(false));`` in
``src/ffi/testing/testing.cc`` with a
``// NOLINT(bugprone-unused-raii)`` suppression plus a short
preceding comment explaining that the destructor *is* the
payload (it registers the type), so the clang-tidy job no
longer fires on what is by construction a correct-use pattern.
* Centralised the Windows ``/bigobj`` flag inside the
``tvm_ffi_add_msvc_flags`` macro in
``cmake/Utils/Library.cmake`` so every MSVC target picks it up,
and removed the now-redundant per-target
``target_compile_options(tvm_ffi_objs PRIVATE /bigobj)`` in
``CMakeLists.txt``. This fixes the Windows MSVC
``error C1128: number of sections exceeded object file format
limit`` on ``src/ffi/testing/testing.cc`` (caused by heavy
reflection-template instantiations exceeding the default COFF
section limit) and guards against the same failure recurring on
any other TU that grows past the threshold.
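The ordering rule exercised by ``test_auto_mixed_with_bare_classvar`` above (bare ``ClassVar`` binders first in annotation order, then sentinel entries in class-body order) can be modelled in plain Python. This is only a toy sketch of that rule, not the ``tvm_ffi`` implementation: ``ToyEnum``, ``_Auto`` and the ``ordinals`` attribute are hypothetical names, and binding to pre-existing C++ entries is deliberately left out.

```python
from typing import ClassVar


class _Auto:
    """Hypothetical sentinel standing in for ``auto()`` / ``entry(...)``."""


def auto() -> _Auto:
    return _Auto()


class ToyEnum:
    """Toy base class that only models declaration-order resolution."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        annotations = getattr(cls, "__annotations__", {})
        body = cls.__dict__
        # Pass 1: bare binders are annotated names with no assigned value,
        # taken in annotation order.
        binders = [name for name in annotations if name not in body]
        # Pass 2: sentinel assignments, taken in class-body order.
        sentinels = [name for name, value in body.items() if isinstance(value, _Auto)]
        # Dense ordinals follow the combined order.
        cls.ordinals = {name: i for i, name in enumerate(binders + sentinels)}


class Color(ToyEnum):
    red = auto()
    green: ClassVar["Color"]
    blue: ClassVar["Color"] = auto()
    alpha: ClassVar["Color"]


print(Color.ordinals)
```

In this model ``green`` and ``alpha`` receive ordinals 0 and 1 even though ``red`` appears first in the class body, because the binder pass runs before the sentinel pass.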
Untested Edge Cases:
- C++ GoogleTest suite (``tests/cpp/``) was not re-run. The C++ delta
touches: (a) restoring ``RegisterAttr``'s duplicate-throw — for which
no existing test asserts either the throw or the silent-overwrite
behaviour being reverted; (b) introducing the ``EnumObj``/``Enum``
root and ``EnumDef`` builder, covered end-to-end by the Python
tests against ``testing.TestEnumVariant``; (c) the test-only
refactor of ``TestEnumVariant`` from a bare ``Object`` to an
``EnumObj`` subclass; (d) the ``ReprPrinter`` additions for
``EnumObj`` subclasses and MISSING/KWARGS — verified via the
Python suite but not from a C++ GoogleTest. Regression risk is
low but formally unverified.
- Rust test suite (``cargo test`` under ``rust/``) was not executed.
No Rust bindings touched; risk is low but untested.
- Cross-module TypeAttr convergence (two independently-loaded Python
modules registering entries under the same type key from different
processes / plugin-host isolation contexts) is exercised only
within a single process; multi-process scenarios remain uncovered.
Refs: apache#554
/gemini review
Code Review
This pull request introduces cross-language enum support to the TVM FFI, providing C++ base classes, registration utilities, and a Python Enum dataclass for shared registries. The review feedback identifies a critical memory management bug in the Cython bindings where an incorrect DecRef call on a non-owning view could lead to memory corruption. Additionally, the feedback points out a misleading docstring regarding attribute registration, suggests optimizing the entries() iterator by leveraging the by_value property, and notes a potential inconsistency in EnumAttrMap's membership checks when handling None values.
    if temp.type_index >= kTVMFFIStaticObjectBegin and temp.v_obj != NULL:
        TVMFFIObjectDecRef(<TVMFFIObjectHandle>temp.v_obj)
This TVMFFIObjectDecRef call is likely to cause memory corruption or use-after-free. TVMFFIPyPyObjectToFFIAny with TVMFFIPyArgSetterFactory_ produces a non-owning view of the object (it does not increment the reference count). Calling DecRef in the finally block will decrement the reference count of the original object passed as value, potentially destroying it prematurely. The CAny.__init__ implementation in the same file correctly uses TVMFFIAnyViewToOwnedAny to acquire ownership from such a view without a corresponding DecRef on the view itself.
# Remove the DecRef call as temp is a non-owning view
pass
- `_register_type_attr`: drop the spurious `TVMFFIObjectDecRef` in the `finally` block. `TVMFFIPyPyObjectToFFIAny` produces a non-owning view; `TVMFFITypeRegisterAttr` stores via an `AnyView`-to-`Any` assignment that increfs internally, so no caller-side refcount management is needed. Update the docstring to reflect the new "raises on duplicate" behavior of the underlying C++ registrar.
- `Enum.entries`: simplify to `iter(cls.by_value)` (already ordinal-indexed) instead of a sort-on-every-call.
- `EnumAttrMap.__setitem__`: reject `None` with `TypeError`. `None` is the column's "unset" sentinel (matching C++ `EnumDef::set_attr` `Any(nullptr)` padding), so an explicit `attr[variant] = None` would be indistinguishable from unset and silently break `__contains__` / `__getitem__`. Document the restriction in `def_attr` and `EnumAttrMap`.
- Add regression test for the `None`-rejection behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
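To see why `None` has to be rejected, here is a minimal pure-Python sketch of an ordinal-indexed column padded with `None`. The class `ToyAttrColumn` is hypothetical and only mirrors the behaviour described in the commit message; it is not the real `EnumAttrMap`.

```python
from typing import Any, List


class ToyAttrColumn:
    """Toy ordinal-indexed attribute column where None means "unset"."""

    def __init__(self) -> None:
        self._slots: List[Any] = []

    def __setitem__(self, ordinal: int, value: Any) -> None:
        if value is None:
            # Storing None would look exactly like an unset slot.
            raise TypeError("None is reserved as the 'unset' sentinel")
        if ordinal >= len(self._slots):
            # Pad with None up to the requested ordinal.
            self._slots.extend([None] * (ordinal + 1 - len(self._slots)))
        self._slots[ordinal] = value

    def __contains__(self, ordinal: int) -> bool:
        return 0 <= ordinal < len(self._slots) and self._slots[ordinal] is not None

    def __getitem__(self, ordinal: int) -> Any:
        if ordinal not in self:
            raise KeyError(ordinal)
        return self._slots[ordinal]

    def get(self, ordinal: int, default: Any = None) -> Any:
        return self[ordinal] if ordinal in self else default


column = ToyAttrColumn()
column[2] = "fast"  # slots 0 and 1 are padded with None
print(2 in column, 0 in column, column.get(0, "default"))
```

If `column[0] = None` were accepted, slot 0 would still fail the `__contains__` check, so the write would be silently lost. Raising `TypeError` surfaces the problem at the call site.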
Thanks for the review! Addressed all four in 1. 2. 3. 4. All 2247 existing Python tests still pass.
RFC: #553
Add first-class cross-language enum support to TVM-FFI. An enum is a registered Object type whose instances are named, frozen singletons: the same model as `tvm::Op`, generalised into an `Enum` base class usable from Python and C++ and converging on a single shared registry per `type_key`.

At a glance
Python
C++
The Python and C++ halves write to the same two `type_index`-keyed TypeAttr columns, so a Python subclass that binds `type_key="nn.Activation"` sees every C++-registered entry, and any later `auto()` / `entry(...)` from Python becomes visible to C++ readers of the same columns. Entries cross FFI as ordinary `ObjectRef`s, with no wire-format work.

Design
Enum instances are `EnumObj` subclasses. Each carries a dense auto-assigned `int64_t value` (0-indexed per class, declaration-order ordinal) and a `String name`. Both are populated at registration; neither is user-supplied.

Two per-class TypeAttr columns, shared across all call sites:
- `__ffi_enum_entries__`: `Dict<String, Enum>` mapping instance name → frozen singleton.
- `__ffi_enum_attrs__`: `Dict<String, List<Any>>` mapping attribute name → ordinal-indexed list.

Register-once-then-mutate. Each column is registered exactly once via `TVMFFITypeRegisterAttr`; every subsequent writer fetches the live container with `TVMFFIGetTypeAttrColumn` and mutates it in place. Distributed registration across TUs or Python modules converges on one set of containers.

Python variants are declared in one of four shapes, processed in `Enum.__init_subclass__`:
1. `name: ClassVar[Cls] = entry(**kwargs)`: registers a Python-side entry and forwards kwargs to `__init__`.
2. `name = entry(**kwargs)` (no annotation): same as 1, for attribute-carrying enums where `ClassVar` is noise.
3. `name = auto()` (or `name: ClassVar[Cls] = auto()`): registers a variant with no extra fields; the preferred form for simple enums.
4. `name: ClassVar[Cls]`: binds to a C++-registered entry of the same name, or registers a blank Python entry if none exists.

Within one class body, bare `ClassVar` binders resolve first (annotation order), then sentinel assignments (class-body order); auto-ordinals follow that combined order. Mixing all four forms on a single class is supported.

Auto-detected backend. `Enum.__init_subclass__(type_key=...)` routes the subclass through `@c_class` if the type is already registered in the FFI type system, otherwise through `@py_class`. There is no separate `py_enum` / `c_enum` opt-in.

Integer literals are rejected on the RHS. The auto-ordinal policy owns `value`, so `ok = 0` and `entry(0)` would either duplicate or conflict with the auto-ordinal. `auto()` is the intended replacement. `entry(value=...)` / `entry(name=...)` raise `TypeError` at class-body time.

New public interfaces
C++ headers
- `include/tvm/ffi/enum.h`: `EnumObj` (`int64_t value`, `String name`, both `def_ro`-reflected) and `Enum` (nullable `ObjectRef` wrapper), registered under type key `ffi.Enum`. Plus two column-name constants, `kEnumEntriesAttrName` (= `"__ffi_enum_entries__"`) and `kEnumAttrsAttrName` (= `"__ffi_enum_attrs__"`).
- `include/tvm/ffi/reflection/enum_def.h`: `refl::EnumDef<T>("name").set_attr("key", value)...`. Each call allocates a fresh ordinal, constructs the instance, and writes it into the per-class registry. Duplicate names for the same `T` raise `RuntimeError`. Exposes `.instance()` / `.ordinal()` for tests / advanced callers.
- `include/tvm/ffi/tvm_ffi.h` transitively includes both new headers.

Python surface (`tvm_ffi.dataclasses`)
- `Enum`: base class, decorated `@dataclass_transform(field_specifiers=(Field, field, entry, auto))` so type checkers recognise `entry()` / `auto()` as dataclass-field specifiers.
- `entry(**kwargs)`, `auto()`: variant-declaration sentinels.
- `EnumAttrMap`: view over the shared `__ffi_enum_attrs__` column; `__getitem__` / `__setitem__` / `__contains__` / `get(default=...)`.
- `Cls.get(name)`, `Cls.entries()`, `Cls.def_attr(name, *, default=...)`, and three live class-level properties: `Cls.by_name` (`Dict[str, Enum]`), `Cls.by_value` (`List[Enum]` indexed by ordinal), `Cls.attr_dict` (`Dict[str, List[Any]]`). The class-level properties are backed by an internal `_ClassProperty` descriptor so they work without a metaclass.

Other user-visible changes
- `TVMFFITypeRegisterAttr` rejects duplicate `(type_index, attr_name)` writes. This reverses a previously relaxed "silent overwrite" behaviour. The enforced invariant is load-bearing for the register-once-then-mutate protocol; the error message points callers at that protocol.
- The default repr for `EnumObj` subclasses is `<type_key>.<name>` instead of the generic `type_key(field1=..., field2=...)` form. It is rendered by `ReprPrinter` after the `__ffi_repr__` hook check, so explicit overrides still take precedence.
- `MISSING` / `KWARGS` now render as `<MISSING>` / `<KWARGS>` via pointer-identity dispatch, replacing the generic `ffi.Object` fallback.
- `testing.TestEnumVariant` (in `src/ffi/testing/testing.cc`) now extends `EnumObj` and registers `Alpha` / `Beta` entries with a `code` attribute via `refl::EnumDef`. This is the canonical end-to-end demonstration of the builder and is exercised by the Python test suite.

Testing
- `uv run pytest tests/python/test_dataclass_enum.py -q`: 38/38 passing. Covers all four declaration forms, auto-ordinal assignment, frozen-singleton identity, rejection of `entry(value=...)` / `entry(name=...)`, `get` / `entries` / `by_name` / `by_value` / `attr_dict`, `def_attr` round-trips through the unified column, direct TypeAttr verification, the C++-backed happy path against `testing.TestEnumVariant`, mixed C++/Python entry registration, and the repr / sentinel behaviour.
- `uv run pytest tests/python -q`: 2246 passed, 16 skipped, 3 xfailed. No regressions.
- `pre-commit run --all-files`: clean.

C++ GoogleTest and Rust suites were not re-run; the enum builder is exercised end-to-end from the Python tests against `testing.TestEnumVariant`, and no Rust bindings were touched.
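The register-once-then-mutate protocol from the Design section can be illustrated with a small pure-Python model. This is a sketch under stated assumptions, not the `tvm_ffi` API: `register_attr`, `live_column` and `add_entry` are hypothetical helpers, the module-level dict stands in for the per-type TypeAttr storage, and entries are stored as bare ordinals rather than `Enum` objects.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical in-process stand-in for TypeAttr storage.
_COLUMNS: Dict[Tuple[str, str], Any] = {}

ENTRIES = "__ffi_enum_entries__"
ATTRS = "__ffi_enum_attrs__"


def register_attr(type_key: str, attr_name: str, value: Any) -> None:
    """Register a column exactly once; a duplicate registration raises."""
    key = (type_key, attr_name)
    if key in _COLUMNS:
        raise RuntimeError(f"duplicate TypeAttr {attr_name!r} on {type_key!r}")
    _COLUMNS[key] = value


def live_column(type_key: str, attr_name: str, factory: Callable[[], Any]) -> Any:
    """Return the live container, registering it on first use."""
    key = (type_key, attr_name)
    if key not in _COLUMNS:
        register_attr(type_key, attr_name, factory())
    return _COLUMNS[key]


def add_entry(type_key: str, name: str, **attrs: Any) -> int:
    """Allocate the next dense ordinal and mutate both columns in place."""
    entries = live_column(type_key, ENTRIES, dict)
    if name in entries:
        raise RuntimeError(f"duplicate entry {name!r} on {type_key!r}")
    ordinal = len(entries)
    entries[name] = ordinal
    columns = live_column(type_key, ATTRS, dict)
    for attr_name, value in attrs.items():
        column = columns.setdefault(attr_name, [])
        # Pad with None so the list stays indexable by ordinal.
        column.extend([None] * (ordinal + 1 - len(column)))
        column[ordinal] = value
    return ordinal


# Independent call sites converge on the same two containers.
add_entry("nn.Activation", "relu", code=1)
add_entry("nn.Activation", "gelu")
add_entry("nn.Activation", "tanh", code=3)
print(live_column("nn.Activation", ENTRIES, dict))
print(live_column("nn.Activation", ATTRS, dict))
```

Because every writer fetches the existing container instead of registering a new one, the duplicate check in `register_attr` never fires during normal use, which is the invariant the stricter `TVMFFITypeRegisterAttr` behaviour protects.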