
Replace regex Builder with JSON codegen and bump bindings to current MEOS #3

Open
estebanzimanyi wants to merge 9 commits into main from bump/meos-1.3-via-meos-idl

Conversation

@estebanzimanyi
Member

@estebanzimanyi estebanzimanyi commented May 13, 2026

Replaces the regex-based MEOS.NET.Builder with a JSON-driven tools/codegen.py that consumes MobilityDB/MEOS-API's unified meos-idl.json catalog and emits MEOSExternalFunctions.cs + MEOSExposedFunctions.cs, regenerating the full current MEOS surface (rtree builders, type-specific tfloatbox/tintbox helpers, span_to_tbox / set_to_tbox / spanset_to_tbox, tdistance_tfloat_float / tdistance_tnumber_tnumber, the ever/always family, temporal_round / temporal_derivative, numspan_width / numspanset_width, and the geo helpers). MEOS.NET.Builder and its solution entry are removed. codegen.py takes a --dll-path option defaulting to the bare name "meos" so the OS loader resolves it via LD_LIBRARY_PATH / DYLD_LIBRARY_PATH / PATH rather than a hardcoded developer path.

The hand-written wrappers under MEOS.NET/Types/ are adapted to the regenerated shape at the call sites:

- the int/bool boundary: the bindings marshal C _Bool as C# bool via UnmanagedType.U1, so predicates no longer wrap calls in != 0 and the lower_inc / upper_inc / bounding-box / ignoreGaps flags pass as bool;
- ulong size_t on the *_from_wkb size args;
- 64-bit timestamp parameters;
- meos_initialize split into meos_initialize + meos_initialize_timezone + meos_initialize_error_handler, with the C# error-handler delegate marshaled via Marshal.GetFunctionPointerForDelegate and held in a static field so the GC does not collect it while MEOS holds the pointer;
- the generic Temporal.FromMFJson(string) replaced by typed factories on TemporalBoolean / TemporalFloat that call tbool_from_mfjson / tfloat_from_mfjson directly.

A CI workflow regenerates the bindings from meos-idl.json, builds the solution and runs ExampleApp against a freshly built libmeos, so a future MEOS surface change fails the build instead of drifting silently. It stacks on MEOS-API PR #1 (the postgres integer-typedef stub that keeps int64/TimestampTz 64-bit) and PR #2 (shape metadata) until both land on MEOS-API master, after which MEOS_API_REF reverts to master.

Solution builds 0-error / 0-warning across MEOS.NET, MEOS.NET.NpgSql, ExampleApp and MEOS.NET.Tests, and ExampleApp round-trips against libmeos.

Adds tools/codegen.py: consumes MEOS-API's meos-idl.json
(github.com/MobilityDB/MEOS-API) and emits MEOSExternalFunctions.cs
and MEOSExposedFunctions.cs.  Replaces the regex-based
MEOS.NET.Builder which had known defects: `int32_t srid` rendered
as `int_t srid`, throws on empty argument lists, hardcoded
developer DllPath, single-line-only regex.
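
The defects listed above are the kind a table-driven emit step sidesteps. A minimal sketch, assuming each parameter arrives from the catalog as a whole `(C type, name)` pair; the `C_TO_CSHARP` table and both helpers are hypothetical and are not the actual contents of tools/codegen.py:

```python
# Hypothetical C-to-C# type table; the real codegen.py may map types differently.
C_TO_CSHARP = {
    "int": "int",
    "int32_t": "int",
    "int64_t": "long",
    "bool": "bool",
    "double": "double",
    "size_t": "ulong",
    "void": "void",
}


def map_type(c_type: str) -> str:
    """Map a whole C type string to a C# type; any pointer becomes IntPtr."""
    c_type = c_type.replace("const ", "").strip()
    if c_type.endswith("*"):
        return "IntPtr"
    # Whole-token lookup: an unknown type raises KeyError instead of being
    # partially rewritten into something like `int_t`.
    return C_TO_CSHARP[c_type]


def render_params(params: list[tuple[str, str]]) -> str:
    """Render (c_type, name) pairs; an empty argument list yields ''."""
    return ", ".join(f"{map_type(t)} {n}" for t, n in params)
```

With this shape, `render_params([("int32_t", "srid")])` gives `int srid`, and an empty argument list renders as an empty string rather than throwing.
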

The regenerated bindings cover the full MEOS 1.3 public surface:
2397 functions (was 1281, +1116).  DllPath is the bare name
"meos" so the OS loader resolves via LD_LIBRARY_PATH /
DYLD_LIBRARY_PATH / PATH — no hardcoded developer paths.

The high-level C# wrappers under MEOS.NET/Types/ are hand-written
and call into these bindings; the MEOS 1.3 surface adds, renames,
and re-types ~1100 names.  `dotnet build MEOS.NET/MEOS.NET.csproj`
against the new bindings surfaces every adaptation site as a
compiler error: 137 to work through (predicates that flipped from
bool to int, renamed helpers like tfloat_round / tfloat_derivative
that now live under tnumber_* or have type-suffixed names).  Those
follow-ups stay outside this commit so the binding refresh is
mechanically reviewable on its own.

CFunctionDeclaration.HasUndefinedElements no longer rejects
declarations with empty argument lists (valid C, e.g. `int foo()`)
so the legacy Builder still runs if needed.

MEOS 1.3 makes three sweeping shape changes that the regenerated
bindings expose:

1. Every predicate that used to return `bool` now returns `int`
   (0/1). 86 call sites under MEOS.NET/Types/ wrapped with
   `!= 0` so each caller's domain bool semantics stays explicit
   instead of hidden under a generated wrapper.

2. Every `bool` parameter (span_make's lower_inc/upper_inc,
   asMfJson's bounding-box flag, temporal_duration's ignoreGaps,
   ...) flipped to `int`. 29 call sites cast `b ? 1 : 0` at the
   boundary.

3. Several helpers renamed (`tfloat_round` ->
   `temporal_round`, `tfloat_derivative` -> `temporal_derivative`,
   `tfloat_ever_eq` -> `ever_eq_tfloat_float` and family,
   `span_width` -> `numspan_width`, `spanset_width` ->
   `numspanset_width`, `distance_tnumber_tnumber` ->
   `tdistance_tnumber_tnumber`, `distance_tfloat_float` ->
   `tdistance_tfloat_float`, `tstzspan_to_tbox` -> `span_to_tbox`,
   `tstzset_to_tbox` -> `set_to_tbox`, `tstzspanset_to_tbox` ->
   `spanset_to_tbox`).  Call sites updated directly; no compat
   aliases in the codegen layer.
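
Since there are no compat aliases, a leftover old name only shows up as a compiler error. A small sketch of a pre-build scan for stale names, using a subset of the renames listed above; the `find_stale_calls` helper is hypothetical and not part of the PR:

```python
import re

# Old -> new names, copied from the rename list above (subset).
RENAMES = {
    "tfloat_round": "temporal_round",
    "tfloat_derivative": "temporal_derivative",
    "span_width": "numspan_width",
    "spanset_width": "numspanset_width",
    "distance_tnumber_tnumber": "tdistance_tnumber_tnumber",
    "distance_tfloat_float": "tdistance_tfloat_float",
    "tstzspan_to_tbox": "span_to_tbox",
    "tstzset_to_tbox": "set_to_tbox",
    "tstzspanset_to_tbox": "spanset_to_tbox",
}

# Identifier-boundary match, so the new `numspan_width` is not reported
# as a stale `span_width`, nor `tdistance_*` as a stale `distance_*`.
STALE = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(map(re.escape, RENAMES))
    + r")(?![A-Za-z0-9_])"
)


def find_stale_calls(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, old name, new name) for each stale identifier."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in STALE.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits
```

Run over the text of each file under MEOS.NET/Types/, this would list every call site still using a pre-rename helper along with its replacement.
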

Also:

- `meos_initialize` is now zero-argument. `MEOSLifecycle.Initialize`
  splits into three calls (`meos_initialize`,
  `meos_initialize_timezone`, `meos_initialize_error_handler`) and
  marshals the C# error-handler delegate to a function pointer via
  `Marshal.GetFunctionPointerForDelegate`, keeping a static
  reference alive to prevent GC collection while MEOS holds the
  pointer.
- `temporal_from_mfjson` now requires a `meosType` enum; the
  generic `Temporal.FromMFJson(string)` factory cannot pick a
  subtype from JSON content alone, so it was removed.  Subtype-
  specific factories live on subclasses (TemporalBoolean.FromMFJson
  via `tbool_from_mfjson`, TemporalFloat.FromMFJson via
  `tfloat_from_mfjson`).
- `*_from_wkb` size argument is `int` (was `ulong`).  Four call
  sites drop the `(ulong)` cast.
- `tfloatseq_from_base_tstzspan` and the sequence-set form take an
  `interpType` enum; `MEOS.NET.Enums.InterpolationType` casts
  explicitly with `(int)`.

Solution builds 0-error / 0-warning across MEOS.NET,
MEOS.NET.Builder, MEOS.NET.NpgSql, ExampleApp, MEOS.NET.Tests.

build_pymeos_functions.py-style consolidation for the .NET side: when
meos-idl.json carries a shape.arrayReturn with an accessor-style length,
the regenerated wrapper now calls the sibling accessor on the wrapped input
and copies the result into a managed array with Marshal.Copy (long[] for
*set_values, double[] for floatset_values, int[] for intset_values, IntPtr[]
for the *_insts_p / *_sequences_p / spanset_spanarr family).  When shape carries
outputArrays the wrapper allocates one IntPtr buffer per declared
output, calls the external entry, and returns a value-tuple of typed
arrays (e.g. temporal_time_split now returns (IntPtr[], long[]) for the
result + time_bins pair, and tgeo_space_time_split returns
(IntPtr[], IntPtr[], long[])).  All bool returns and bool parameters
gain [MarshalAs(UnmanagedType.U1)] so the LibraryImport source
generator does not promote them to the 4-byte Windows BOOL: MEOS
emits C _Bool (1 byte) on every platform the .NET binding targets.
FloatSet.Values is rewritten to consume the new typed return.
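
The bool marshaling rule described above could be emitted roughly as follows. This is a sketch only: the catalog entry layout (`name`, `returns`, `params`) and the function name `example_predicate` are hypothetical, and the real meos-idl.json schema and codegen.py output may differ.

```python
def emit_extern(fn: dict) -> str:
    """Emit a C# LibraryImport declaration for one hypothetical catalog entry."""
    lines = ["[LibraryImport(DllPath)]"]
    if fn["returns"] == "bool":
        # Pin the return value to a 1-byte bool (C _Bool), not 4-byte BOOL.
        lines.append("[return: MarshalAs(UnmanagedType.U1)]")
    params = []
    for cs_type, name in fn["params"]:
        # Same 1-byte pinning for every bool parameter.
        attr = "[MarshalAs(UnmanagedType.U1)] " if cs_type == "bool" else ""
        params.append(f"{attr}{cs_type} {name}")
    signature = ", ".join(params)
    lines.append(f"public static partial {fn['returns']} {fn['name']}({signature});")
    return "\n".join(lines)


# Hypothetical entry, not a real MEOS signature.
EXAMPLE = {
    "name": "example_predicate",
    "returns": "bool",
    "params": [("IntPtr", "span"), ("bool", "inclusive")],
}
```

Calling `emit_extern(EXAMPLE)` produces a three-line declaration whose return value and `inclusive` parameter both carry the `UnmanagedType.U1` attribute, while the `IntPtr` parameter is left bare.
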

@estebanzimanyi estebanzimanyi requested review from Davichet-e and mschoema and removed request for mschoema May 15, 2026 05:29
Member

@Davichet-e Davichet-e left a comment

I'll take a closer look once it compiles. In the meantime, a few concerns:

  • There are still calls to renamed/removed functions (tfloat_derivative, tfloat_round, distance_tfloat_float, …). Same for MEOSLifecycle.Initialize, which doesn't match the new meos_initialize() / meos_initialize_error_handler(IntPtr) signatures.
  • Have you considered ClangSharp? It's a .NET-native libclang binding generator (used by .NET runtime itself for Win32/DirectX P/Invokes). Would keep the toolchain in C# and avoid the Python dependency. Could also use a hybrid approach: ClangSharp generates the raw P/Invoke layer from meos.h (guaranteed correct signatures), then a post-processing step enriches it with meos-api.json metadata (doc comments, ownership semantics, grouping) to produce the idiomatic public API. Best of both worlds — accurate FFI from the compiler, semantic knowledge from the catalog.
  • If tools/codegen.py is the new path, can we drop MEOS.NET.Builder/ rather than patch it?
  • Worth adding a CI job that regens + builds + runs tests against MEOS 1.3, so this doesn't drift again.
  • Minor: make DLL_PATH in codegen.py a CLI flag instead of hardcoding "meos".

codegen.py marshals C _Bool as C# bool (UnmanagedType.U1), so the
regenerated predicates return bool and the inclusive/flag parameters
take bool. Drop the now-invalid != 0 on bool-returning calls, pass the
flags as bool instead of (cond ? 1 : 0), and widen the *_from_wkb size
argument to ulong to match size_t. Solution builds 0-error / 0-warning
and ExampleApp round-trips against libmeos.

Builds libmeos from MobilityDB at MOBILITYDB_REF, regenerates the
bindings from MEOS-API's meos-idl.json for the same ref, builds the
solution and runs ExampleApp against the freshly built library so a
future MEOS surface change that breaks the wrappers fails the build
instead of drifting silently. Binding drift is reported but not gated,
since exact regen output also tracks the MEOS-API generator version.
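
"Reported but not gated" can be sketched as a diff step that always exits successfully. The helper names below are hypothetical and the actual workflow step may be implemented differently (for example, directly in shell):

```python
import difflib


def report_drift(committed: str, regenerated: str, name: str) -> list[str]:
    """Return unified-diff lines between committed and regenerated bindings."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            regenerated.splitlines(),
            fromfile=f"committed/{name}",
            tofile=f"regenerated/{name}",
            lineterm="",
        )
    )


def check(committed: str, regenerated: str, name: str) -> int:
    """Print any drift and return exit code 0 either way (informational only)."""
    diff = report_drift(committed, regenerated, name)
    if diff:
        print(f"binding drift in {name} (not gating):")
        print("\n".join(diff))
    return 0
```

The build and ExampleApp run remain the gating steps; this step only makes drift visible in the job log.
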

The committed bindings target the current MEOS surface (bool-marshalled),
generated from MEOS-API's feat/shape-metadata branch, not stable-1.3.
Pin MOBILITYDB_REF=master and MEOS_API_REF=feat/shape-metadata so the
regen, libmeos build and runtime smoke all agree, and update tools/README
to match.

PR #3 needs two unmerged MEOS-API changes: shape metadata
(feat/shape-metadata) and the postgres integer-typedef stub that keeps
int64/TimestampTz 64-bit (fix/stdbool-stub, commit fd83b28). Compose
them in the workflow by checking out feat/shape-metadata and
cherry-picking the fix. Revert to MEOS_API_REF=master once both land.
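
A guard of the following kind would catch a catalog in which the 64-bit types have narrowed. It is a sketch under loud assumptions: the `functions` / `params` / `ctype` / `resolved` fields are a hypothetical layout, not the documented meos-idl.json schema, and `example_fn` is a made-up function name.

```python
# C types that the typedef stub is meant to keep 64-bit.
SIXTY_FOUR_BIT = {"int64", "TimestampTz"}


def narrowed_params(catalog: dict) -> list[str]:
    """List parameters whose 64-bit C type resolved to anything but `long`."""
    bad = []
    for fn in catalog["functions"]:
        for p in fn["params"]:
            if p["ctype"] in SIXTY_FOUR_BIT and p["resolved"] != "long":
                bad.append(
                    f"{fn['name']}({p['name']}): {p['ctype']} resolved to {p['resolved']}"
                )
    return bad


# Hypothetical catalog fragment showing the failure mode.
BROKEN = {
    "functions": [
        {
            "name": "example_fn",
            "params": [
                {"name": "t", "ctype": "TimestampTz", "resolved": "int"},
                {"name": "n", "ctype": "int", "resolved": "int"},
            ],
        }
    ]
}
```

For the fragment above, `narrowed_params(BROKEN)` reports the `t` parameter and ignores the plain `int` one; failing the regen step on a non-empty result would surface the problem before any C# is emitted.
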

@estebanzimanyi estebanzimanyi changed the title from "Bump bindings to MEOS 1.3 via meos-idl.json codegen" to "Replace regex Builder with JSON codegen and bump bindings to current MEOS" May 16, 2026
@estebanzimanyi
Member Author

Thanks for the careful look. Addressed in commits on the branch (043c058, 449e711, e08fbe6, 44ad430, 285e6c4, a88a10f), and it surfaced a real upstream bug along the way.

1. Renamed/removed calls + MEOSLifecycle.Initialize. The specific names you cited were already adapted before this review, in commit 0304630: the wrappers call temporal_derivative/temporal_round/tdistance_tfloat_float/tdistance_tnumber_tnumber, and MEOSLifecycle.Initialize is the meos_initialize() + meos_initialize_timezone(tz) + meos_initialize_error_handler(fnPtr) split. You were right that it did not compile, though: the real cause was a systematic int/bool boundary, not the renames. codegen.py marshals C _Bool as C# bool (UnmanagedType.U1), so the regenerated predicates return bool and the inclusive/flag parameters take bool, while the wrappers still did (... != 0) and (cond ? 1 : 0). e08fbe6 fixes all of those (and widens the *_from_wkb size arg to ulong). One more thing this turned up: the PR was mistitled "MEOS 1.3" but its committed bindings are the current MEOS surface, so it is retargeted to current MEOS (title/body updated). Solution now builds 0-error / 0-warning and ExampleApp round-trips against libmeos (WKB serialize/deserialize equality, the ever/always predicates, the error-handler path).

2. ClangSharp. The libclang-accurate FFI you're describing already exists, just factored upstream: MEOS-API generates meos-idl.json with libclang, so signatures are compiler-extracted, and codegen.py is the thin emit step that also consumes the catalog's ownership/array-shape metadata. That catalog is the single source the other bindings already share (PyMEOS, JMEOS, GoMEOS); keeping .NET on it avoids a .NET-only header parse drifting from the rest of the ecosystem. The Python is build-time only; the generated .cs is committed and CI is the only place it runs. Happy to add ClangSharp later as a second in-tree correctness cross-check, but I'd keep the JSON catalog as the semantic source rather than maintain two.

3. Drop MEOS.NET.Builder/. Done in 043c058: directory and its .sln entries removed; tools/codegen.py replaces it.

4. CI. Added in 44ad430/285e6c4/a88a10f: builds libmeos from MobilityDB, regenerates the bindings from MEOS-API for the same ref, builds the solution and runs ExampleApp against the freshly built library. It is green. It caught a real upstream bug immediately: without a real pg_config.h, MEOS-API's libclang parse left int64 undefined, so int64/TimestampTz/Timestamp/DateADT collapsed to 32-bit int in meos-idl.json (corrupting every timestamp signature for all consumers). Root-cause fix is MobilityDB/MEOS-API#1. The workflow stacks on MEOS-API#1 (that fix) and MEOS-API#2 (shape metadata) by cherry-pick until both land on MEOS-API master, at which point MEOS_API_REF reverts to master.

5. DLL_PATH flag. Done in 449e711: --dll-path, default "meos".
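
For point 5, the flag amounts to something like the sketch below. Only `--dll-path` and its default `"meos"` come from the thread; the helper names and the exact C# line emitted are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line parser with the --dll-path option and its bare-name default."""
    parser = argparse.ArgumentParser(prog="codegen.py")
    parser.add_argument(
        "--dll-path",
        default="meos",
        help="library name or path baked into the generated bindings",
    )
    return parser


def emit_dll_const(dll_path: str) -> str:
    """Render the value as a C# string constant (hypothetical output shape)."""
    escaped = dll_path.replace("\\", "\\\\").replace('"', '\\"')
    return f'private const string DllPath = "{escaped}";'
```

With no arguments the constant is the bare name `meos`, which leaves resolution to the OS loader; passing `--dll-path /some/dir/libmeos.so` would pin it to an explicit file.
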
