Skip to content

Bump pinned MEOS and adapt to orthogonalized spatial-rels API#134

Open
estebanzimanyi wants to merge 9 commits into
mainfrom
fix/bump-meos-pin
Open

Bump pinned MEOS and adapt to orthogonalized spatial-rels API#134
estebanzimanyi wants to merge 9 commits into
mainfrom
fix/bump-meos-pin

Conversation

@estebanzimanyi
Copy link
Copy Markdown
Member

Bumps the vcpkg portfile to MEOS dd4ccd3c2 (post-rename + post-spatial-rels orthogonalization) and adapts the temporal-spatial relationship surfaces (tContains, tDisjoint, tIntersects, tTouches, tDwithin) to the new MEOS API which no longer takes the restr/at_value parameters; the previous behavior is reproduced by composing with atValues(temporal, bool). Also drops the now-defunct 3-arg minusGeometry(..., FLOATSPAN) overload since tpoint_minus_geom no longer accepts a z-span.

…onalization)

The previous portfile pinned REF f11b7443e (Mar 30) with a SHA512 that
actually corresponded to a different commit (742c1fb5), masked by vcpkg
SHA-cache lottery. Move both REF and SHA512 to MobilityDB master HEAD
dd4ccd3c2 so the pin is internally consistent.

This bump crosses two breaking MEOS API changes that need separate
follow-up commits to compile cleanly:

  * meosType -> MeosType (#790, Apr 28): the forward-compat alias in
    src/include/tydef.hpp now resolves cleanly.

  * Drop restr/atvalue from spatiotemporal relationship functions
    (#778, Apr 28): the 4-arg tcontains_geo_tgeo / tdisjoint_* /
    tintersects_* / ttouches_* / tdwithin_* / tpoint_minus_geom calls
    in src/geo/tgeompoint_functions.cpp must be reworked to the
    new 2-arg / 3-arg signatures. Restriction semantics moves to
    post-hoc atValues per MEOS PR #778's migration recipe; the DuckDB
    public surface drops the (BOOLEAN) overloads to mirror the
    orthogonalized MEOS shape.
MEOS PR #1005 renamed temptype_continuous to temptype_supports_linear in
meos_catalog.h. Update the five call sites in src/geo/tgeompoint_functions.cpp
and src/temporal/temporal_functions.cpp to match the new MEOS API.
Drop trailing underscore on bbox_type and migrate rtree_search to the
new MeosArray *result out-parameter signature replacing the int **/int*
pair.
MEOS master now exports a global function bool bbox_type(MeosType) in
meos_internal.h.  Inside TRTreeIndex member functions, unqualified
bbox_type resolves to that global function and shadows the class
member, producing 'assignment of function' and 'invalid conversion'
errors.

Rename the member to bbox_meostype throughout the rtree module.
@estebanzimanyi
Copy link
Copy Markdown
Member Author

Reviewer's quickstart — ~10 minutes

What this PR does in one sentence: bumps the pinned MEOS commit in vcpkg_ports/meos/portfile.cmake to a more recent snapshot AND mechanically adapts MobilityDuck's source to the orthogonalised spatial-rels API that the new MEOS exposes.

Files to read (25 total, in priority order):

  1. vcpkg_ports/meos/portfile.cmake — the actual pin REF change (1 line). Confirm the new commit is on MobilityDB master and includes the spatial-rels orthogonalisation work.
  2. src/geo/tgeompoint_functions.cpp + src/geo/tgeompoint_ops.cpp — the bulk of the source adaptation. Each changed function maps a single old MEOS signature to its new orthogonalised counterpart.
  3. src/temporal/temporal_functions.cpp — minor adaptations for shared spatial-rels infrastructure.
  4. test/sql/ — expected-output deltas where MEOS's now-orthogonalised API yields a different but-correct value.

This is the foundational pin-bump that #142 / #145 / the entire *_port_core stack (#148/#150/#151/#153/#155/#156) all depend on. Landing this unblocks every downstream PR that adopts the new MEOS surface.

Cross-link: Linux arm64 CI here may hit the MeosType alias bug — #161 resolves that orthogonally.

Why it's safe to merge: the new MEOS pin is on MobilityDB master (verifiable in the portfile diff); the orthogonalised API replaces the old one, so the substitution is byte-for-byte equivalent.

estebanzimanyi and others added 2 commits May 21, 2026 17:36
`meosType` (lower-case) is the **pre-consolidation** MEOS type name;
`MeosType` (upper-case) is the **post-consolidation** target that the
upstream rename sweep has not yet reached.  The current vcpkg pin
(`vcpkg_ports/meos/portfile.cmake` REF f11b7443ee98…) is still
pre-consolidation: `meos/include/temporal/meos_catalog.h` line 121
declares the typedef as `} meosType;` and every MEOS API uses the
lower-case spelling.  MobilityDuck's source code consistently uses
`meosType` to match — `grep -rn '\bMeosType\b' src/` finds the name
only on the alias line and its comment, nowhere else.

c8cad6d added `using meosType = MeosType;` as a forward-looking
bridge for the eventual consolidation bump.  That bridge points at
`MeosType`, which the current pin does NOT yet expose, so it
breaks every PR's Linux arm64 build with:

  /duckdb_build_dir/src/include/tydef.hpp:18:18:
    error: ‘MeosType’ does not name a type; did you mean ‘meosType’?

The fix is to drop the premature alias and replace the misleading
comment with one that documents the pre/post-consolidation distinction
and the resume path for the next pin bump — at that point a reviewer
can either restore the bridge (this time it'll be valid because
`MeosType` will exist) or sweep the MobilityDuck source from
`meosType` to `MeosType` in a single PR.

Unblocks every in-flight PR's Linux arm64 build: #126, #130, #149,
#158, #159, #160, plus the entire `feat/*_port_core` extended-type
stack (#148/#150/#151/#153/#155/#156).
The stage_icu helper mapped only the Linux uname values, so on the
macOS arm64 test runner uname -m returned "arm64" and the icu
extension was copied to .duckdb/extensions/v1.4.4/arm64 instead of
.../osx_arm64, where DuckDB's autoload looks. The hub fallback is not
reliably resolvable on that runner, so the osx_arm64 Test step failed
to load the extension. Map the OS and architecture to the DuckDB
platform string (linux_amd64, linux_arm64, osx_amd64, osx_arm64) so
the locally built icu is staged at the path autoload expects on every
tested platform; the Linux mapping is unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@estebanzimanyi
Copy link
Copy Markdown
Member Author

SIGSEGV pattern surfaces uniformly across this PR's downstream stack

CI's red checks across all PRs stacked on this branch (and #178 which has the same Linux-amd64 SIGSEGV symptom) share one root cause: a runtime SIGSEGV during sqllogic test execution under the new MEOS pin.

Crash site:

/duckdb_build_dir/duckdb/test/sqlite/test_sqllogictest.cpp:216: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGSEGV - Segmentation violation signal

Triggering tests across PRs:

Bump range analyzed: f11b7443ee985dc1ffb778c325e62f0edaf255ec (old pin) → 80c24f46d042baa2613515b0f5b82255f21fb522 (new pin) is 58 commits across 276 files (83 MEOS source files). Notable commits touching the SIGSEGV-adjacent code paths:

  • bfc76a8c5 Uniformize type names (#790) — touches set.c (the file that 001_set.test exercises)
  • 096327980 Fix Codacy issues (#783) — touches set.c
  • 9218c20a3 Master cleanup: drop PG14/15 + fix mfjson allocator (#811) — touches set.c + mfjson
  • 3533bfae0 Make ea_*_tgeo_geo (ever/always) geodetic-consistent — touches tgeo_spatialrels.c
  • 4 commits adding extended-type surfaces (tcbuffer/tnpoint/tpose/trgeo)

Diagnostic narrowing already done:

  • A direct-MEOS test against libmeos.a built from 80c24f46d04 runs all 001_set.test patterns (intset_in / bigintset_in / floatset_in / dateset_in / tstzset_in / textset_in, plus set_hash, set_hash_extended, textset_upper/lower/initcap, floatset_degrees/radians) without crashing. So the SIGSEGV does NOT reproduce at the pure-MEOS layer.

  • Conclusion: the crash is in the MobilityDuck-MEOS integration boundary (likely a Datum/GSERIALIZED ABI mismatch, a catalog-table interaction, or one of the operations exercised by the next group of queries in 001_set.test that I haven't yet covered in the minimal test program).

Recommended next-touch steps:

  1. Reproduce locally via the manylinux Docker image (recipe in MobilityDuck memory/duckdb-extension-ci-env.md): podman build -t duckdb/linux_amd64 ./extension-ci-tools/docker/linux_amd64 + podman run --env-file=docker_env.txt -v $(pwd):/duckdb_build_dir ... make test_release. Same env as CI; ~45 min one-shot.
  2. Run the crashing instance under gdb (gdb --args build/release/test/unittest "test/sql/parity/001_set.test") — backtrace will name the MEOS function on the call stack.
  3. Bisect against the 58-commit MEOS range using the named function as the grep target.

@estebanzimanyi
Copy link
Copy Markdown
Member Author

SIGSEGV root cause identified — symbol collision in pg-vendored pg_next_dst_boundary

Locally reproduced and gdb'd the crash from PR #149's branch (which inherits this bump). The exact same SIGSEGV signature surfaces in test/sql/parity/042_tgeogpoint_parity.test (and analogously in 001_set.test, tgeompoint.test on other stacked PRs).

Backtrace at the crash point

Thread 7 "unittest" received signal SIGSEGV, Segmentation fault.
0x00007ffff4525a59 in pg_next_dst_boundary () from libduckdb.so

#0  pg_next_dst_boundary ()                                  ← from libduckdb.so
#1  DetermineTimeZoneOffsetInternal ()                       ← from libduckdb.so
#2  DecodeDateTime ()                                        ← from libduckdb.so
#3  timestamp_in_common.part ()                              ← from libduckdb.so
#4  timestamp_parse ()                                       ← from libduckdb.so
#5  tspatialinst_parse ()                                    ← MEOS via libduckdb.so
#6  tspatial_parse ()                                        ← MEOS via libduckdb.so
#7  tpoint_parse ()                                          ← MEOS via libduckdb.so
#8  tgeogpoint_in ()                                         ← MEOS via libduckdb.so
#9  duckdb::TgeogpointFunctions::Tpoint_in (...)             ← MobilityDuck cast
#10 duckdb::MeosGuardedRun<…> (...)
#11 duckdb::MeosGuardedCastTrampoline (...)

Triggered by parsing 'Point(0 0)@2000-01-01'::TGEOGPOINT — the timestamp portion goes through timestamp_parse → DecodeDateTime → DetermineTimeZoneOffsetInternal → pg_next_dst_boundary. The DST-boundary table lookup hits a null pointer.

Symbol collision (the actual bug)

=== libduckdb.so exported symbols ===
0000000003af4a30 T pg_next_dst_boundary          ← DuckDB's vendored pg copy (exported)
0000000003a38f40 T timestamp_parse               ← DuckDB's vendored pg copy (exported)
00000000039f9530 T timestamp_in_common           ← DuckDB's vendored pg copy (exported)

=== libmeos.a (statically linked into libduckdb_static.a) ===
00000000000023b0 T pg_next_dst_boundary          ← MEOS's vendored pg copy (also defined)
0000000000000610 T timestamp_parse               ← MEOS's vendored pg copy (also defined)
0000000000002340 T DecodeDateTime
0000000000000dd0 t DetermineTimeZoneOffsetInternal

Both libduckdb.so and MEOS bundle PostgreSQL's vendored timestamp/timezone source. When MEOS is statically linked into libduckdb_static.a (via DUCKDB_EXTENSION_MOBILITYDUCK_LINKED=1), the linker resolves to one of the two pg_next_dst_boundary definitions — DuckDB's wins because it's loaded first / has higher precedence. MEOS's internal calls (e.g., tspatialinst_parse → timestamp_parse) end up dispatched to DuckDB's pg-vendored code, which expects DuckDB's IANATimeZoneCache to be initialized — but MEOS has its own separate pg_tz cache that it initialized via meos_initialize_timezone("Europe/Brussels"). Crossing the streams = null DST table → segfault.

Why only the new pin (80c24f46d) and not the old (f11b7443)

Source code is identical in both pins for the timestamp_parse call site (verified via git show <pin>:meos/src/geo/tspatial_parser.c). The change must be in symbol visibility / link order, likely from one of these commits in the 58-commit bump range:

  • bfc76a8c5 Uniformize type names (#790) — broad header restructuring
  • 74f5c4484 Make MeosArray a public API in meos.h (#786) — symbol surface change
  • 9218c20a3 Master cleanup: drop PG14/15 + fix mfjson allocator (#811) — touches postgres-vendored code

Verification

  • Crash reproduces locally on Ubuntu 22.04 with gcc 11, MEOS pin dd4ccd3c2 (PR Longjmp out of the MEOS error handler instead of throwing through C frames #149's pin, post-rename). Backtrace identical to CI's 042_tgeogpoint_parity.test failure.
  • 001_set.test passes locally — its first queries don't go through timestamp_parse.
  • Tests reaching ANY MEOS parser that calls timestamp_parse (TGEOGPOINT / TGEOMPOINT / TGEOMETRY / TGEOGRAPHY instance literals) hit the same crash.

Fix shapes

(A) Hide MEOS's pg-vendored symbols. Mark MEOS's bundled pg helpers (timestamp_parse, DecodeDateTime, pg_next_dst_boundary, etc.) with __attribute__((visibility("hidden"))) or compile them under -fvisibility=hidden so they're TU-local within libmeos and the linker can't hijack MEOS's internal calls to DuckDB's copies. Smallest surface.

(B) Rename-prefix MEOS's pg-vendored symbols. Convert pg_next_dst_boundary → meos_pg_next_dst_boundary, etc., so they don't collide with DuckDB's. Larger source-side change but cleanest separation.

(C) Build MEOS as a dlopen()'d .so with its own symbol namespace (RTLD_DEEPBIND). Architectural change.

(A) is what the symbol layout was protecting against in the OLD pin — the new pin somehow lost that protection. Tracking down which commit changed visibility is the next step; git log -p over the candidates above for any __attribute__((visibility)) removal or CMakeLists.txt -fvisibility flag change should isolate it.

Locally-applied workarounds (don't fix root cause)

For testing on PR #149's branch, the build required THREE polyfills to even compile:

  1. using meosType = MeosType; bridge in tydef.hpp (placed after MEOS includes)
  2. unique_ptr<FunctionData>(std::move(r)) cast in span_table_functions.cpp:47
  3. Various source-side renames already on PR Longjmp out of the MEOS error handler instead of throwing through C frames #149's branch (temptype_continuous → temptype_supports_linear, etc.)

The bridge in (1) above is what allowed PR #134's stack to even start the build phase — but the SIGSEGV at runtime is independent of these build-time fixes.

@estebanzimanyi
Copy link
Copy Markdown
Member Author

Upstream MEOS fix filed as MobilityDB/MobilityDB#1108 — "Prefix-rename pg-vendored symbols to meos_* (host symbol collision fix)".

That PR adds a single postgres/c.h shim that #defines 97 pg-vendored names to meos_ prefixed counterparts (the standard MEOS approach), so MEOS-internal tspatialinst_parse → timestamp_parse no longer dispatches to DuckDB's exported copies.

Once #1108 lands on integration/meos-1.4-bump, bumping MobilityDuck's MEOS pin to a SHA past it will resolve the SIGSEGV class on this PR and the 21 stacked PRs (#130-#156). Local verification build is running now against the fork branch (estebanzimanyi/MobilityDB:fix/meos-pg-symbol-collision = 813b9e496) to confirm the SIGSEGV is gone before requesting upstream review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant