examples(polars): Polars × PyMEOS TemporalParquet round-trip example (depends on PyMEOS #84) by estebanzimanyi · Pull Request #6 · MobilityDB/PyMEOS-Examples

estebanzimanyi · 2026-05-21T10:50:12Z

Adds a worked-out example of consuming TemporalParquet files from the Polars DataFrame engine, zero-copy via PyMEOS' pymeos.io data-lake interchange layer.

What's in the example

PyMEOS_Examples/Polars_TemporalParquet.py — a single self-contained script demonstrating the full round-trip:

1. Build a small temporal-point dataset using PyMEOS (3 trips, 4 instants each, EPSG:4326 near Brussels).
2. Write to TemporalParquet via pymeos.io.write_temporal: an opaque MEOS-WKB payload column, native-scalar sidecar columns (<col>__xmin/xmax/ymin/ymax/tmin/tmax), and a self-describing temporal footer in the Parquet schema metadata. Byte-compatible with files written by MobilityDuck's temporalFooter() consumer recipe, so files are portable across both tools.
3. Read back with PyMEOS: full TGeomPointSeq object reconstruction.
4. Consume the SAME file in Polars zero-copy via pl.from_arrow(pyarrow.parquet.read_table(path)). Polars sees the sidecar columns as native primitives, so its lazy / predicate-pushdown machinery works without decoding the MEOS-WKB payload. The temporal column appears as opaque BINARY, which is enough for analysts who don't need MEOS-aware operations on every column.
5. Sidecar-driven predicate pushdown: pyarrow.parquet.read_table(path, filters=[("trip__xmax", "<", 4.45)]) prunes row groups before any per-row decode.

The example shows the dual consumption model that motivates the data-lake layer: PyMEOS for MEOS-aware reads, Polars (or any Arrow-aware engine) for native-column analytics, both reading the same on-disk file.

Install caveat

The pymeos.io module ships in PyMEOS PR #84 (feat/datalake-consumer, OPEN at time of writing). Until #84 merges into PyMEOS master, install PyMEOS from the branch directly:

pip install "git+https://github.com/MobilityDB/PyMEOS.git@feat/datalake-consumer#egg=pymeos[parquet]"
pip install polars pyarrow

After #84 merges, the standard install path works with zero code change:

pip install "pymeos[parquet]" polars pyarrow

The script itself doesn't reference any branch-specific path — only pymeos.io, which is the stable public surface in PR #84.
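
A script that depends on pymeos.io could check for the module up front and point users at the branch install. The following is a minimal sketch using only the standard library; has_module is a hypothetical helper name and is not part of PyMEOS or of the example script.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located (hypothetical helper).

    For a dotted name, find_spec imports the parent package but not the
    leaf module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Parent package missing or failing to import, or a module without a spec.
        return False


if not has_module("pymeos.io"):
    print("pymeos.io not found: install PyMEOS from the feat/datalake-consumer branch")
```

Once #84 is merged and released, the check passes with the standard install and the message is never printed.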

Why this PR lands now rather than after #84 merges

Two reasons:

The example is the verification that pymeos.io's public surface is genuinely Polars-compatible. Writing it now surfaces any contract gaps while PR #84 is still in review (the script uses to_arrow, from_arrow, write_temporal, read_temporal, temporal_footer — the full public surface).
Adopters get an upfront recipe rather than waiting for a separate follow-up after #84 lands. Once #84 reaches master, the only change needed here is dropping the install caveat from the README.

The README's install instruction is explicit about the dependency, so users hitting the example before #84 lands aren't surprised.

File checklist

PyMEOS_Examples/Polars_TemporalParquet.py — the example script (~200 lines, single file, no other deps)
README.md — one new bullet indexing the example with the install caveat

What's NOT in scope here

Iceberg — Polars composes with Iceberg via pl.scan_iceberg, but that requires a live Iceberg catalog (e.g. Apache Polaris) and is gated on the MobilityDuck temporal_iceberg_scan UDF. Tracked separately per iceberg-readiness memo; an Iceberg-Polars composition example would land as a sibling here once those substrates exist.
Lazy scan_pyarrow_dataset: useful for multi-file Parquet datasets, but it adds complexity without changing the conceptual round-trip. An easy follow-up once adopters ask for it.

Adds PyMEOS_Examples/Polars_TemporalParquet.py demonstrating the zero-copy bridge between PyMEOS' data-lake interchange layer (`pymeos.io`) and the Polars DataFrame engine.

Round-trip covered:

1. Build a temporal-point dataset using PyMEOS (3 trips, 4 instants each)
2. Write to TemporalParquet via `pymeos.io.write_temporal`: opaque MEOS-WKB payload + native-scalar sidecar columns + self-describing `temporal` footer (byte-compatible with MobilityDuck's `temporalFooter()` consumer recipe)
3. Read back with PyMEOS: full PyMEOS object reconstruction
4. Consume the SAME file in Polars zero-copy via `pl.from_arrow`: Polars sees sidecar columns as native primitives
5. Sidecar-driven predicate pushdown via `pyarrow.parquet.read_table` `filters=[…]`: row-groups pruned before any per-row decode

Depends on the `pymeos.io` module shipping in PyMEOS PR #84 (`feat/datalake-consumer`). Until #84 reaches PyMEOS master, adopters install PyMEOS from the branch directly:

pip install "git+https://github.com/MobilityDB/PyMEOS.git@feat/datalake-consumer#egg=pymeos[parquet]"

After #84 merges, the standard `pip install pymeos[parquet]` path works without code changes. README updated to index the new example with the install caveat.

estebanzimanyi wants to merge 1 commit into MobilityDB:main from estebanzimanyi:feat/polars-temporalparquet
