From 5814770faefc5dd08605aeda8a7743bf9bc39b61 Mon Sep 17 00:00:00 2001
From: Esteban Zimanyi
Date: Mon, 18 May 2026 17:51:56 +0200
Subject: [PATCH] Gate the portable bare-name dialect on the GoMEOS binding

Add the canonical portable bare-name dialect (RFC #920) parity gate so
GoMEOS exposes the same operator -> bareName surface as every other
MobilityDB binding/engine, verified by construction across all six
in-scope user-facing type families.

Stacks on the MEOS 1.4 bump (PR #3, bump/meos-1.4): the IDL-driven
codegen there emits tools/_preview/*.go from tools/meos-idl.json (the
MEOS-API parser's JSON output), wrapping every operator's own backing C
function for temporal, geo, cbuffer, npoint, pose and rgeo.

- tools/portable-aliases.json: the 29-pair contract, vendored
  byte-identical from MEOS-API meta/portable-aliases.json (single source
  of truth); --idl prefers the catalog's folded-in copy once present.
- tools/portable_parity.py: audits the exposed CGO symbol set -- every
  C.<name>( in the hand-written root *.go AND the IDL-driven
  tools/_preview/*.go. A bare name is backed iff some referenced MEOS
  symbol == bareName or startswith(bareName + "_"), with the contract's
  verified explicitBacking (nearestApproachDistance <- nad_*). Reads
  JSON-derived artifacts only -- never parses meos.h.
- tools/parity/parity_test.go: pure-Go, no import "C" -- a
  language-independent mirror with an identical verdict that runs in CI
  with no libmeos/CGO toolchain.
- .github/workflows/portable-parity.yml: runs both gates on push/PR.
- tools/PORTABLE_ALIASES.md: documents the dialect and the gate.

Result: 29/29 bare names backed, 0 unbacked, all six in-scope families
covered (temporal, geo, cbuffer, npoint, pose, rgeo) -- 100% parity. The
gate hard-fails on any in-scope family that is present in the surface
but unbacked, or absent: cbuffer/npoint/pose/rgeo are full user-facing
types and are never excluded from the parity headline.

The pre-1.4 tdistance version bridge is now inert -- MEOS 1.4 provides the canonical tdistance_* prefix. --- .github/workflows/portable-parity.yml | 40 ++++ tools/.gitignore | 3 + tools/PORTABLE_ALIASES.md | 93 +++++++++ tools/parity/parity_test.go | 269 ++++++++++++++++++++++++++ tools/portable-aliases.json | 60 ++++++ tools/portable_parity.py | 257 ++++++++++++++++++++++++ 6 files changed, 722 insertions(+) create mode 100644 .github/workflows/portable-parity.yml create mode 100644 tools/PORTABLE_ALIASES.md create mode 100644 tools/parity/parity_test.go create mode 100644 tools/portable-aliases.json create mode 100644 tools/portable_parity.py diff --git a/.github/workflows/portable-parity.yml b/.github/workflows/portable-parity.yml new file mode 100644 index 0000000..7574d86 --- /dev/null +++ b/.github/workflows/portable-parity.yml @@ -0,0 +1,40 @@ +name: portable-aliases parity + +# Gates the portable bare-name dialect (RFC #920; MEOS-API cross-repo +# handoff PR #9): GoMEOS's exposed CGO symbol set — the hand-written root +# package plus the IDL-driven generated surface (tools/_preview, emitted +# by tools/codegen.py from tools/meos-idl.json, the MEOS-API parser's +# JSON output) — must remain a superset of portableAliases.bareNames: +# 29/29, 0 unbacked, and every one of the six in-scope user-facing type +# families (temporal, geo, cbuffer, npoint, pose, rgeo) covered. +# cbuffer/npoint/pose/rgeo are full temporal types and are never excluded +# from the parity headline. +# +# Neither job needs libmeos/CGO: both audit the source statically, so the +# gate is self-contained and cannot be flaked by the C toolchain. 
+ +on: + push: + pull_request: + +jobs: + parity: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - uses: actions/setup-python@v5 + with: + python-version: '3.11' + + - name: Parity gate (script — exposed symbol set ⊇ bareNames) + run: python3 tools/portable_parity.py --check + + - uses: actions/setup-go@v5 + with: + go-version: '1.23' + + - name: Parity gate (Go test — language-independent mirror) + env: + CGO_ENABLED: '0' + run: go test ./tools/parity/ -v diff --git a/tools/.gitignore b/tools/.gitignore index c18dd8d..31d89b1 100644 --- a/tools/.gitignore +++ b/tools/.gitignore @@ -1 +1,4 @@ __pycache__/ + +# Generated by portable_parity.py (the gate writes it; do not commit) +portable-parity.report.json diff --git a/tools/PORTABLE_ALIASES.md b/tools/PORTABLE_ALIASES.md new file mode 100644 index 0000000..5c879c5 --- /dev/null +++ b/tools/PORTABLE_ALIASES.md @@ -0,0 +1,93 @@ +# GoMEOS portable bare-name parity (RFC #920) + +The MobilityDB ecosystem defines a canonical, type-agnostic +operator → bare-name dialect so one program/query runs identically on +every engine and binding — a user learns one reference and assumes the +rest. The contract is **29 operator → bareName pairs** across eight +families (topology, time-position, space X/Y/Z, temporal-comparison, +distance, same); the single source of truth is MEOS-API +`meta/portable-aliases.json` (discussion MobilityDB#861 · RFC #920 · +native MobilityDB#1075 · manual MobilityDB#1078 · cross-repo handoff +MEOS-API PR #9). + +`tools/portable-aliases.json` is that contract, vendored +**byte-identical**. (`tools/portable_parity.py --idl tools/meos-idl.json` +prefers the catalog's folded-in `portableAliases` automatically once a +re-vendored MEOS-API catalog carries it; the MEOS 1.4 catalog vendored +today predates the MEOS-API #8 fold-in, so the SoT copy is used by +default and the gate stays self-contained.) 
+ +## Everything is leveraged from the MEOS-API JSON — never header parsing + +This gate consumes only JSON-derived artifacts: + +- **The contract** is MEOS-API's `meta/portable-aliases.json` (JSON), + vendored byte-identical. +- **The binding surface** it measures is `tools/_preview/*.go`, which + `tools/codegen.py` generates from `tools/meos-idl.json` — the output + of the MEOS-API parser. This is the exact GoMEOS analogue of MEOS.NET + gating its generated P/Invoke file and PyMEOS/JMEOS gating their + codegen: every binding derives the *same* dialect from the *one* + catalog. + +The gate's only source-level step is reading `C.<name>(` tokens from +the generated/hand-written Go to learn which MEOS functions the binding +actually references. It does **not** parse `meos.h`; the C surface it +trusts is the one the MEOS-API JSON described. + +## What "backed" means + +GoMEOS wraps MEOS directly through CGO, and MEOS C already names every +operator's backing function `...` (`&&` → `overlaps_*`, +`#<` → `tlt_*`, `~=` → `same_*`, `<->` → `tdistance_*`), so the binding +exposes the dialect **by construction**: every portable name reuses the +operator's *own* backing C function, never a reimplementation, with no +type-qualified or per-binding form. + +A bare name is **backed** when the exposed CGO symbol set (every +`C.<name>(` in the hand-written root `*.go` and the IDL-driven +`tools/_preview/*.go`) contains a MEOS function whose name `== bareName` +or `startswith(bareName + "_")`, with one verified, *non-guessed* +fallback from the contract: `nearestApproachDistance` (`|=|`) ← the +`nad_*` family (`explicitBacking`). + +The `tdistance` (`<->`) version bridge from the pre-1.4 era +(`distance_t*`) is now **inert**: MEOS 1.4 — vendored here via GoMEOS +PR #3 — provides the canonical `tdistance_*`, so `tdistance` resolves +directly by prefix. The bridge is retained only as a documented safety +net if the gate is pointed at an older header. 
+ +## Six-family scope — fully covered + +`temporal`, `geo`, `cbuffer`, `npoint`, `pose`, `rgeo` are **all full +user-facing temporal types** and are **never excluded from the parity +headline**. Stacked on the MEOS 1.4 bump (GoMEOS PR #3), the IDL-driven +generated surface backs every one of the 29 bare names across **all +six** families. The gate hard-fails on any family that is *present in +the surface but unbacked* (a real exclusion — `regressed`) or *absent* +(`pending`); neither is tolerated. + +Live result: **29 / 29 bare names backed, 0 unbacked, six / six families +covered** (`temporal`, `geo`, `cbuffer`, `npoint`, `pose`, `rgeo`). + +## Verifying parity + +```sh +python3 tools/portable_parity.py --check # writes tools/portable-parity.report.json +CGO_ENABLED=0 go test ./tools/parity/ -v # language-independent mirror +``` + +Both exit non-zero unless **29/29 bare names are backed, 0 unbacked, +all six in-scope families covered**. They are byte-for-byte equivalent +in verdict (the Go test mirrors the Python script — the analogue of +MEOS-API's `portable_parity.py`, MobilityDB's +`tools/portable_aliases/generate.py --check`, and the MEOS.NET / PyMEOS +parity gates). The same two checks run in the `portable-aliases parity` +CI workflow; neither needs libmeos or CGO. + +## Provenance + +Discussion MobilityDB#861 · RFC #920 · native MobilityDB#1075 · manual +MobilityDB#1078 · MEOS-API cross-repo handoff PR #9. Stacks on GoMEOS +PR #3 (`bump/meos-1.4`), which vendors the MEOS 1.4 headers and the +IDL-driven codegen. diff --git a/tools/parity/parity_test.go b/tools/parity/parity_test.go new file mode 100644 index 0000000..f103e09 --- /dev/null +++ b/tools/parity/parity_test.go @@ -0,0 +1,269 @@ +// Package parity is the language-independent mirror of +// tools/portable_parity.py — the GoMEOS portable bare-name parity gate +// (RFC #920; MEOS-API cross-repo handoff PR #9). 
+// +// A binding is done when its exposed symbol set ⊇ portableAliases.bareNames, +// verified with the same prefix logic as MEOS-API portable_parity.py: a +// bare name is backed iff some referenced MEOS symbol == bareName or +// startsWith(bareName + "_"), falling back to the contract's verified +// explicitBacking prefixes (nearestApproachDistance ← the nad_* family). +// 0 unbacked, no per-binding exceptions, across all six in-scope +// user-facing type families (temporal, geo, cbuffer, npoint, pose, rgeo) +// — cbuffer/npoint/pose/rgeo are never excluded from the parity headline. +// +// "Exposed symbol set" = every MEOS C function the CGO layer references +// (C.<name>( ) in the hand-written root package AND the IDL-driven +// generated surface under tools/_preview (emitted by tools/codegen.py +// from tools/meos-idl.json — the MEOS-API parser's JSON output). The +// operators' own backing functions, reused by construction, never +// reimplemented. +// +// This package has no `import "C"`, so the gate runs in CI with no +// libmeos/CGO toolchain, exactly like MEOS.NET's managed test mirror. +// Its verdict is identical to the Python script's, by construction. +package parity + +import ( + "encoding/json" + "os" + "path/filepath" + "regexp" + "sort" + "strings" + "testing" +) + +// inScopeFamilies — full user-facing temporal type families. +// cbuffer/npoint/pose/rgeo are NOT internals and are never excluded from +// the parity headline. +var inScopeFamilies = []string{ + "temporal", "geo", "cbuffer", "npoint", "pose", "rgeo", +} + +// bindingBacking is inert at MEOS 1.4 (canonical tdistance_* is present); +// retained as a documented safety net for older/partial header scans. 
+var bindingBacking = map[string][]string{ + "tdistance": {"distance_tfloat", "distance_tint", + "distance_tnumber", "distance_tpoint"}, +} + +var cgoRe = regexp.MustCompile(`\bC\.([A-Za-z_]\w*)\s*\(`) + +var cgoPseudo = map[string]bool{ + "CString": true, "CBytes": true, "GoString": true, "GoBytes": true, + "free": true, "malloc": true, "calloc": true, +} + +func repoRoot(t *testing.T) string { + t.Helper() + dir, err := os.Getwd() + if err != nil { + t.Fatalf("getwd: %v", err) + } + for { + if _, err := os.Stat(filepath.Join(dir, "go.mod")); err == nil { + return dir + } + parent := filepath.Dir(dir) + if parent == dir { + t.Fatal("could not locate repo root (go.mod)") + } + dir = parent + } +} + +func scanDir(t *testing.T, dir string, syms map[string]bool) int { + t.Helper() + entries, err := os.ReadDir(dir) + if err != nil { + return 0 // optional source (e.g. tools/_preview absent) + } + files := 0 + for _, e := range entries { + n := e.Name() + if e.IsDir() || !strings.HasSuffix(n, ".go") || + strings.HasSuffix(n, "_test.go") { + continue + } + b, err := os.ReadFile(filepath.Join(dir, n)) + if err != nil { + t.Fatalf("read %s: %v", n, err) + } + for _, m := range cgoRe.FindAllStringSubmatch(string(b), -1) { + if !cgoPseudo[m[1]] { + syms[m[1]] = true + } + } + files++ + } + return files +} + +// exposedSymbols = hand-written root *.go ∪ IDL-driven tools/_preview/*.go. 
+func exposedSymbols(t *testing.T, repo string) map[string]bool { + t.Helper() + syms := map[string]bool{} + scanDir(t, repo, syms) + scanDir(t, filepath.Join(repo, "tools", "_preview"), syms) + if len(syms) == 0 { + t.Fatal("no CGO symbols found — repo layout changed?") + } + return syms +} + +type pair struct{ op, bare, fam string } + +func contract(t *testing.T, repo string) ([]pair, map[string][]string) { + t.Helper() + p := filepath.Join(repo, "tools", "portable-aliases.json") + b, err := os.ReadFile(p) + if err != nil { + t.Fatalf("vendored portable-aliases SoT missing: %v", err) + } + var doc struct { + Families map[string][]struct { + Operator string `json:"operator"` + BareName string `json:"bareName"` + } `json:"families"` + ExplicitBacking map[string][]string `json:"explicitBacking"` + } + if err := json.Unmarshal(b, &doc); err != nil { + t.Fatalf("parse contract: %v", err) + } + var pairs []pair + for fam, lst := range doc.Families { + for _, e := range lst { + pairs = append(pairs, pair{e.Operator, e.BareName, fam}) + } + } + sort.Slice(pairs, func(i, j int) bool { + return pairs[i].bare < pairs[j].bare + }) + return pairs, doc.ExplicitBacking +} + +func matches(symbols map[string]bool, prefix string) []string { + var hits []string + for s := range symbols { + if s == prefix || strings.HasPrefix(s, prefix+"_") { + hits = append(hits, s) + } + } + return hits +} + +func backing(bare string, symbols map[string]bool, + explicit map[string][]string) (hits []string, via string) { + if h := matches(symbols, bare); len(h) > 0 { + return h, "prefix" + } + for _, pref := range explicit[bare] { + hits = append(hits, matches(symbols, pref)...) + } + if len(hits) > 0 { + return hits, "explicit:" + strings.Join(explicit[bare], ",") + } + for _, pref := range bindingBacking[bare] { + hits = append(hits, matches(symbols, pref)...) 
+ } + if len(hits) > 0 { + return hits, "version-bridge:" + + strings.Join(bindingBacking[bare], ",") + } + return nil, "" +} + +func familyOf(name string) string { + n := strings.ToLower(name) + switch { + case strings.Contains(n, "rgeo"): + return "rgeo" + case strings.Contains(n, "cbuffer"): + return "cbuffer" + case strings.Contains(n, "npoint"): + return "npoint" + case strings.Contains(n, "pose"): + return "pose" + case strings.Contains(n, "geo"), strings.Contains(n, "geom"), + strings.Contains(n, "geog"), strings.Contains(n, "point"), + strings.Contains(n, "spatial"): + return "geo" + default: + return "temporal" + } +} + +// TestExposedApiSupersetOfPortableBareNames is the Go mirror of +// tools/portable_parity.py --check: the exposed CGO symbol set (root + +// IDL-driven tools/_preview) must be a superset of +// portableAliases.bareNames (29/29, 0 unbacked) AND every in-scope +// user-facing family must be covered. cbuffer/npoint/pose/rgeo are full +// types and are never excluded from the parity headline; a family present +// in the surface but unbacked (regressed) or absent (pending) is a hard +// failure. 
+func TestExposedApiSupersetOfPortableBareNames(t *testing.T) { + repo := repoRoot(t) + symbols := exposedSymbols(t, repo) + pairs, explicit := contract(t, repo) + + if len(pairs) != 29 { + t.Fatalf("contract must carry exactly 29 operator→bareName "+ + "pairs, got %d", len(pairs)) + } + + famsPresent := map[string]bool{} + for s := range symbols { + famsPresent[familyOf(s)] = true + } + + famTotals := map[string]int{} + for _, f := range inScopeFamilies { + famTotals[f] = 0 + } + var unbacked []string + for _, p := range pairs { + hits, via := backing(p.bare, symbols, explicit) + if len(hits) == 0 { + unbacked = append(unbacked, + p.bare+" ("+p.op+", "+p.fam+")") + continue + } + t.Logf("backed: %-24s %-5s via %s (%d symbols)", + p.bare, p.op, via, len(hits)) + for _, h := range hits { + if _, ok := famTotals[familyOf(h)]; ok { + famTotals[familyOf(h)]++ + } + } + } + + if len(unbacked) != 0 { + t.Fatalf("unbacked canonical bare names (exposed symbol set must "+ + "be a superset, 0 unbacked): %s", + strings.Join(unbacked, ", ")) + } + + var regressed, pending []string + for _, f := range inScopeFamilies { + switch { + case famTotals[f] > 0: // covered + case famsPresent[f]: + regressed = append(regressed, f) + default: + pending = append(pending, f) + } + } + if len(regressed) != 0 { + t.Fatalf("in-scope families present in the surface but unbacked "+ + "(cbuffer/npoint/pose/rgeo are never excluded from the "+ + "parity headline): %s; coverage=%v", + strings.Join(regressed, ", "), famTotals) + } + if len(pending) != 0 { + t.Fatalf("in-scope user-facing families absent from the exposed "+ + "surface (never excluded from the parity headline): %s; "+ + "coverage=%v", strings.Join(pending, ", "), famTotals) + } + t.Logf("PASS: 29/29 bare names backed, 0 unbacked, all six in-scope "+ + "families covered: %v", famTotals) +} diff --git a/tools/portable-aliases.json b/tools/portable-aliases.json new file mode 100644 index 0000000..1cabac1 --- /dev/null +++ 
b/tools/portable-aliases.json @@ -0,0 +1,60 @@ +{ + "_comment": "Canonical portable bare-name dialect — the single codegen source of truth (RFC #920). Every binding/engine generates the SAME bare names from this mapping so users learn one reference and assume the rest. Operators are SQL operator symbols; bareName is the portable function name. The mapping is type-agnostic: it applies to EVERY temporal type family.", + "provenance": { + "discussion": "MobilityDB#861", + "rfc": "MobilityDB RFC #920 (doc/rfc/sql-portability/README.md, branch rfc/sql-portability)", + "nativePR": "MobilityDB#1075 (1303 operator-overload aliases, each reusing the operator's own C symbol — identical by construction; CI-gated by tools/portable_aliases/generate.py --check)", + "manualChapter": "MobilityDB#1078" + }, + "families": { + "topology": [{"operator": "&&", "bareName": "overlaps"}, + {"operator": "@>", "bareName": "contains"}, + {"operator": "<@", "bareName": "contained"}, + {"operator": "-|-", "bareName": "adjacent"}], + "timePosition": [{"operator": "<<#", "bareName": "before"}, + {"operator": "#>>", "bareName": "after"}, + {"operator": "&<#", "bareName": "overbefore"}, + {"operator": "#&>", "bareName": "overafter"}], + "spaceX": [{"operator": "<<", "bareName": "left"}, + {"operator": ">>", "bareName": "right"}, + {"operator": "&<", "bareName": "overleft"}, + {"operator": "&>", "bareName": "overright"}], + "spaceY": [{"operator": "<<|", "bareName": "below"}, + {"operator": "|>>", "bareName": "above"}, + {"operator": "&<|", "bareName": "overbelow"}, + {"operator": "|&>", "bareName": "overabove"}], + "spaceZ": [{"operator": "<</", "bareName": "front"}, + {"operator": "/>>", "bareName": "back"}, + {"operator": "&</", "bareName": "overfront"}, + {"operator": "/&>", "bareName": "overback"}], + "temporalComparison": [{"operator": "#=", "bareName": "teq"}, + {"operator": "#<>", "bareName": "tne"}, + {"operator": "#<", "bareName": "tlt"}, + {"operator": "#<=", "bareName": "tle"}, + {"operator": "#>", "bareName": "tgt"}, + {"operator": "#>=", "bareName": "tge"}], + "distance": 
[{"operator": "<->", "bareName": "tdistance"}, + {"operator": "|=|", "bareName": "nearestApproachDistance"}], + "same": [{"operator": "~=", "bareName": "same"}] + }, + "alreadyCanonical": [ + {"family": "ever", "operators": ["?="], "pattern": "ever_*"}, + {"family": "always", "operators": ["%="], "pattern": "always_*"}, + {"functions": ["eIntersects", "atTime", "restriction functions", + "spatial-relationship functions"]} + ], + "_explicitBackingComment": "Bare names whose MEOS C family prefix differs from the bare name itself. Verified against the catalog (not guessed): `nearestApproachDistance` is backed by the `nad_*` family (35 functions). Lets the parity audit resolve 100% honestly instead of false-flagging a real, present family.", + "explicitBacking": { + "nearestApproachDistance": ["nad"] + }, + "scope": { + "inScopeTypeFamilies": ["temporal", "geo", "cbuffer", "npoint", "pose", + "rgeo"], + "note": "cbuffer / npoint / pose / rgeo are FULL user-facing temporal types and ARE in scope — covered like every other type. PR #1075 already aliases all six families (1303 aliases). They must NOT be excluded from any parity headline; an upstream/audit note that 'defers' or 'jointly excludes' them is a known error being corrected — where another engine defers them, that is incomplete work to close (a gap with a plan), never an accepted exclusion." + }, + "notes": [ + "Generate aliases by reusing each operator's own backing C function (equivalence by construction), never by reimplementing; mirror MobilityDB tools/portable_aliases/generate.py + its 100%-coverage audit.", + "User-facing API uses the full name `trgeometry`; internal functions keep the `trgeo_` prefix — do NOT normalize the internal prefix.", + "Goal: 100% parity ecosystem-wide — every operator has its bare name on every engine, no gaps, no headline exclusions." 
+ ] +} diff --git a/tools/portable_parity.py b/tools/portable_parity.py new file mode 100644 index 0000000..2ab39ad --- /dev/null +++ b/tools/portable_parity.py @@ -0,0 +1,257 @@ +#!/usr/bin/env python3 +"""Portable bare-name parity gate for GoMEOS. + +The GoMEOS analogue of MEOS-API's portable_parity.py, MobilityDB's +`tools/portable_aliases/generate.py --check`, and the MEOS.NET / PyMEOS +parity gates. Per the cross-repo handoff (MEOS-API PR #9): a binding is +done when its **exposed symbol set ⊇ portableAliases.bareNames**, verified +with the *same prefix logic* as MEOS-API portable_parity.py, **0 +unbacked**, no per-binding exceptions, across all six in-scope type +families. + +"Exposed symbol set" for GoMEOS = every MEOS C function the CGO layer +references (`C.<name>(`): + + * the hand-written package at the repo root (`*.go`), and + * the IDL-driven generated surface under `tools/_preview/*.go`, which + `tools/codegen.py` emits from `tools/meos-idl.json` — the MEOS-API + parser's JSON output. Measuring the generated surface is the exact + GoMEOS analogue of MEOS.NET gating its generated P/Invoke file and + PyMEOS/JMEOS gating their codegen: every binding derives the same + dialect from the one catalog, so coverage is leveraged from the + MEOS-API JSON end to end — never by parsing `meos.h`. + +GoMEOS wraps MEOS directly, so these are the operators' *own* backing C +functions, reused by construction — never a reimplementation. A bare name +is *backed* iff some referenced symbol `== bareName` or +`startswith(bareName + "_")`, falling back to the contract's verified +`explicitBacking` prefixes (`nearestApproachDistance` <- the `nad_*` +family). 
+ +The portable-aliases contract is read from the catalog's folded-in +`portableAliases` when an --idl is given and carries it; otherwise from +the vendored, byte-identical SoT copy tools/portable-aliases.json (the +vendored MEOS 1.4 `tools/meos-idl.json` predates the MEOS-API #8 fold-in, +so the gate uses the SoT copy by default and stays self-contained). + +Version bridge (now inert at MEOS 1.4, kept as a documented safety net): +pre-1.4 headers named the `<->` temporal-distance operator `distance_t*`; +MEOS 1.4 (vendored here via GoMEOS PR #3) uses the canonical `tdistance_*`, +so `tdistance` resolves directly by prefix and `BINDING_BACKING` no longer +fires. It is retained only so the gate still reports honestly if pointed +at an older header. + + python3 tools/portable_parity.py # write report + python3 tools/portable_parity.py --check # exit non-zero on any gap + +Writes tools/portable-parity.report.json. +""" + +from __future__ import annotations + +import argparse +import json +import re +import sys +from pathlib import Path + +REPO = Path(__file__).resolve().parent.parent +PREVIEW = REPO / "tools" / "_preview" +VENDORED = Path(__file__).resolve().parent / "portable-aliases.json" +REPORT = Path(__file__).resolve().parent / "portable-parity.report.json" + +# Full user-facing temporal type families — cbuffer/npoint/pose/rgeo are +# NOT internals and must never be excluded from the parity headline. +# Precedence keeps the broad geo/temporal buckets from swallowing them. +IN_SCOPE_FAMILIES = ["temporal", "geo", "cbuffer", "npoint", "pose", "rgeo"] + +# Inert at MEOS 1.4 (canonical `tdistance_*` is present); retained as a +# documented safety net for older/partial header scans only. +BINDING_BACKING = { + "tdistance": ["distance_tfloat", "distance_tint", + "distance_tnumber", "distance_tpoint"], +} + +# Every MEOS C symbol referenced through CGO: `C.<name>(`. 
+_CGO_RE = re.compile(r"\bC\.([A-Za-z_]\w*)\s*\(") +_CGO_PSEUDO = {"CString", "CBytes", "GoString", "GoBytes", "free", + "malloc", "calloc"} + + +def _scan(path: Path, syms: set[str]) -> None: + for m in _CGO_RE.findall(path.read_text()): + if m not in _CGO_PSEUDO: + syms.add(m) + + +def exposed_symbols(repo: Path) -> list[str]: + """MEOS C function names the CGO layer references — hand-written root + package plus the IDL-driven generated surface (tools/_preview).""" + syms: set[str] = set() + for go in sorted(repo.glob("*.go")): + if not go.name.endswith("_test.go"): + _scan(go, syms) + if PREVIEW.is_dir(): + for go in sorted(PREVIEW.glob("*.go")): + if not go.name.endswith("_test.go"): + _scan(go, syms) + return sorted(syms) + + +def load_portable_aliases(idl_path: str | None) -> dict: + """Prefer the catalog's folded-in portableAliases; else the vendored SoT.""" + if idl_path: + idl = json.loads(Path(idl_path).read_text()) + pa = idl.get("portableAliases") + if pa and pa.get("families"): + return pa + return json.loads(VENDORED.read_text()) + + +def family_of(name: str) -> str: + n = name.lower() + if "rgeo" in n: + return "rgeo" + if "cbuffer" in n: + return "cbuffer" + if "npoint" in n: + return "npoint" + if "pose" in n: + return "pose" + if any(t in n for t in ("geo", "geom", "geog", "point", "spatial")): + return "geo" + return "temporal" + + +def build_parity(symbols: list[str], pa: dict) -> dict: + fam_of = {p["bareName"]: (fam, p["operator"]) + for fam, lst in pa["families"].items() for p in lst} + explicit = dict(pa.get("explicitBacking", {})) + + def matches(prefix: str) -> list[str]: + return [s for s in symbols + if s == prefix or s.startswith(prefix + "_")] + + fams_present = {family_of(s) for s in symbols} + + by_bare: dict[str, dict] = {} + fam_totals: dict[str, int] = {f: 0 for f in IN_SCOPE_FAMILIES} + for bare, (fam, op) in sorted(fam_of.items()): + hits, via = matches(bare), "prefix" + if not hits: + for pref in explicit.get(bare, []): + hits 
+= matches(pref) + if hits: + via = "explicit:" + ",".join(explicit.get(bare, [])) + if not hits and bare in BINDING_BACKING: + for pref in BINDING_BACKING[bare]: + hits += matches(pref) + if hits: + via = "version-bridge:" + ",".join(BINDING_BACKING[bare]) + if not hits: + via = None + hist: dict[str, int] = {} + for h in hits: + k = family_of(h) + hist[k] = hist.get(k, 0) + 1 + fam_totals[k] = fam_totals.get(k, 0) + 1 + by_bare[bare] = { + "operator": op, "family": fam, "via": via, + "backedBy": len(hits), "sample": sorted(hits)[:3], + "familyCoverage": hist, + "status": "backed" if hits else "needs-explicit-backing", + } + + backed = [b for b, v in by_bare.items() if v["status"] == "backed"] + unbacked = sorted(b for b, v in by_bare.items() + if v["status"] == "needs-explicit-backing") + + # - covered : has backings now + # - regressed : header carries the type's symbols but zero backings + # (a real exclusion — hard fail; never tolerated) + # - pending : type absent from the scanned MEOS surface entirely + fam_status: dict[str, str] = {} + for f in IN_SCOPE_FAMILIES: + if fam_totals.get(f, 0) > 0: + fam_status[f] = "covered" + elif f in fams_present: + fam_status[f] = "regressed" + else: + fam_status[f] = "pending" + regressed = [f for f, s in fam_status.items() if s == "regressed"] + pending = [f for f, s in fam_status.items() if s == "pending"] + + total = len(by_bare) + return { + "exposedSymbols": len(symbols), + "symbolSources": ["repo-root *.go (hand-written)", + "tools/_preview/*.go (IDL-driven codegen " + "from tools/meos-idl.json — MEOS-API output)"], + "total": total, + "backed": len(backed), + "needsExplicitBacking": len(unbacked), + "parityPct": round(len(backed) * 100 / total, 1) if total else 0, + "unbacked": unbacked, + "familyCoverage": fam_totals, + "familyStatus": fam_status, + "regressedFamilies": regressed, + "pendingFamilies": pending, + "byBareName": by_bare, + "provenance": pa.get("provenance", {}), + "scope": pa.get("scope", {}), + } + + 
+def main() -> int: + ap = argparse.ArgumentParser(description=__doc__) + ap.add_argument("--idl", metavar="meos-idl.json", default=None, + help="catalog to read portableAliases from " + "(default: vendored tools/portable-aliases.json)") + ap.add_argument("--check", action="store_true", + help="exit non-zero if any bare name is unbacked or any " + "in-scope family present in the surface is excluded " + "(CI gate)") + args = ap.parse_args() + + symbols = exposed_symbols(REPO) + pa = load_portable_aliases(args.idl) + rep = build_parity(symbols, pa) + REPORT.write_text(json.dumps(rep, indent=2) + "\n") + + src = ("idl.portableAliases" if args.idl + and json.loads(Path(args.idl).read_text()) + .get("portableAliases", {}).get("families") + else "vendored tools/portable-aliases.json") + print(f"[portable-parity] {rep['backed']}/{rep['total']} bare names " + f"backed in the exposed GoMEOS CGO symbol set " + f"({rep['parityPct']}%); {rep['needsExplicitBacking']} unbacked " + f"[contract: {src}]", file=sys.stderr) + print(f"[portable-parity] six-family status {rep['familyStatus']} " + f"-> {REPORT}", file=sys.stderr) + for b in rep["unbacked"]: + v = rep["byBareName"][b] + print(f" needs-explicit-backing: {b!r} ({v['operator']}, " + f"{v['family']})", file=sys.stderr) + + # Hard gate = the handoff doc's literal "Done" for a binding: + # 29/29 bare names backed, 0 unbacked, and every in-scope user-facing + # family covered (cbuffer/npoint/pose/rgeo are never excluded). 
+ fail = bool(rep["unbacked"] or rep["regressedFamilies"] + or rep["pendingFamilies"]) + if args.check: + if rep["regressedFamilies"]: + print(" EXCLUDED in-scope families present in surface but " + f"unbacked: {rep['regressedFamilies']}", file=sys.stderr) + if rep["pendingFamilies"]: + print(" in-scope families absent from the scanned surface: " + f"{rep['pendingFamilies']}", file=sys.stderr) + verdict = ("FAIL" if fail else + f"PASS — {rep['backed']}/{rep['total']} = 100%, " + "0 unbacked, all six in-scope families covered") + print(f"CHECK: {verdict}", file=sys.stderr) + return 1 if fail else 0 + return 0 + + +if __name__ == "__main__": + sys.exit(main())
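
Illustration, not part of the patch: the backing rule that both gates apply can be sketched in a few lines. This is a simplified reading of `build_parity` / `backing` above, without the version bridge or the family bookkeeping. The symbol list is a hypothetical sample chosen for the example, not the real exposed CGO surface, and `overlapsish_demo` is an invented name that shows a non-match.

```python
# Hypothetical sample of CGO-referenced symbols (illustrative only; the real
# gate collects these by scanning the Go sources for C.<name>( references).
SYMBOLS = ["overlaps_tbox_tbox", "tdistance_tfloat_float",
           "nad_tfloat_float", "overlapsish_demo"]

# Mirrors the contract's explicitBacking entry shown in the patch.
EXPLICIT = {"nearestApproachDistance": ["nad"]}


def matches(prefix, symbols):
    """Symbols equal to prefix, or starting with prefix + '_'."""
    return [s for s in symbols if s == prefix or s.startswith(prefix + "_")]


def backing(bare, symbols, explicit):
    """Try the bare name as a prefix first, then the explicit prefixes."""
    hits = matches(bare, symbols)
    if hits:
        return hits, "prefix"
    for pref in explicit.get(bare, []):
        hits += matches(pref, symbols)
    return (hits, "explicit") if hits else ([], None)


if __name__ == "__main__":
    for bare in ("overlaps", "nearestApproachDistance", "same"):
        print(bare, backing(bare, SYMBOLS, EXPLICIT))
```

The `prefix + "_"` check is what keeps an unrelated symbol such as `overlapsish_demo` from counting as backing for `overlaps`; in this sample `same` comes back unbacked, which is the case the real gate reports as `needs-explicit-backing`.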