
[P1] ensure_procedure and ensure_deployment silently lose all SensorML metadata (keywords, identifiers, classifiers, …) — POSTs use wrong content-type and payload shape #5

@Sam-Bolling


Summary

publishers/bootstrap_helpers.py:ensure_procedure (line 246) and
publishers/bootstrap_helpers.py:ensure_deployment (line 348) POST
their bodies with the default Content-Type: application/json from
api_post (line 132). The OS4CSAPI fleet's bootstrap-time payloads
for procedures and deployments wrap SensorML metadata
(keywords, identifiers, classifiers, characteristics,
capabilities, contacts, documentation, history,
securityConstraints, legalConstraints) inside a GeoJSON Feature
under properties. The CSAPI server (per CSAPI Part 1, OGC 23-001)
treats application/json on these POST endpoints as
application/geo+json — i.e. the spatial-discovery view that
deliberately carries only uid / name / description (+ geometry).
SensorML metadata fields are silently dropped server-side.

ensure_system (line 272) does NOT have this bug: it follows the
spec-correct 2-step pattern — POST a small geo+json stub, then PUT
the full SensorML body with Content-Type: application/sml+json
(see lines 287, 305-311). That's why systems data is preserved while
procedure and deployment data is not.
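
To make the difference concrete, the two payload shapes can be sketched as plain Python dicts. All values below (uid, names, keywords) are invented for illustration, and the SensorML `type` string is an assumption rather than something taken from the fleet's bootstrap files.

```python
# Illustrative values only; nothing here is copied from a real publisher.

# Shape emitted today: a GeoJSON Feature with SensorML fields nested under
# "properties". It is sent as application/json, which the server handles as
# application/geo+json.
geojson_feature = {
    "type": "Feature",
    "geometry": None,
    "properties": {
        "uid": "urn:example:procedure:demo-001",
        "name": "Demo procedure",
        "description": "Example only",
        "keywords": ["demo", "example"],  # not part of the geo+json view
    },
}

# Shape that round-trips: SensorML fields at the top level, sent as
# application/sml+json.
sensorml_body = {
    "type": "SimpleProcess",  # assumed type string
    "uniqueId": "urn:example:procedure:demo-001",
    "label": "Demo procedure",
    "description": "Example only",
    "keywords": ["demo", "example"],
}

# Fields the spatial-discovery view carries, per the summary above.
GEO_VIEW_FIELDS = {"uid", "name", "description"}

# Everything else under "properties" is what gets dropped server-side.
dropped = sorted(set(geojson_feature["properties"]) - GEO_VIEW_FIELDS)
print(dropped)  # ['keywords']
```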

Severity — P1 (silent data loss across the OS4CSAPI 10-publisher fleet)

This is not a transient or visible failure. Bootstraps return
HTTP 201; logs say [OK] Created procedure ...; the publisher
moves on. But the persisted row has NULL/empty for every
SensorML metadata column.

This bug has been latent since the fleet started running. It was
exposed (not introduced) by upstream connected-systems-go
commit a467aba ("Adding Strict Parsing"), which now returns
HTTP 400 {"error":"unknown field 'keywords' in properties"}
instead of accepting + dropping. Upstream is correct; we are not
filing against connected-systems-go.

Production database audit (2026-05-05)

Run against connected-systems-go-db-1 on the OS4CSAPI Oracle VM
(currently pointed at pre-strict server build c9747af):

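| Resource | total | keywords | identifiers | classifiers | contacts | characteristics | capabilities | documentation | history | security_* | legal_* |
|---|---|---|---|---|---|---|---|---|---|---|---|
| procedures | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| systems | 38 | 34 | 35 | 35 | 35 | n/a | n/a | 0 | 0 | 0 | 0 |
| deployments | 62 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |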

Procedures and deployments: 100% loss across all 10 SensorML
metadata fields, all 12 procedures and all 62 deployments.

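Systems: keywords, identifiers, classifiers, and contacts survive on
34 to 35 of 38 rows (~89-92%), confirming ensure_system's 2-step
pattern works. The documentation, history, security_*, and legal_*
columns are 0 for systems as well.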
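
The preservation rates can be recomputed directly from the audit table. The snippet below is only a worked check of the keywords column; the counts are copied from the table above.

```python
# (total rows, rows with keywords populated), copied from the audit table.
audit = {
    "procedures": (12, 0),
    "systems": (38, 34),
    "deployments": (62, 0),
}

# Fraction of rows per resource that still carry keywords.
rates = {name: kept / total for name, (total, kept) in audit.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} of rows have keywords")
# procedures: 0.0% of rows have keywords
# systems: 89.5% of rows have keywords
# deployments: 0.0% of rows have keywords
```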

Roundtrip evidence (2026-05-05)

Three live POST/GET pairs against the OS4CSAPI sandbox:

| # | Server | Content-Type | Payload shape | POST | keywords round-trip? |
|---|---|---|---|---|---|
| 1 | pre-strict (c9747af) | application/json | GeoJSON Feature, fields under properties | 201 | NO (silent drop) |
| 2 | pre-strict (c9747af) | application/sml+json | SensorML top-level | 201 | YES |
| 3 | strict (df6da0d, upstream) | application/json | GeoJSON Feature, fields under properties | 400 | n/a (correctly rejected) |

Test 1 is the shape ensure_procedure and ensure_deployment
emit today. Test 2 is the shape that round-trips cleanly.

Recommended fix

Refactor ensure_procedure and ensure_deployment to follow the
same 2-step pattern as ensure_system:

  1. POST a small geo+json stub (uid, name, description, optional
    geometry) with Content-Type: application/json.
  2. PUT the full SensorML body with
    Content-Type: application/sml+json against the just-created
    resource path. (Server side: confirm CSAPI Part 1 PUT semantics
    for procedures/{id} and deployments/{id} accept SensorML —
    they do per spec.)

Alternative single-step (cleaner, if the server supports it):
a single POST with Content-Type: application/sml+json and a
SensorML-top-level body, mirroring api_put's default content-type
on line 164. Test 2 in the roundtrip evidence shows the pre-strict
build accepts this form; verify that the strict upstream does too.
Both forms should work per spec.
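
A minimal sketch of the 2-step pattern follows. It does not reproduce the real api_post / api_put signatures from bootstrap_helpers.py: `ensure_two_step`, `post`, and `put` are hypothetical names, the transport is injected as callables so the flow can be exercised without a server, and the existence check and error handling that a real ensure_* helper needs are omitted.

```python
from typing import Callable, Optional

GEO_JSON = "application/json"      # handled by the server as geo+json, per the summary
SML_JSON = "application/sml+json"

# Hypothetical transport signatures: (path, body, content_type).
PostFn = Callable[[str, dict, str], str]    # returns the new resource id
PutFn = Callable[[str, dict, str], None]


def ensure_two_step(collection: str, stub: dict, sml_body: Optional[dict],
                    post: PostFn, put: PutFn) -> str:
    """POST a small geo+json stub, then PUT the full SensorML body."""
    resource_id = post(f"/{collection}", stub, GEO_JSON)
    if sml_body is not None:
        put(f"/{collection}/{resource_id}", sml_body, SML_JSON)
    return resource_id


# Recording fakes standing in for the real HTTP helpers.
calls = []

def fake_post(path, body, content_type):
    calls.append(("POST", path, content_type))
    return "abc123"  # made-up id

def fake_put(path, body, content_type):
    calls.append(("PUT", path, content_type))

stub = {"type": "Feature", "geometry": None,
        "properties": {"uid": "urn:example:procedure:demo-001",
                       "name": "Demo procedure"}}
sml = {"type": "SimpleProcess",  # assumed type string
       "uniqueId": "urn:example:procedure:demo-001",
       "label": "Demo procedure",
       "keywords": ["demo"]}

rid = ensure_two_step("procedures", stub, sml, fake_post, fake_put)
print(rid, calls)
```

Passing `sml_body=None` keeps the stub-only behaviour, which matches the optional parameter proposed under the call-site changes.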

Proposed call-site changes

  • publishers/bootstrap_helpers.py:246 ensure_procedure
    add optional sml_body: dict | None = None parameter; if
    provided, follow the post-stub-then-PUT-SML pattern of
    ensure_system (lines 305-311).
  • publishers/bootstrap_helpers.py:348 ensure_deployment
    same change.
  • All 10 publisher bootstrap_*.py files — split current
    procedure/deployment payload dicts into geo+json stub +
    SensorML body, pass both to the updated helpers.
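
The payload split in the last bullet could look roughly like this. `split_payload` is a hypothetical helper, and the mapping of uid to `uniqueId` and name to `label`, as well as the `sml_type` value, are assumptions about the SensorML JSON encoding that should be checked against the server before use.

```python
GEO_STUB_FIELDS = ("uid", "name", "description")


def split_payload(feature: dict, sml_type: str) -> tuple[dict, dict]:
    """Split a combined GeoJSON Feature payload into (geo+json stub, SensorML body)."""
    props = dict(feature.get("properties", {}))  # shallow copy; input is not mutated

    # The stub keeps only the fields the geo+json view carries.
    stub_props = {k: props.pop(k) for k in GEO_STUB_FIELDS if k in props}
    stub = {
        "type": "Feature",
        "geometry": feature.get("geometry"),
        "properties": stub_props,
    }

    # Assumed mapping: uid -> uniqueId, name -> label.
    sml_body = {
        "type": sml_type,
        "uniqueId": stub_props.get("uid"),
        "label": stub_props.get("name"),
    }
    if "description" in stub_props:
        sml_body["description"] = stub_props["description"]
    sml_body.update(props)  # keywords, identifiers, classifiers, ...
    return stub, sml_body


feature = {
    "type": "Feature",
    "geometry": None,
    "properties": {
        "uid": "urn:example:deployment:demo-001",  # made-up uid
        "name": "Demo deployment",
        "description": "Example only",
        "keywords": ["demo"],
    },
}
stub, sml_body = split_payload(feature, sml_type="Deployment")
print(stub["properties"])
print(sml_body)
```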

Verification plan

  1. Test against https://129-80-248-53.sslip.io/csapi-go-upstream/
    (strict-decoder, df6da0d) which currently returns 400 on the
    broken shape. With the fix, expect 201 + GET round-trips all 10
    SensorML fields.
  2. Add a roundtrip integration test (tests/test_bootstrap_roundtrip.py)
    that POSTs a procedure with keywords and asserts GET returns
    keywords populated.
  3. After fix lands, wipe production DB and re-bootstrap fleet to
    recover the 12 procedure rows + 62 deployment rows of lost
    metadata.
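
The assertion shape for the roundtrip test in step 2 might look like the sketch below. `FakePreStrictServer` is a hypothetical in-memory stand-in that models only the behaviour seen in Tests 1 and 2 of the roundtrip evidence; the real test would go through the bootstrap helpers against the sandbox instead.

```python
import itertools

GEO_VIEW_FIELDS = ("uid", "name", "description")


class FakePreStrictServer:
    """In-memory stand-in: keeps everything for sml+json, trims geo+json."""

    def __init__(self):
        self._rows = {}
        self._ids = itertools.count(1)

    def post(self, collection, body, content_type):
        if content_type == "application/sml+json":
            row = dict(body)  # SensorML body is stored whole
        else:
            # application/json is handled as geo+json: only view fields survive.
            props = body.get("properties", {})
            row = {k: props[k] for k in GEO_VIEW_FIELDS if k in props}
        rid = str(next(self._ids))
        self._rows[(collection, rid)] = row
        return 201, rid

    def get(self, collection, rid):
        return self._rows[(collection, rid)]


def test_sml_post_round_trips_keywords():
    server = FakePreStrictServer()
    body = {"type": "SimpleProcess",  # assumed type string
            "uniqueId": "urn:example:procedure:demo-001",
            "label": "Demo procedure",
            "keywords": ["demo", "example"]}
    status, rid = server.post("procedures", body, "application/sml+json")
    assert status == 201
    assert server.get("procedures", rid)["keywords"] == ["demo", "example"]


def test_geojson_post_drops_keywords():
    server = FakePreStrictServer()
    body = {"type": "Feature", "geometry": None,
            "properties": {"uid": "urn:example:procedure:demo-001",
                           "name": "Demo procedure",
                           "keywords": ["demo", "example"]}}
    status, rid = server.post("procedures", body, "application/json")
    assert status == 201  # still reports success
    assert "keywords" not in server.get("procedures", rid)
```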

Out of scope (per directive 2026-05-05)

  • No changes to Botts-Innovative-Research/OSHConnect-Python upstream.
  • No filing on SomethingCreativeStudios/connected-systems-go
    (upstream is correct).
  • No schema/wrapper changes on the Go server.
