Skip to content

fix: Disable semantic_check on populate antijoin (parallels #1383)#1453

Closed
dimitri-yatsenko wants to merge 1 commit into
fix/store-secrets-arbitrary-attrfrom
fix/populate-antijoin-semantic-check
Closed

fix: Disable semantic_check on populate antijoin (parallels #1383)#1453
dimitri-yatsenko wants to merge 1 commit into
fix/store-secrets-arbitrary-attrfrom
fix/populate-antijoin-semantic-check

Conversation

@dimitri-yatsenko
Copy link
Copy Markdown
Member

Stacked on #1452. Base is set to that PR's branch for now; rebase to master
once #1452 lands. The two PRs share no overlapping diff hunks — this PR only
touches autopopulate.py and tests/integration/test_autopopulate.py.

Summary

#1383 disabled semantic_check for the jobs table antijoin in Job.refresh().
The same defect lives in AutoPopulate._populate_direct's antijoin (and the
progress() fallback). This PR applies the analogous fix.

The - operator on a QueryExpression calls .restrict(Not(...)) with default
semantic_check=True. For a plain set-difference ("which rows of the LHS aren't
in the RHS?"), the semantic check is wrong — we don't care whether the same-named
PK attribute carries the same source-table lineage tag on each side; we just want
the set difference.

Where it bites

dj.Imported / dj.Computed tables whose PK is fully inherited from a single
foreign key, with no own-table PK attributes
:

@schema
class Spec(dj.Manual):
    definition = \"\"\"
    spec_id : int32
    ---
    label : varchar(30)
    \"\"\"

@schema
class Item(dj.Imported):
    definition = \"\"\"
    -> Spec
    ---
    payload : varchar(60)
    \"\"\"
    def make(self, key):
        self.insert1(dict(key, payload=...))

Spec.insert([(1, \"alpha\"), (2, \"beta\")])
Item.populate()   # ← DataJointError on master

On master:

DataJointError: Cannot join on attribute 'spec_id': different lineages
(test_schema.spec.spec_id vs None). Use .proj() to rename one of the attributes.

Item.proj() returns spec_id with lineage=None while the key_source's
spec_id carries Spec's lineage. The semantic check rejects the antijoin.

Why this didn't surface from #1405's test suite

The existing regression tests (test_populate_antijoin_with_secondary_attrs,
test_populate_antijoin_overlapping_attrs) all use tables that either have
multiple FK parents (so PK attributes come from multiple sources and
proj() lineage gets anchored) or extend the FK-inherited PK with own-table
attributes. Elements / SciOps pipelines almost always follow that shape.

The single-FK-inherited-PK + no-own-PK-attrs pattern is uncommon but
legitimate — "one row downstream per parent row, no intermediate ID".

Changes

src/datajoint/autopopulate.py

  • Import Not from .condition at module top.
  • _populate_direct: replace (LHS - self.proj()) with
    LHS.restrict(Not(self.proj()), semantic_check=False).
  • progress(): same swap on the no-common-attrs fallback branch.

tests/integration/test_autopopulate.py

  • New test_populate_antijoin_fk_inherited_pk regression test.
    Constructs the minimal shape (Spec(Manual) → Item(Imported)) that
    triggers the bug. Exercises partial populate, progress() counts,
    full populate, and confirms no re-processing.

Test plan

  • New test_populate_antijoin_fk_inherited_pk exercises the bug and the fix
  • Existing antijoin tests (test_populate_antijoin_with_secondary_attrs,
    test_populate_antijoin_overlapping_attrs) unchanged and should still pass
  • Pre-commit hooks (ruff, mypy, codespell) all pass
  • CI integration tests against MySQL + PostgreSQL (this PR can't run locally
    — testcontainers needs Docker which isn't running on the contributor box)

Same fix #1383 applied to the Job table's antijoin in refresh(),
now applied to AutoPopulate._populate_direct's antijoin and the
progress() fallback path. The two-arg subtract `key_source - self`
triggers QueryExpression.__sub__ which calls .restrict(Not(...))
with semantic_check=True by default.

The semantic-check requirement is wrong here: this antijoin is a
plain set-difference, not a join — we ask "which key_source rows
aren't yet in self." Whether the same-named PK attribute carries
the same source-table lineage tag on both sides is irrelevant.

Where it bites: dj.Imported / dj.Computed tables whose primary key
is fully inherited from a single FK, with no own-table PK attributes.
On those, self.proj() returns the PK attribute with lineage=None
(or pointing to self rather than the FK parent), while key_source's
matching attribute carries the parent's lineage tag. The
semantic-check fails with:

    Cannot join on attribute 'X': different lineages
    (schema.parent.X vs None). Use .proj() to rename one of the
    attributes.

This pattern is legitimate ("one row downstream per parent row,
no intermediate ID") but rare in typical Elements / SciOps pipelines,
which extend the inherited PK with own-table attributes (trial_id,
experiment_id, etc.) that anchor proj()'s lineage. That's why the
existing #1405 test suite didn't surface it.

Changes:
- src/datajoint/autopopulate.py
  - Import Not from .condition at module top.
  - _populate_direct: replace `(LHS - self.proj())` with
    `LHS.restrict(Not(self.proj()), semantic_check=False)`.
  - progress(): same swap on the no-common-attrs fallback branch.
- tests/integration/test_autopopulate.py
  - New test_populate_antijoin_fk_inherited_pk regression test:
    Spec(Manual) -> Item(Imported with only -> Spec) — the minimal
    shape that triggers the bug. Without the fix Item.populate()
    raises DataJointError; with the fix it populates correctly,
    progress() reports correct counts, and partial-then-full
    populate works.

Stacked on top of #1452 (the secrets-loading + dead-code fix); rebase
to master after that lands.
@dimitri-yatsenko dimitri-yatsenko marked this pull request as draft May 19, 2026 23:18
@dimitri-yatsenko
Copy link
Copy Markdown
Member Author

Closing — the diagnosis was wrong.

I assumed Image.proj() was returning the FK-inherited PK attribute with
lineage=None because of a proj()-time computation bug. It isn't.
After a clean schema redeclare on the demo's actual workspace:

=== schema.lineage (all attribute -> origin mappings) ===
  blob_detection.image_spec.image_id     →  blob_detection.image_spec.image_id
  blob_detection._image.image_id         →  blob_detection.image_spec.image_id
  blob_detection.#blob_param_set.blob_paramset → blob_detection.#blob_param_set.blob_paramset
  blob_detection.__detection.image_id    →  blob_detection.image_spec.image_id
  ...

=== Image.proj().heading["image_id"] ===
Attribute(name='image_id', ..., lineage='blob_detection.image_spec.image_id')

=== ImageSpec.proj().heading["image_id"] ===
Attribute(name='image_id', ..., lineage='blob_detection.image_spec.image_id')

Both sides of the populate antijoin carry the same lineage tag. The
semantic check is doing the right thing here; disabling it would silence
legitimate join-correctness errors (e.g., the class of bugs #1405 was
built to catch).

The original lineage=None error we hit yesterday in the demo must have
come from a stale ~lineage row — probably from a schema that was
declared before all the lineage-tracking fixes had landed and never
rebuilt. The correct user-facing remedy is schema.rebuild_lineage()
(already exposed), or just a clean redeclare; neither requires an SDK
change.

Apologies for the noise. #1452 is unaffected and still stands on its own.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant