fix: Load any stores.<name>.<attr> secret file (not just access_key/secret_key)#1452
Open
dimitri-yatsenko wants to merge 1 commit into
Open
fix: Load any stores.<name>.<attr> secret file (not just access_key/secret_key)#1452dimitri-yatsenko wants to merge 1 commit into
dimitri-yatsenko wants to merge 1 commit into
Conversation
Two changes: 1. settings.py — Config._load_secrets() Was hardcoded to only auto-load stores.<name>.access_key and stores.<name>.secret_key from the secrets directory. That whitelist made sense when only the built-in s3 / gcs / azure stores existed, but it's wrong for plugin-registered adapters that define their own secret fields. Concrete example: a plugin that wraps Databricks Unity Catalog Volumes authenticates via a Bearer token. Its store spec wants a `token` field. Today the file .secrets/stores.uc.token is silently ignored, forcing users to either rename to a misleading "access_key" or to set the value programmatically at notebook startup (defeating the auto-load story). This change drops the whitelist. Any stores.<name>.<attr> file under the secrets directory is loaded into dj.config["stores"][<name>][<attr>]. The existing "don't override pre-configured values" precedence is preserved — config file and env vars still win. New test: TestStoreSecrets.test_load_store_arbitrary_attr confirms that `token` and `api_key` fields load via the same path as access_key/secret_key. 2. storage_adapter.py — _discover_adapters() Removed the try/except wrapping importlib.metadata.entry_points(). The fallback path was for Python < 3.10 (where entry_points() returned a SelectableGroups dict instead of accepting the group= kwarg), but DataJoint's minimum supported Python is 3.10 (per requires-python in pyproject.toml). The dead branch was also producing a mypy type error (SelectableGroups.get signature).
dimitri-yatsenko
added a commit
that referenced
this pull request
May 19, 2026
Same fix #1383 applied to the Job table's antijoin in refresh(), now applied to AutoPopulate._populate_direct's antijoin and the progress() fallback path. The two-arg subtract `key_source - self` triggers QueryExpression.__sub__ which calls .restrict(Not(...)) with semantic_check=True by default. The semantic-check requirement is wrong here: this antijoin is a plain set-difference, not a join — we ask "which key_source rows aren't yet in self." Whether the same-named PK attribute carries the same source-table lineage tag on both sides is irrelevant. Where it bites: dj.Imported / dj.Computed tables whose primary key is fully inherited from a single FK, with no own-table PK attributes. On those, self.proj() returns the PK attribute with lineage=None (or pointing to self rather than the FK parent), while key_source's matching attribute carries the parent's lineage tag. The semantic-check fails with: Cannot join on attribute 'X': different lineages (schema.parent.X vs None). Use .proj() to rename one of the attributes. This pattern is legitimate ("one row downstream per parent row, no intermediate ID") but rare in typical Elements / SciOps pipelines, which extend the inherited PK with own-table attributes (trial_id, experiment_id, etc.) that anchor proj()'s lineage. That's why the existing #1405 test suite didn't surface it. Changes: - src/datajoint/autopopulate.py - Import Not from .condition at module top. - _populate_direct: replace `(LHS - self.proj())` with `LHS.restrict(Not(self.proj()), semantic_check=False)`. - progress(): same swap on the no-common-attrs fallback branch. - tests/integration/test_autopopulate.py - New test_populate_antijoin_fk_inherited_pk regression test: Spec(Manual) -> Item(Imported with only -> Spec) — the minimal shape that triggers the bug. Without the fix Item.populate() raises DataJointError; with the fix it populates correctly, progress() reports correct counts, and partial-then-full populate works. Stacked on top of #1452 (the secrets-loading + dead-code fix); rebase to master after that lands.
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Config._load_secrets()was hardcoded to only auto-loadstores.<name>.access_keyand
stores.<name>.secret_keyfrom the.secrets/directory. That whitelist madesense when only the built-in
s3/gcs/azurestores existed, but it's wrongfor plugin-registered adapters (the storage-adapter plugin system added in
#1432) that define their own secret fields.
Concrete motivation
A plugin that wraps Databricks Unity Catalog Volumes authenticates via a Bearer
token. Its store spec wants a
tokenfield — notaccess_key/secret_key.Today
.secrets/stores.uc.tokenis silently ignored, forcing users to either:access_key, orstory DataJoint advertises in the configure-storage docs)
This change drops the whitelist. Any
stores.<name>.<attr>file under the secretsdirectory loads into
dj.config["stores"][<name>][<attr>]. The existing precedenceis preserved — config file and env vars still win.
What changed
src/datajoint/settings.py— primary fixComment around the loop updated to reflect the broader applicability.
src/datajoint/storage_adapter.py— incidental cleanupRemoved a
try/except TypeErroraroundentry_points(group=...). The fallbackwas for Python < 3.10, where
entry_points()returned aSelectableGroupsdictinstead of accepting the
group=kwarg. DataJoint's minimum supported Python is3.10 (installation docs and
pyproject.toml'srequires-python), so the branch is dead code — and wasproducing a stale mypy error on
SelectableGroups.getsignature.Tests
New test
TestStoreSecrets.test_load_store_arbitrary_attrexercises bothtokenand
api_keyfields loading correctly. Existing tests(
test_load_store_credentials_from_secrets,test_secrets_do_not_override_existing)still pass — the precedence semantics are unchanged.
Test plan
TestStoreSecretstests still passtest_load_store_arbitrary_attrpassestests/unit/test_storage_adapter.pystill pass