Skip to content

[ENG-9827] - save collection-submission custom metadata to CedarMetadataRecord on create/update#11735

Open
Vlad0n20 wants to merge 5 commits into
CenterForOpenScience:feature/es2-consolidationfrom
Vlad0n20:feature/ENG-9827
Open

[ENG-9827] - save collection-submission custom metadata to CedarMetadataRecord on create/update#11735
Vlad0n20 wants to merge 5 commits into
CenterForOpenScience:feature/es2-consolidationfrom
Vlad0n20:feature/ENG-9827

Conversation

@Vlad0n20
Copy link
Copy Markdown
Contributor

@Vlad0n20 Vlad0n20 commented May 12, 2026

Ticket

Purpose

Changes

Side Effects

QE Notes

CE Notes

Documentation

Comment thread api/collections/serializers.py
Comment thread osf/models/collection_submission.py Outdated
if not (self.collection.provider_id and self.collection.provider.required_metadata_template):
return
template = self.collection.provider.required_metadata_template
metadata = {f: getattr(self, f) for f in CEDAR_METADATA_FIELDS if getattr(self, f, '')}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main thing this has to do is make a valid cedar record (is why calling validate_required_metadata seems relevant) -- to index for search, will at least need a "@context" to be parsable as json-ld (either build one on the fly from the template jsonschema or hard-code one based on the specific properties used for the existing collection fields)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I have a question: why don't we use the Cedar Metadata Editor on the FE and change the collection submission process so that the FE would create a Cedar Metadata Record by posting to the respective endpoints, instead of having the BE do the "sync"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the Cedar Metadata Editor on the FE would give cleaner separation of concerns, but the custom fields (status, volume, issue, etc.) are already part of the existing collection submission payload — splitting that into a separate cedar record creation step adds coordination complexity and risk of partial failure on the client side. The BE sync keeps the existing submission flow unchanged for all clients while ensuring the cedar record stays consistent with the submission fields.

@Vlad0n20 Vlad0n20 force-pushed the feature/ENG-9827 branch from aa625a1 to 598379a Compare May 28, 2026 18:31
@adlius adlius requested a review from aaxelb May 28, 2026 19:06
Copy link
Copy Markdown
Collaborator

@aaxelb aaxelb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i get the narrowed scope (no need to build cedar records from collection-specific fields that are getting removed) but am unclear what this PR is doing now -- besides adding a few tests, which is fine, and removing validation, which is not

Comment thread api/collections/serializers.py Outdated
Comment on lines +473 to +483
if waffle.switch_is_active(features.COLLECTION_SUBMISSION_WITH_CEDAR) and collection.provider_id:
try:
collection.provider.validate_required_metadata(guid.referent)
except ValidationError as e:
raise InvalidModelValueError(e.message)
try:
obj = collection.collect_object(guid.referent, creator, **validated_data)
except ValidationError as e:
raise InvalidModelValueError(e.message)
if waffle.switch_is_active(features.COLLECTION_SUBMISSION_WITH_CEDAR):
if collection.provider_id:
try:
collection.provider.validate_required_metadata(guid.referent)
except ValidationError as e:
raise InvalidModelValueError(e.message)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason validate_required_metadata makes more sense after collect_object than before?
(if not, could just leave this file untouched)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(on closer inspection, if moving this validate_required_metadata call anywhere, would make sense to move it into collect_object -- it starts by validating the collection-specific metadata that's getting replaced by a collection-required cedar record)

Comment thread osf/models/provider.py
)

try:
jsonschema_validate(record.metadata, self.required_metadata_template.template)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, you removed the validation -- this change would make sense if is_published=True implied that the jsonschema check had already been run (could be done with django model validation -- move this into a method on CedarMetadataRecord and call it from clean when is_published is True -- actually that'd be a solid improvement, since every cedar record should validate when published)

even if the frontend is responsible for building the record, backend should still validate (and it's a feature of cedar that it's so easy to validate a record against a given template, because jsonschema)

Comment thread osf/models/provider.py Outdated
related_name='required_by_providers',
)

def validate_required_metadata(self, obj):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was there a reason to move this method? there's at least a rough convention that regular methods go below dunder-methods (like __repr__ and __str__) and properties in a class definition -- may as well make the changes in-place

@adlius adlius requested a review from aaxelb May 29, 2026 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants