[ENG-10028] SHARE is not consistently indexing OSF content#11671

Merged
adlius merged 2 commits into CenterForOpenScience:feature/pbs-26-6 from mkovalua:fix/ENG-10028-pbs-26-6
Apr 13, 2026

Conversation

@mkovalua
Contributor

Ticket

Purpose

Content on the OSF is not consistently being indexed in SHARE. Newly created content does not seem to be indexed automatically in a consistent way: some content is indexed and other content is not, and there does not seem to be any way to tell which content has been indexed. In addition, some content cannot be re-indexed from the admin. This is causing significant issues for OSF users because their content is not discoverable on the OSF.

Changes

Take the code from PR #11631 to avoid having to resolve so many merge conflicts against the new pbs-26-6 target branch.

Side Effects

QE Notes

CE Notes

Documentation

Collaborator

@adlius adlius left a comment

One comment and some questions.

Comment thread api/share/utils.py Outdated
recatalog(queryset, start_id, chunk_count, chunk_size)


def get_not_indexed_guids_for_resource_with_no_indexed_guid(resource_type: str, first_guid: bool = True):
Collaborator

The naming of the argument first_guid could be a bit more descriptive. How about something like only_oldest_guid?
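
For illustration, the rename might read roughly as follows. This is only a sketch under assumptions: the in-memory `GuidRecord` type, the `records` argument, the shortened function name, and the filtering logic are hypothetical stand-ins (the real function takes a `resource_type` and is not shown in full here), and it assumes `first_guid` means "return only the oldest guid per resource".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class GuidRecord:
    """Hypothetical stand-in for a guid row; not the OSF model."""
    guid: str
    resource_id: int
    created: datetime
    has_been_indexed: Optional[bool] = None


def get_not_indexed_guids(records, only_oldest_guid: bool = True):
    # Group guids by the resource they point at.
    by_resource = {}
    for record in records:
        by_resource.setdefault(record.resource_id, []).append(record)

    result = []
    for guids in by_resource.values():
        # Skip resources that already have at least one indexed guid.
        if any(g.has_been_indexed for g in guids):
            continue
        ordered = sorted(guids, key=lambda g: g.created)
        # With only_oldest_guid=True, keep just the earliest-created guid.
        selected = ordered[:1] if only_oldest_guid else ordered
        result.extend(g.guid for g in selected)
    return result
```

With the keyword spelled `only_oldest_guid`, a call site such as `get_not_indexed_guids(records, only_oldest_guid=False)` says what it selects without the reader having to open the function.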

Comment thread osf/models/mixins.py
def mark_indexing_failed(self):
    self.has_been_indexed = False
    from addons.osfstorage.models import OsfStorageFile
    if isinstance(self, OsfStorageFile):
Collaborator

Why do we need special casing for OsfStorageFile?

Contributor Author

Saving a file triggers another SHARE update task, which causes a recursion. The special case is a way to avoid it.

#11631 (comment)
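
The recursion can be pictured with a minimal, framework-free sketch. Everything here is hypothetical: the `Indexable` class, the `trigger_share_update` flag, and the call counters are illustrative only and are not the OSF models or the approach this PR necessarily takes.

```python
class Indexable:
    """Hypothetical model whose save() schedules a SHARE update."""

    def __init__(self):
        self.has_been_indexed = None
        self.save_calls = 0
        self.update_calls = 0

    def save(self, trigger_share_update: bool = True):
        self.save_calls += 1
        if trigger_share_update:
            self.update_share()

    def update_share(self):
        self.update_calls += 1
        # Pretend the SHARE request failed.
        self.mark_indexing_failed()

    def mark_indexing_failed(self):
        self.has_been_indexed = False
        # Persist the flag without re-triggering update_share().
        # A plain save() here would call update_share() again and never terminate.
        self.save(trigger_share_update=False)
```

In Django, one common way to get the same effect is `QuerySet.update()`, which writes straight to the database without calling `save()` or sending `pre_save`/`post_save` signals. Whether the special case here works that way is not visible in this excerpt.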

Comment thread osf/models/mixins.py
Comment on lines +2570 to +2571
has_been_indexed = models.BooleanField(default=None, null=True, blank=True, db_index=True)
date_last_indexed = models.DateTimeField(null=True, blank=True)
Collaborator

Does this mean these two fields would only be populated for objects that are indexed after this PR is released? What happens to the objects that were indexed before this PR is merged/released?

Contributor Author

As I understand it, old records will be reindexed too, using a management command (with cloud help), so that these fields are not left as None for those objects.
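
Since `has_been_indexed` defaults to `None`, it appears to act as a tri-state flag: `None` for objects with no recorded attempt (including anything indexed before the fields existed), `False` after a failed attempt, and `True` after success. A backfill over the `None` rows could be sketched as below. This is a sketch under assumptions, not the management command itself: `backfill_index_state`, the dict-based `records`, and the `reindex` callable are hypothetical stand-ins for the queryset and the SHARE call.

```python
from datetime import datetime, timezone


def backfill_index_state(records, reindex, chunk_size=2):
    """Reindex records whose has_been_indexed is still None, chunk by chunk.

    `records` is a list of dicts and `reindex` a callable returning True on
    success; both are hypothetical stand-ins for the queryset and SHARE call.
    """
    pending = [r for r in records if r["has_been_indexed"] is None]
    for start in range(0, len(pending), chunk_size):
        # Work through one chunk at a time, as a real command might per batch.
        for record in pending[start:start + chunk_size]:
            ok = bool(reindex(record))
            record["has_been_indexed"] = ok
            if ok:
                record["date_last_indexed"] = datetime.now(timezone.utc)
    return len(pending)
```

Records that fail during the backfill end up with `False` rather than `None`, so they stay distinguishable from rows that were never attempted.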

@mkovalua mkovalua requested a review from adlius April 8, 2026 18:56
@adlius adlius merged commit dcb6bd4 into CenterForOpenScience:feature/pbs-26-6 Apr 13, 2026
8 checks passed