ZCU-DATA/fix: prevent orphaned items — abort leaked request Context (backport dc57ffded8/#852 to zcu-data)#1352
Merged
milanmajchrak merged 1 commit intoJul 2, 2026
Conversation
…ackport dc57ffd / #852) Backport the behavioral fix from dc57ffd ("Transaction bug - close context in finally block", PR #852) which is on dtq-dev/customer/lindat but missing from customer/zcu-data. On the request error path a slow submission upload could leave its Hibernate session bound to the Tomcat thread with a dirty, stale Item: the upload throws before reaching context.commit(), and DSpaceRequestContextFilter assigns its `context` local only AFTER chain.doFilter, so on an exception the local stays null and its finally skips the abort -> the session leaks. A later request on that thread then flushes the stale Item as a full-row UPDATE (Item has no @DynamicUpdate/@Version), reverting owning_collection->NULL and in_archive->false while collection2item + handle survive -> the item becomes an unsearchable orphan (e.g. handle 20.500.14592/107). Wrap chain.doFilter in the outer StatelessAuthenticationFilter with a try/finally that reads the Context from the request attribute and abort()s it if still valid, cleaning up the leaked context the inner filter missed. No-op on the normal success path. Only the StatelessAuthenticationFilter hunk is ported; the diagnostic/CI files in dc57ffd are omitted (they conflict and are not needed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
milanmajchrak
added a commit
that referenced
this pull request
Jul 3, 2026
…ackport #1352 to mendelu) (#1357) * fix: abort leaked request Context in StatelessAuthenticationFilter (backport dc57ffd / #852) Backport the behavioral fix from dc57ffd ("Transaction bug - close context in finally block", PR #852) which is on dtq-dev/customer/lindat but missing from customer/zcu-data. On the request error path a slow submission upload could leave its Hibernate session bound to the Tomcat thread with a dirty, stale Item: the upload throws before reaching context.commit(), and DSpaceRequestContextFilter assigns its `context` local only AFTER chain.doFilter, so on an exception the local stays null and its finally skips the abort -> the session leaks. A later request on that thread then flushes the stale Item as a full-row UPDATE (Item has no @DynamicUpdate/@Version), reverting owning_collection->NULL and in_archive->false while collection2item + handle survive -> the item becomes an unsearchable orphan (e.g. handle 20.500.14592/107). Wrap chain.doFilter in the outer StatelessAuthenticationFilter with a try/finally that reads the Context from the request attribute and abort()s it if still valid, cleaning up the leaked context the inner filter missed. No-op on the normal success path. Only the StatelessAuthenticationFilter hunk is ported; the diagnostic/CI files in dc57ffd are omitted (they conflict and are not needed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit ea6116e) * docs: clarify Context-abort comment and link #1353 (Copilot review) --------- Co-authored-by: milanmajchrak <milan.majchrak@dataquest.sk> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
milanmajchrak
added a commit
that referenced
this pull request
Jul 3, 2026
…ort #1352 to sav) (#1356) * fix: abort leaked request Context in StatelessAuthenticationFilter (backport dc57ffd / #852) Backport the behavioral fix from dc57ffd ("Transaction bug - close context in finally block", PR #852) which is on dtq-dev/customer/lindat but missing from customer/zcu-data. On the request error path a slow submission upload could leave its Hibernate session bound to the Tomcat thread with a dirty, stale Item: the upload throws before reaching context.commit(), and DSpaceRequestContextFilter assigns its `context` local only AFTER chain.doFilter, so on an exception the local stays null and its finally skips the abort -> the session leaks. A later request on that thread then flushes the stale Item as a full-row UPDATE (Item has no @DynamicUpdate/@Version), reverting owning_collection->NULL and in_archive->false while collection2item + handle survive -> the item becomes an unsearchable orphan (e.g. handle 20.500.14592/107). Wrap chain.doFilter in the outer StatelessAuthenticationFilter with a try/finally that reads the Context from the request attribute and abort()s it if still valid, cleaning up the leaked context the inner filter missed. No-op on the normal success path. Only the StatelessAuthenticationFilter hunk is ported; the diagnostic/CI files in dc57ffd are omitted (they conflict and are not needed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit ea6116e) * docs: clarify Context-abort comment and link #1353 (Copilot review) --------- Co-authored-by: milanmajchrak <milan.majchrak@dataquest.sk> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
milanmajchrak
added a commit
that referenced
this pull request
Jul 3, 2026
…ackport #1352 to zcu-pub) (#1355) * fix: abort leaked request Context in StatelessAuthenticationFilter (backport dc57ffd / #852) Backport the behavioral fix from dc57ffd ("Transaction bug - close context in finally block", PR #852) which is on dtq-dev/customer/lindat but missing from customer/zcu-data. On the request error path a slow submission upload could leave its Hibernate session bound to the Tomcat thread with a dirty, stale Item: the upload throws before reaching context.commit(), and DSpaceRequestContextFilter assigns its `context` local only AFTER chain.doFilter, so on an exception the local stays null and its finally skips the abort -> the session leaks. A later request on that thread then flushes the stale Item as a full-row UPDATE (Item has no @DynamicUpdate/@Version), reverting owning_collection->NULL and in_archive->false while collection2item + handle survive -> the item becomes an unsearchable orphan (e.g. handle 20.500.14592/107). Wrap chain.doFilter in the outer StatelessAuthenticationFilter with a try/finally that reads the Context from the request attribute and abort()s it if still valid, cleaning up the leaked context the inner filter missed. No-op on the normal success path. Only the StatelessAuthenticationFilter hunk is ported; the diagnostic/CI files in dc57ffd are omitted (they conflict and are not needed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit ea6116e) * docs: clarify Context-abort comment and link #1353 (Copilot review) --------- Co-authored-by: milanmajchrak <milan.majchrak@dataquest.sk> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
milanmajchrak
added a commit
that referenced
this pull request
Jul 3, 2026
…ort #1352 to tul) (#1354) * fix: abort leaked request Context in StatelessAuthenticationFilter (backport dc57ffd / #852) Backport the behavioral fix from dc57ffd ("Transaction bug - close context in finally block", PR #852) which is on dtq-dev/customer/lindat but missing from customer/zcu-data. On the request error path a slow submission upload could leave its Hibernate session bound to the Tomcat thread with a dirty, stale Item: the upload throws before reaching context.commit(), and DSpaceRequestContextFilter assigns its `context` local only AFTER chain.doFilter, so on an exception the local stays null and its finally skips the abort -> the session leaks. A later request on that thread then flushes the stale Item as a full-row UPDATE (Item has no @DynamicUpdate/@Version), reverting owning_collection->NULL and in_archive->false while collection2item + handle survive -> the item becomes an unsearchable orphan (e.g. handle 20.500.14592/107). Wrap chain.doFilter in the outer StatelessAuthenticationFilter with a try/finally that reads the Context from the request attribute and abort()s it if still valid, cleaning up the leaked context the inner filter missed. No-op on the normal success path. Only the StatelessAuthenticationFilter hunk is ported; the diagnostic/CI files in dc57ffd are omitted (they conflict and are not needed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit ea6116e) * docs: clarify Context-abort comment and link #1353 (Copilot review) --------- Co-authored-by: milanmajchrak <milan.majchrak@dataquest.sk> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem description
On
customer/zcu-dataa submitted item can silently end up orphaned:owning_collection = NULLandin_archive = falsewhile itscollection2itemmapping and handle survive. The item then vanishes from search and collection listings, has no breadcrumb, and "Edit Item" throws HTTP 500 (NPE inCanManageMappingsFeature, see #1348). Real incident: item62133a8a-fbd1-474f-9b55-fa92d318aa03(handle20.500.14592/107).Root cause — a Hibernate lost update via a leaked, thread-bound session on the request error path:
WorkspaceItemRestRepository.upload→wis.findloads the Item, then a longuploadFileToInprogressSubmissionstream, thencontext.commit()at the end) holds the Item entity in its Hibernate L1 cache with the pre-deposit values (owning_collection = null,in_archive = false).owning_collection,in_archive = true) and deletes the workspace item the upload is working on.ClientAbortException: Broken pipe) before reachingcontext.commit().DSpaceRequestContextFilterassigns itscontextlocal afterchain.doFilter(...)inside thetry, so on an exception that assignment is skipped,contextstaysnull, and itsfinallyskips the abort — leaving the Hibernate session bound to the Tomcat worker thread (ThreadLocal session), still holding the dirty, stale Item.UPDATE(Item has no@DynamicUpdate/@Version), revertingowning_collection → NULLandin_archive → false. The separatecollection2itemjoin-table row and the handle survive — exactly the observed corruption.Analysis
The behavioral fix already exists on
dtq-devandcustomer/lindatas commitdc57ffded8(PR #852, "Transaction bug - close context in finally block") but is absent fromcustomer/zcu-data(which forked before it, 2024‑06‑26). This matches the reproduction exactly: the race reproduces on a ZCU‑DATA instance (fix absent) and does not reproduce on a LINDAT instance (fix present).Verified by elimination on the exact branch code:
WorkspaceItemRestRepository.upload(incl.context.commit()) andDSpaceRequestContextFilter(the null-local leak) are byte-identical betweencustomer/zcu-dataandcustomer/lindat— so they are not the differentiator.StatelessAuthenticationFilterchange fromdc57ffded8.git merge-base --is-ancestor dc57ffded8: PRESENT incustomer/lindat+dtq-dev, ABSENT incustomer/zcu-data.This PR backports only the load-bearing hunk of
dc57ffded8: the outerStatelessAuthenticationFilternow wrapschain.doFilter(...)intry/finallyand, in thefinally, reads the Context from the request attribute (not a possibly-null local) andabort()s it if still valid. Because this outer filter runs after the innerDSpaceRequestContextFilter, it cleans up the leaked context that the inner filter missed, so no later request can flush the stale Item. On the normal success path it is a harmless no-op (the request already committed, so the context is no longer valid).The other files touched by
dc57ffded8(Context.java,Utils.java,HibernateDBConnection.java,log4j2.xml,build.yml) are diagnostics/CI and are intentionally not included (they conflict with zcu-data's diverged files and are not needed for the fix).Problems
dc57ffded8does not cherry-pick cleanly ontocustomer/zcu-data(conflicts in the diagnosticContext.java/Utils.java); this PR therefore ports only the behavioralStatelessAuthenticationFilterhunk (identical to the one shipped indc57ffded8).Defense-in-depth follow-ups (not in this PR, optional): fix the latent null-local bug in
DSpaceRequestContextFilter(read the Context insidefinally), and add@DynamicUpdate/@VersiontoItem/DSpaceObjectfor a deterministic optimistic-lock guard. The read-path guard for the 500 (CanManageMappings) is #1348.Manual Testing (if applicable)
Must confirm on a ZCU‑DATA test instance: run the repro (start a large-file submission upload, trigger the deposit concurrently). Before this PR: the item ends with
owning_collection = NULL/in_archive = false. After this PR: the item stays correct (matches LINDAT behaviour).Copilot review