Skip to content

[io] Several fixes to handling of gaps and recovery#22481

Open
jblomer wants to merge 11 commits into
root-project:masterfrom
jblomer:tfile-fix-gaps-v2
Open

[io] Several fixes to handling of gaps and recovery#22481
jblomer wants to merge 11 commits into
root-project:masterfrom
jblomer:tfile-fix-gaps-v2

Conversation

@jblomer

@jblomer jblomer commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Fixes

  • Free segment recovery
  • Free segment header writing for large gaps (>2GB)

Replaces #19251
Fixes #19245

@jblomer jblomer self-assigned this Jun 4, 2026
@jblomer jblomer requested review from bellenot and pcanal as code owners June 4, 2026 13:44
@jblomer jblomer added the in:I/O label Jun 4, 2026
@jblomer jblomer changed the title [io] several fixes to handling of gaps and recovery [io] Several fixes to handling of gaps and recovery Jun 4, 2026
@jblomer jblomer force-pushed the tfile-fix-gaps-v2 branch 2 times, most recently from 84edc84 to b13a691 Compare June 4, 2026 14:06
Comment thread io/io/test/TFileTests.cxx Outdated
Comment thread io/io/test/TFileTests.cxx Outdated
Comment thread io/io/test/TFileTests.cxx
Comment thread io/io/src/TFile.cxx
Comment thread io/io/src/TFile.cxx
}
if (fWritable) {
const Long64_t last = idcur - nbytes - 1;
if (fFree->Last() && static_cast<TFree *>(fFree->Last())->GetLast() + 1 == idcur) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like rather than equality a greater than would be more 'stable', isn't it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer the strict equality because we construct it this way in the loop before. I.e., we want to pass the condition exactly when we added a free segment in the last loop iteration and in this case we constructed idcur to point just after the last free byte.

Comment thread io/io/src/TFile.cxx
if (nrecov) Write();
Long64_t max_file_size = Long64_t(kStartBigFile);
if (max_file_size < fEND)
max_file_size = fEND + 1000000000;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the semantic of 1000000000? Should we either add a comment and/or use a const to express it?

@jblomer jblomer Jun 4, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't figure out what the semantics of this value is. This constant was there before and it is sprinkled here and there. I guess it was meant to be "large enough gap so that we can fit any possible key" but mistakenly using 1GB instead of 1GiB. That's just my guess though.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historical review.

When supporting only small files, max_file_size was 2GB and a node in the free list was added from fEND to 2GB.

Then max_file_size was updated for (only) 64 bits platforms to 500 times 2GB with the correspond node in the free list (from fEND to 500*2GB)

Then this was 'simplifiedtomax_file_sizebeingkStartBigFileorkStartBigFIle + 1GBiffEND > kStartBigFile`.

So I guess you are right and we ought to 'fix' it to be 1GiB.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

Test Results

    22 files      22 suites   3d 12h 25m 41s ⏱️
 3 855 tests  3 854 ✅ 0 💤  1 ❌
77 075 runs  77 055 ✅ 0 💤 20 ❌

For more details on these failures, see this check.

Results for commit 2f9f3a1.

♻️ This comment has been updated with latest results.

jblomer added 6 commits June 5, 2026 13:35
The free segment header for gaps >2GB was previously truncated, leaving
the chain of segments broken. Now we will link together several smaller
gaps on disk to keep the chain intact. The free list, however, will
still contain one large gap.
In TFile::Recover(), incorrect free segments were added during recovery.
The file offset counter should be incremented _after_ adding the free
segment, as it is done now.

Also, remove an unnecessary `Seek()`, which will take place at the next
loop iteration anyway.
@jblomer jblomer added the clean build Ask CI to do non-incremental build on PR label Jun 5, 2026
jblomer added 5 commits June 5, 2026 13:53
Do not use an existing free list during file recovery. It will result in
stale and/or duplicated entries in the recovered free list. Instead,
only use the information picked up from the segment headers.
@jblomer jblomer force-pushed the tfile-fix-gaps-v2 branch from b13a691 to 2f9f3a1 Compare June 5, 2026 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clean build Ask CI to do non-incremental build on PR in:I/O

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TFile issues when coalescing gaps

2 participants