Reduce seeks on file io#114
Conversation
|
Hi @rvause, yes libdicom assumes seek is (almost) free, I didn't realise it had such a significant cost on kubernetes. Thanks for the PR, I'll have a read. |
|
This is great @rvause, I feel embarrassed I missed such a large performance hole. I see a nice speedup on USB drives too. And your PR is very clean. I usually try to avoid realloc, since some platforms have very poor implementations, but it's only called a few 10s of 1000s of times on my test images, which should be ok. The only thing to add would be a line for the changelog, and a |
|
Thanks for taking the time to review this. I've added a changelog entry. Feel free to adjust the changelog entry if it could be better described. |
|
That looks perfect! I'll merge. You'll be pleased to hear that a new openslide release is coming up. I'll make a libdicom 1.3 with this change and some others, and it might be in users' hands in a just a few weeks. Thanks again for this nice work. |
|
Good to hear, will look out for that. I've been building against versions from main for both of these projects for a few months. The DICOM concatenation support has been tested at scale and is working well. |
|
Hi, am curious about the kubernetes aspect of the issue. Can you share what was the technology of the backing storage of the volume that the file was using? I appreciate some systems try to hide this away from you; but it may help others avoid using that particular filestore until the vendor of the volume driver fixes their horrible seek issue 😉 |
I was noticing very slow read times of dicom file regions using openslide. This issue was somewhat hidden until the code was running in an environment with a virtual filesystem (kubernetes managed containers). In such cases every seek and read incurs a noticeable cost.
Tracked the issue down to libdicom's file io behaviour. Using helper tool I am including here in the description it is apparent that the io layer is making many more
lseekcalls than necessary.Made two changes that work together to reduce the number of calls being made. In
dcm_io_seek_fileuse an absolute target and handle the cases where no seek is needed when target lands within the buffered window. Updateddcm_parse_encapsulated_frameto a single pass, usingdcm_reallocto grow the allocated buffer.Inspecting across reading the first 50 frames in y-x arrangement with
strace -e trace=lseek -f builddir/dcm-visit -n 50 file.dcm | grep 'lseek(' | wc -lresults in3428calls.Adding the
-rflag to read in x-y arrangement, this results in672386calls for this particular file.With the patch included here, both variants are uniform in
99calls for reading these 50 frames.As a way to demonstrate the overall improvement using
vips tiffsavewill read the entire full resolution dicom image. Run on a linux vm using virtiofs with 8 cores.Before:
After:
dcm-visit.c