ENH: Add libjpeg-turbo backend for GDCM JPEG codec#6149
ENH: Add libjpeg-turbo backend for GDCM JPEG codec#6149blowekamp wants to merge 1 commit intoInsightSoftwareConsortium:mainfrom
Conversation
Add JPEGTurboCodec that uses ITK's libjpeg-turbo instead of GDCM's vendored IJG 6b libraries. Controlled by GDCM_USE_JPEGTURBO (ON by default). When enabled, a single codec handles 8/12/16-bit JPEG via runtime precision dispatch, replacing gdcmjpeg8/12/16 libraries. The old IJG path is preserved as a fallback when the option is OFF.
|
@blowekamp I am supportive of this effort. Ideally, the upstream submission would be done first, and this side-step could be avoided. I'll monitor this PR and take your guidance on whether to wait for an upstream incorporation or move forward with this temporary incorporation. |
|
@greptile Review this draft. |
|
| Filename | Overview |
|---|---|
| Modules/ThirdParty/GDCM/src/gdcm/Source/MediaStorageAndFileFormat/gdcmJPEGTurboCodec.cxx | New 842-line codec; has two P1 bugs: fill_input_buffer returns FALSE on EOF instead of inserting fake EOI, and malloc'd tempbuffer in the planar-encode path leaks under JPEG error/longjmp. Also relies on private jpegint.h and incorrectly sets PlanarConfiguration=1 for YCbCr. |
| Modules/ThirdParty/GDCM/src/gdcm/Source/MediaStorageAndFileFormat/gdcmJPEGTurboCodec.h | Clean header declaring JPEGTurboCodec as a pimpl-based subclass of JPEGCodec; no issues found. |
| Modules/ThirdParty/GDCM/src/gdcm/Source/MediaStorageAndFileFormat/gdcmJPEGCodec.cxx | Dispatcher correctly selects JPEGTurboCodec for bit depths 1–16 under GDCM_USE_JPEGTURBO; guards and fallback path look correct. |
| Modules/ThirdParty/GDCM/src/CMakeLists.txt | Introduces GDCM_USE_JPEGTURBO as CACHE INTERNAL, which silently prevents user opt-out despite the PR describing an opt-out fallback. |
| Modules/ThirdParty/GDCM/src/gdcm/Source/MediaStorageAndFileFormat/CMakeLists.txt | Correctly conditionalises old IJG codec sources vs new TurboCodec source; include_directories and target_link_libraries wiring looks correct. |
| Modules/ThirdParty/GDCM/src/gdcm/CMakeLists.txt | Clears GDCM_LJPEG_LIBRARIES when turbo is active; properly slots into the existing LJPEG/system-LJPEG if/elseif/else chain. |
| Modules/ThirdParty/GDCM/src/gdcm/Utilities/CMakeLists.txt | One-line guard correctly skips building the IJG gdcmjpeg subdir when libjpeg-turbo is used. |
| Modules/ThirdParty/GDCM/itk-module.cmake | ITKJPEG added unconditionally to DEPENDS; correct for the default-ON turbo path, but will declare an unused dependency when GDCM_USE_JPEGTURBO is OFF. |
Comments Outside Diff (1)
-
Modules/ThirdParty/GDCM/src/CMakeLists.txt, line 22 (link)CACHE INTERNALprevents users from opting out of libjpeg-turboset(GDCM_USE_JPEGTURBO ON CACHE INTERNAL ...)hides the variable from the CMake GUI and prevents command-line override (-DGDCM_USE_JPEGTURBO=OFF). The PR description says the old IJG path is preserved as an opt-out fallback, butINTERNALmakes that fallback inaccessible. If opt-out is intended, useBOOLinstead:set(GDCM_USE_JPEGTURBO ON CACHE BOOL "Use ITK's libjpeg-turbo as the JPEG backend instead of GDCM's IJG 6b")
Reviews (1): Last reviewed commit: "ENH: Add libjpeg-turbo backend for GDCM ..." | Re-trigger Greptile
| return FALSE; // suspension | ||
|
|
There was a problem hiding this comment.
fill_input_buffer returns FALSE on stream-end instead of fake EOI
When the stream position equals the end position (end == pos), the function returns FALSE directly. In non-suspension libjpeg usage, fill_input_buffer returning FALSE signals suspension mode to the decompressor — it will not insert a synthetic end-of-image marker and the decoder enters an incorrect state. The block below that inserts a fake FF D9 EOI marker is never reached in this case.
The standard approach (matching libjpeg's example.c / the original gdcmJPEGBITSCodec.hxx) is to fall through to the gcount <= 0 branch which emits the fake EOI and returns TRUE. The early return FALSE must be removed.
| row_pointer[0] = tempbuffer; | ||
| int offset = image_height * image_width; | ||
| while (cinfo.next_scanline < cinfo.image_height) { | ||
| gdcm_assert(row_stride % 3 == 0); | ||
| JSAMPLE *ptempbuffer = tempbuffer; | ||
| JSAMPLE *red = image_buffer + cinfo.next_scanline * row_stride / 3; | ||
| JSAMPLE *green = red + offset; | ||
| JSAMPLE *blue = green + offset; | ||
| for (size_t i = 0; i < row_stride / 3; ++i) { | ||
| *ptempbuffer++ = *red++; | ||
| *ptempbuffer++ = *green++; | ||
| *ptempbuffer++ = *blue++; | ||
| } | ||
| jpeg_write_scanlines(&cinfo, row_pointer, 1); | ||
| } | ||
| free(tempbuffer); | ||
| } |
There was a problem hiding this comment.
tempbuffer leaked when JPEG error fires longjmp in planar path
tempbuffer is allocated via malloc inside the planar-configuration branch of InternalCode. The setjmp recovery block calls jpeg_destroy_compress and returns false, but never frees tempbuffer. If jpeg_write_scanlines triggers an error that fires longjmp, this allocation leaks.
Replace malloc/free with a libjpeg pool allocation so it is owned by the compress struct:
JSAMPLE *tempbuffer = (JSAMPLE *)(*cinfo.mem->alloc_small)(
(j_common_ptr)&cinfo, JPOOL_IMAGE, row_stride * sizeof(JSAMPLE));
// remove the free(tempbuffer) call below| std::streampos end = src->infile->seekg(0, std::ios::end).tellg(); | ||
| src->infile->seekg(pos, std::ios::beg); | ||
|
|
||
| if (end == pos) | ||
| return FALSE; // suspension | ||
|
|
||
| size_t toread = TURBO_INPUT_BUF_SIZE; | ||
| if ((end - pos) < (std::streamoff)TURBO_INPUT_BUF_SIZE) | ||
| toread = (size_t)(end - pos); | ||
|
|
||
| src->infile->read((char *)src->buffer, toread); | ||
| std::streamsize gcount = src->infile->gcount(); | ||
|
|
There was a problem hiding this comment.
seekg-based size probe in fill_input_buffer is expensive and breaks non-seekable streams
Seeking to end-of-stream on every buffer-fill call is expensive and silently fails for pipes, network streams, or std::istringstream-derived buffers that don't support seekg. The libjpeg convention (example.c, jdatasrc.c) is to unconditionally call read(BUFSIZE), then handle a short read via the gcount <= 0 path that already exists below.
| } else if (cinfo.jpeg_color_space == JCS_CMYK) { | ||
| gdcm_assert(cinfo.num_components == 4); | ||
| PI = PhotometricInterpretation::CMYK; |
There was a problem hiding this comment.
YCbCr path unconditionally sets
PlanarConfiguration = 1
Standard baseline JPEG stores YCbCr components interleaved (PlanarConfiguration = 0). Setting this->PlanarConfiguration = 1 tells GDCM the pixel data is planar-separated, which would cause callers to de-interleave pixel data incorrectly. The original gdcmJPEGBITSCodec.hxx does not set PlanarConfiguration to 1 for this case. Verify whether this is intentional and if not, remove or guard it.
| extern "C" { | ||
| #include "itk_jpeg.h" | ||
| #include <itkjpeg-turbo/jpegint.h> // for cinfo.master->lossless | ||
| } |
There was a problem hiding this comment.
Dependency on private libjpeg-turbo internal header
jpegint.h
#include <itkjpeg-turbo/jpegint.h> pulls in an internal, unsupported header to access cinfo->master->lossless. This field is not part of the public API and can be removed or renamed in future libjpeg-turbo releases without notice, silently breaking the build. A more stable alternative is to check cinfo.process == JPROC_LOSSLESS if available through public headers.
Replace GDCM's vendored IJG 6b JPEG libraries (
gdcmjpeg8/12/16) withITK's existing libjpeg-turbo, controlled by a
GDCM_USE_JPEGTURBOCMakeoption (default ON). The old IJG path is preserved as an opt-out fallback.
Motivation
GDCM currently vendors a patched IJG 6b JPEG library (~1998 vintage) built
as three separate static libraries (
gdcmjpeg8,gdcmjpeg12,gdcmjpeg16)for the three DICOM-required precisions. ITK already vendors libjpeg-turbo
3.0.x with native multi-precision support. Using a single modern library:
jpeg_enable_lossless()Changes
gdcmJPEGTurboCodec.h/.cxxgdcmJPEGCodec.cxxJPEGTurboCodecwhen turbo is enabledMediaStorageAndFileFormat/CMakeLists.txtgdcm/CMakeLists.txtGDCM_USE_JPEGTURBOclearsGDCM_LJPEG_LIBRARIESgdcm/Utilities/CMakeLists.txtgdcmjpegsubdir when turbo is usedGDCM/src/CMakeLists.txtITKJPEGinclude dirs and libraryGDCM/itk-module.cmakeITKJPEGtoDEPENDSImplementation notes
jpeg12_read_scanlines()and
jpeg16_read_scanlines()prefixed functions; the codec dispatches atruntime based on
cinfo.data_precision.jpegint.h'smaster->losslessfield (sametechnique as libtiff's
tif_jpeg.c), since libjpeg-turbo exposes no publicprocessfield.jpeg_enable_lossless()replaces GDCM's IJG-specificjpeg_simple_lossless().jpeg_source_mgr/jpeg_destination_mgrimplementationsbridge libjpeg-turbo's scanline API to GDCM's
std::istream/std::ostream.Future work
This is an initial in-tree hack — the intent is to upstream the
JPEGTurboCodecto GDCM proper so ITK's vendored copy can track upstreamwithout carrying a large patch. A follow-up PR to the GDCM repository is
planned.
Test results (local)
Built and tested against the
defaultpreset (Release, x86-64 Linux):All 18 GDCM JPEG tests pass, including:
itkGDCMImageReadWriteTest_JPEGBaseline1(8-bit lossy)itkGDCM_ComplianceTestRGB_losslessJPEG-RGB(lossless SOF3)itkGDCM_ComplianceTestRGB_lossyJPEG-YBR_FULL_422(YCbCr lossy)Confirmed
gdcmjpeg8target no longer built (ninja: error: unknown target 'gdcmjpeg8').AI assistance
This PR was generated by a GitHub Copilot agent (Claude Sonnet 4.6) with
human direction and review.
authoring, build error diagnosis, and test verification
patch), option naming, build preset selection, code review, and all commits
and ctest commands and iterated on compilation errors (incorrect include
path for
jpegint.h, invertedif/elsebranch in CMakeLists)This is an agent-driven initial effort. The code should be reviewed
carefully before merge, particularly the stream source/dest managers and
the multi-precision scanline dispatch paths.