ENH: Embed UTF-8 active-code-page manifest in Windows test executables (supersedes #4390) #6231

Merged: hjmjohnson merged 1 commit into InsightSoftwareConsortium:main from hjmjohnson:windows-utf8-active-codepage on May 7, 2026

@hjmjohnson
Member

Embed a UTF-8 active-code-page manifest into ITK's Windows test executables so non-ASCII (umlaut, CJK, emoji) filenames work without changes to the C++ filename API. Mirrors SuperElastix/elastix#1401 by @codeling. Closes the on-Windows portion of #4388.

Supersedes #4390 (@Pfleiderer-Adrian), which proposed rejecting non-ASCII filenames in the Python wrapper. Reviewers (@N-Dekker, @blowekamp, @dzenanz) preferred enabling UTF-8 properly rather than tightening the contract. @Pfleiderer-Adrian is credited via a Co-Authored-By: trailer.

What this changes
  • CMake/Windows-utf8-codepage.manifest: 8-line XML declaring <activeCodePage>UTF-8</activeCodePage>.
  • CMake/ITKWindowsUtf8.cmake: itk_target_attach_windows_utf8_manifest(<target>) helper (no-op outside MSVC).
  • Auto-attach from CreateTestDriver, CreateGoogleTestDriver, and the standalone itkTestDriver build site. Examples and downstream module CMakeLists can call the helper explicitly for their own executables.
  • SetConsoleOutputCP(CP_UTF8) at test-driver entry so stdout/stderr render UTF-8 byte sequences correctly on the console (cosmetic; the manifest is what makes file I/O work).
  • itkUtf8FilenameGTest.cxx: round-trips a small file at a path containing UTF-8 characters ("speci" + U+00E4 + "l-fil" + U+1F44D + ".txt") to lock in the behavior.
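
For readers who want to see what the last two bullets amount to, here is a minimal sketch of a console set-up call and a narrow-path round trip. It is not the code from itkUtf8FilenameGTest.cxx or the test driver. The helper names EnableUtf8ConsoleOutput and RoundTripUtf8Path are made up for illustration, and the file name is spelled as raw UTF-8 bytes so the source encoding does not matter. On Linux the round trip is byte-transparent. On Windows it is expected to succeed only when the executable carries the UTF-8 active-code-page manifest and runs on Windows 10 1903 or later.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

#ifdef _WIN32
#  include <windows.h>
#endif

// Hypothetical helper (not ITK API): ask the Windows console to interpret
// bytes written to stdout/stderr as UTF-8. Does nothing on other platforms.
inline void
EnableUtf8ConsoleOutput()
{
#ifdef _WIN32
  SetConsoleOutputCP(CP_UTF8);
#endif
}

// Hypothetical helper (not ITK API): write `payload` to a file whose name
// contains non-ASCII UTF-8 bytes, read it back through the narrow-char API,
// delete the file, and report whether the content survived unchanged.
inline bool
RoundTripUtf8Path(const std::string & directory, const std::string & payload)
{
  // "speci" + U+00E4 + "l-fil" + U+1F44D + ".txt" as explicit UTF-8 bytes.
  const std::string path = directory + "/" +
                           "speci" "\xC3\xA4" "l-fil" "\xF0\x9F\x91\x8D" ".txt";

  {
    std::ofstream out(path, std::ios::binary);
    if (!out)
    {
      return false; // the narrow path could not be opened for writing
    }
    out << payload;
  }

  std::string readBack;
  {
    std::ifstream in(path, std::ios::binary);
    if (!in)
    {
      return false; // the narrow path could not be reopened for reading
    }
    readBack.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
  }

  std::remove(path.c_str()); // best-effort cleanup
  return readBack == payload;
}
```

A test would call EnableUtf8ConsoleOutput() once at start-up and then check that RoundTripUtf8Path(someWritableDirectory, "hello") returns true.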
What this does NOT change (Tier B follow-up)

This PR is the executable-layer fix only. SimpleITK / Python wrappers cannot benefit because python.exe is shipped by upstream Python and ITK cannot attach a manifest to it. To make non-ASCII filenames work for Python consumers on Windows, ITK must wire itk::i18n_open / i18n_fopen (already present at Modules/IO/ImageBase/include/itkInternationalizationIOHelpers.h, gated by ITK_SUPPORTS_WCHAR_T_FILENAME_CSTYLEIO, but unused outside its own unit test) through itkImageIOBase.cxx + itkPNGImageIO.cxx + itkJPEGImageIO.cxx.
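
As a rough illustration of the library-side approach described above, the sketch below opens a UTF-8 path by converting it to UTF-16 and calling the wide-character C runtime on Windows, so the result does not depend on the host executable's manifest. This is not the implementation of itk::i18n_fopen, whose signature is not shown in this PR. OpenUtf8Path is a hypothetical name, and the sketch assumes the caller always passes UTF-8.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#ifdef _WIN32
#  include <windows.h>
#  include <cwchar>
#endif

// Hypothetical helper (NOT itk::i18n_fopen): open a file whose path is given
// as UTF-8, independent of the process's active code page.
// Returns nullptr on failure, like std::fopen.
inline std::FILE *
OpenUtf8Path(const std::string & utf8Path, const char * mode)
{
#ifdef _WIN32
  // Convert UTF-8 to UTF-16; an empty result signals a conversion failure.
  auto toWide = [](const std::string & s) -> std::wstring {
    if (s.empty())
    {
      return std::wstring();
    }
    const int length = MultiByteToWideChar(
      CP_UTF8, MB_ERR_INVALID_CHARS, s.data(), static_cast<int>(s.size()), nullptr, 0);
    if (length <= 0)
    {
      return std::wstring();
    }
    std::wstring wide(static_cast<std::size_t>(length), L'\0');
    MultiByteToWideChar(
      CP_UTF8, MB_ERR_INVALID_CHARS, s.data(), static_cast<int>(s.size()), &wide[0], length);
    return wide;
  };

  const std::wstring widePath = toWide(utf8Path);
  const std::wstring wideMode = toWide(mode);
  if (widePath.empty() || wideMode.empty())
  {
    return nullptr;
  }
  std::FILE * file = nullptr;
  if (_wfopen_s(&file, widePath.c_str(), wideMode.c_str()) != 0)
  {
    return nullptr;
  }
  return file;
#else
  // POSIX file systems treat paths as opaque bytes, so UTF-8 passes through.
  return std::fopen(utf8Path.c_str(), mode);
#endif
}
```

A helper of this shape works inside python.exe because it never relies on the narrow-char Win32 APIs that the manifest reconfigures.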

That follow-up is captured locally as a deferred task and will be a separate PR co-authored with @Pfleiderer-Adrian.

Local validation
  • ninja ITKTestKernelGTestDriver itkTestDriver: clean build.
  • ./bin/ITKTestKernelGTestDriver --gtest_filter='WindowsUtf8Codepage.*': [ PASSED ] 1 test on Linux (validates byte-transparent round-trip; Windows behavior is the CI gate).
  • pre-commit run --all-files: clean.

Closes #4388 (Windows portion).

@github-actions github-actions Bot added the type:Infrastructure, type:Enhancement, type:Testing, and area:Core labels May 7, 2026
Member

@dzenanz dzenanz left a comment

Mostly looks good.

Comment thread Modules/Core/TestKernel/test/itkUtf8FilenameGTest.cxx Outdated
@hjmjohnson hjmjohnson force-pushed the windows-utf8-active-codepage branch from e685000 to 32bce76 Compare May 7, 2026 17:13
@hjmjohnson hjmjohnson marked this pull request as ready for review May 7, 2026 17:16
@greptile-apps

This comment was marked as resolved.

Comment thread Modules/Core/TestKernel/test/itkUtf8FilenameGTest.cxx Outdated
Comment thread Modules/Core/TestKernel/test/itkUtf8FilenameGTest.cxx Outdated
…ables

Mirrors the elastix recipe (SuperElastix/elastix#1401, by Bernhard
Fröhler) so ITK test executables on Windows treat narrow-char Win32
APIs (CreateFileA, fopen, std::ifstream(const char *), ...) as UTF-8
on Windows 10 1903 and later. Closes the on-Windows portion of issue
InsightSoftwareConsortium#4388 (non-ASCII / umlaut paths fail in image readers under non-UTF-8
active code pages).

This is the executable-layer fix only. SimpleITK / Python wrappers
(python.exe is shipped by upstream Python and cannot have an ITK
manifest attached) still need the ImageIOBase / PNG / JPEG library-
side conversion to wide paths via the existing-but-unused
itkInternationalizationIOHelpers; that follow-up is captured for a
later PR.

What's added
------------
- CMake/Windows-utf8-codepage.manifest — 8-line XML declaring
  <activeCodePage>UTF-8</activeCodePage>.
- CMake/ITKWindowsUtf8.cmake — itk_target_attach_windows_utf8_manifest()
  function (no-op outside MSVC).
- Auto-attach from CreateTestDriver, CreateGoogleTestDriver, and the
  standalone itkTestDriver. Examples or downstream module CMakeLists
  can call the helper explicitly for their own executables.
- SetConsoleOutputCP(CP_UTF8) at test driver entry so stdout / stderr
  also render UTF-8 byte sequences correctly on the console (cosmetic;
  the manifest is what makes file-IO work).
- itkUtf8FilenameGTest.cxx — round-trip a small file at a path
  containing non-ASCII (UTF-8) characters to lock in the behavior.

Supersedes InsightSoftwareConsortium#4390 (Pfleiderer-Adrian). That PR
proposed rejecting non-ASCII filenames in the Python wrapper; the
reviewers (N-Dekker, blowekamp, dzenanz) preferred enabling UTF-8
properly rather than tightening the contract.

Co-Authored-By: Adrian Pfleiderer <42115394+Pfleiderer-Adrian@users.noreply.github.com>
@hjmjohnson hjmjohnson force-pushed the windows-utf8-active-codepage branch from 32bce76 to 4eca583 Compare May 7, 2026 17:26
@hjmjohnson hjmjohnson merged commit 660e0e1 into InsightSoftwareConsortium:main May 7, 2026
19 checks passed
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 12, 2026
…dows-utf8-active-codepage

ENH: Embed UTF-8 active-code-page manifest in Windows test executables (supersedes InsightSoftwareConsortium#4390)

Development

Successfully merging this pull request may close these issues.

Unable to load images with Umlauts in filepath (ITK / SimpleITK)

2 participants