Skip to content

ENH: Ingest ITKPrincipalComponentsAnalysis into Modules/Numerics#6240

Merged
hjmjohnson merged 71 commits intoInsightSoftwareConsortium:mainfrom
hjmjohnson:ingest-PrincipalComponentsAnalysis
May 9, 2026
Merged

ENH: Ingest ITKPrincipalComponentsAnalysis into Modules/Numerics#6240
hjmjohnson merged 71 commits intoInsightSoftwareConsortium:mainfrom
hjmjohnson:ingest-PrincipalComponentsAnalysis

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

Ingest the standalone remote module
InsightSoftwareConsortium/ITKPrincipalComponentsAnalysis
into Modules/Numerics/PrincipalComponentsAnalysis/. PCA on scalar,
vector, and mesh-vertex data is now in-tree under
EXCLUDE_FROM_DEFAULT.

Built directly on upstream/main (post-#6236, so pixi 0.68 line
continuations apply); the v4 ingestion guidelines defined in #6204
(sanitize-history pass, deny-pattern pass, whitelist-only paths,
ghostflow-clean per-commit subjects) were applied locally to produce
this PR's content. Closes the consolidation work of issue #6060 for
this module.

Ingest results
Aspect Value
Whitelist Utilities/Maintenance/RemoteModuleIngest/whitelists/PrincipalComponentsAnalysis.list
Final tree merge count 16 (incl. ingest merge)
Whitelisted files 50 (include/, test/, wrapping/, CMakeLists.txt, itk-module.cmake, LICENSE)
Content-links at HEAD All .cid (upstream already used ExternalData), 0 .md5, 0 .shaNNN, 0 raw binaries
examples/ and doc/ directories Auto-excluded; remain in archived upstream
License Apache-2.0
git blame Walks back to original upstream contributors (Marc Bowers, Laurent Younes, and others)
Local validation
  • pre-commit run --all-files: clean
  • cmake -DModule_PrincipalComponentsAnalysis:BOOL=ON .: configures clean
  • Compile check: new module's test source files build clean
    (PrincipalComponentsAnalysisHeaderTest1.cxx.o,
    itkVectorKernelPCATest.cxx.o).
  • Test execution will fetch the existing .cid blobs through the
    standard ITKTestingData mirror at CI time.
Commits at PR tip (top → bottom)
  1. ENH: Enable Module_PrincipalComponentsAnalysis in configure-ci
  2. COMP: Remove PrincipalComponentsAnalysis .remote.cmake (in-tree)
  3. DOC: Add PrincipalComponentsAnalysis README pointing at upstream
  4. ENH: Ingest ITKPrincipalComponentsAnalysis into Modules/Numerics
    (merge commit --no-ff --allow-unrelated-histories to
    sanitized filter-repo'd upstream history; preserves 15 upstream
    merges and git blame to original authors)
  5. ENH: Add v4 whitelist for PrincipalComponentsAnalysis ingest
Phase B (post-merge) — upstream archival

After this PR merges, run Phase B from a fresh worktree:

Utilities/Maintenance/RemoteModuleIngest/archive-remote-module.sh \
    PrincipalComponentsAnalysis

Refs: #6060

Johan Andruejol and others added 30 commits May 24, 2016 11:04
This is the initial commit of ITKPCA. The code is taken as is from the
insight journal article http://hdl.handle.net/10380/3386.
Credits goes to original authors Bowers M., Younes L..
This is the first part of a series of commit to convert this repository to
work with ITK's external modules infrastructure.
This commit allows to configure/generate the module against an ITK build.
The itkVectorFieldPCA.* files were moved to the folder named include which
the external module expects for header files.
The itk-module.cmake was added to describe the module and its dependencies.
The external module needs its testing to be under the test folder. The
itkVectorKernelPCATest was moved accordingly.
This commit does not build. But since we moved a file from one folder to
another, we will make the appropriate change in the next commit so they
stand out as actual change inside the file.
This fixes the KWStyle test.
This fixes the documentation test.
Also remove unused data.
While moving to the new external module architecture, these options were
removed to facilitate the transition. Now that is is done, these options
are re-added.
Legacy code was including the ${ITK_USE_FILE} which made the include of the
IO factories as dependencies mandatory to be able to link. By removing that
include, we can remove unnecessary dependencies.
This will prevent accidentaly setting those variables. For example, ITK has
an option BUILD_EXAMPLE that was setting BUILD_EXAMPLE to ON in this
repository and breaking compilation.
Also make sure when building the examples that ITK is found.
ITKMesh, ITKIOMesh, and ITKIOImageBase modules are required to build
VectorKernelPCA.
BUG: Example (VectorKernelPCA) was missing dependencies
Add the 'Module_' prefix to the module's flags so that they can be grouped
together with the module itself by the CMake GUI.

Change the 'Examples' and 'Documentation' folder names to 'examples' and
'doc' respectively.

Make the module's example and documentation build flags dependent on
ITK's BUILD_EXAMPLES and BUILD_DOCUMENTATION flags: if the latter are
set to OFF, the module's options will remain hidden and defaulted to OFF.
Use the ITK_DISALLOW_COPY_AND_ASSIGN macro to save typing the copy and
assignment method disallow statements.

Use initialization lists: initialize the missing m_VertexCount ivar to
zero; SmartPointer initialize themselves to the null pointer.

Print all member variables. Use the itkPrintSelfObjectMacro macro to
print SmartPointers.

Remove unnecessary/ducplicate/empty Doxygen-style, out-of-method-body
documentation from the implementation file/use it to improve the method
documentation in the header file.

Use the 'this' keyword when calling self functions.

Make flow control variables have a more local scope.

Start comments with capital letters.
Exercise basic object methods. Hence, remove the unncessary calls to the
GetNameOfClass or Print methods of the class instance.

Exercise all Set/Get methods using the TEST_SET_GET_VALUE macro.

Remove unnecessary print of input mesh vertex and cell count.

Refactor the test in order to improve the readability: create a helper
function to parse the input vector fields.

Other style changes for the sake of consistency across ITK: return
codes, white spaces, comment styles, etc.
ENH: Improve code coverage for itk::VectorFieldPCA.
STYLE: Improve the itkVectorFieldPCA class style.
Add the #ifndef ITK_MANUAL_INSTANTIATION statement for explicit object
instantiation.
…antiation

ENH: Include the ITK_MANUAL_INSTANTIATION macro.
…ulesFlags

AndSubdirs

ENH: Honor the ITK conventions to name modules, flags and folders.
Require CMake minimum version to be 3.9.5 following ITKv5 requiring
C++11:
https://discourse.itk.org/t/minimum-cmake-version-update/585
…umVersion

ENH: Require cmake minimum version to be 3.9.5.
ITKv5 removed deprecated interfaces.
Provide remove virtual and override

Use clang-tidy to add ITK_OVERRIDE, and to remove
redundant virtual on functions.

cd ../ITK;
clang-tidy -p ITK-clangtidy $find Modules/[A-J]*  -name *.cxx |fgrep -v ThirdParty) -checks=-*,modernize-use-override  -header-filter=.* -fix
clang-tidy -p ITK-clangtidy $(find Modules/[K-Z]*  -name *.cxx |fgrep -v ThirdParty) -checks=-*,modernize-use-override  -header-filter=.* -fix

https://stackoverflow.com/questions/39932391/virtual-override-or-both-c

When you override a function you don't technically need to write either virtual
or override.

The original base class declaration needs the keyword virtual to mark it as virtual.

In the derived class the function is virtual by way of having the ¹same type as
the base class function.

However, an override can help avoid bugs by producing a compilation error when
the intended override isn't technically an override. E.g. that the function
type isn't exactly like the base class function. Or that a maintenance of the
base class changes that function's type, e.g. adding a defaulted argument.

In the same way, a virtual keyword in the derived class can make such a bug
more subtle, by ensuring that the function is still is virtual in further
derived classes.

So the general advice is,

virtual for the base class function declaration.
This is technically necessary.

Use override (only) for a derived class' override.
This helps with maintenance.
git grep -l "ITK_OVERRIDE" |   fgrep -v itk_compiler_detection.h | fgrep -v CMakeLists.txt |fgrep -v .cmake |   xargs sed -i '' -e "s/ITK_OVERRIDE/override/g"
hjmjohnson and others added 7 commits March 1, 2020 17:07
Find and remove redundant void argument lists.
Converts a default constructor’s member initializers into the new
default member initializers in C++11. Other member initializers that match the
default member initializer are removed. This can reduce repeated code or allow
use of ‘= default’.
The mission of NumFOCUS is to promote open
practices in research, data, and scientific
computing.

https://numfocus.org
Fixes changes made in InsightSoftwareConsortium#2053. ITK_DISALLOW_COPY_AND_ASSIGN will be used if ITK_FUTURE_LEGACY_REMOVE=OFF.
The ability to include either .h or .hxx files as
header files required recursively reading the
.h files twice.  The added complexity is
unnecessary, costly, and can confuse static
analysis tools that monitor header guardes (due
to reaching the maximum depth of recursion
limits for nested #ifdefs in checking).
@github-actions github-actions Bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Enhancement Improvement of existing methods or implementation area:Python wrapping Python bindings for a class type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct area:Numerics Issues affecting the Numerics module area:Remotes Issues affecting the Remote module labels May 8, 2026
Copy link
Copy Markdown
Member

@dzenanz dzenanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes are needed.

Comment thread Modules/Numerics/PrincipalComponentsAnalysis/itk-module.cmake Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/CMakeLists.txt Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/CMakeLists.txt Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/CMakeLists.txt Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/LICENSE Outdated
hjmjohnson and others added 6 commits May 8, 2026 13:05
Brings PrincipalComponentsAnalysis from a configure-time remote fetch into the ITK
source tree at Modules/Numerics/PrincipalComponentsAnalysis/ using the v4 ingestion
pipeline (whitelist filter-repo + per-commit clang-format + black +
commit-prefix sanitization).

Upstream repo:  https://github.com/InsightSoftwareConsortium/ITKPrincipalComponentsAnalysis.git
Upstream tip:   2e713d887d52153399700c02874774fe5bee3c20
Ingest date:    2026-05-08
Whitelist:      PrincipalComponentsAnalysis.list

Per-commit transforms applied across all 64 commits:
  - filter-repo --paths-from-file (whitelist)
  - filter-repo --to-subdirectory-filter Modules/Numerics/PrincipalComponentsAnalysis
  - clang-format -style=file (ITK main's .clang-format) for *.cxx/.h/.hxx/...
  - black for *.py
  - heuristic ITK prefix added to commit subjects without one

Merge topology preserved: 27 -> 15 merge(s).

Primary author: Johan Andruejol <Johan Andruejol>

Co-authored-by: Dženan Zukić <dzenan.zukic@kitware.com>
Co-authored-by: Francois Budin <francois.budin@gmail.com>
Co-authored-by: Hans J. Johnson <hans-johnson@uiowa.edu>
Co-authored-by: Hans Johnson <hans-johnson@uiowa.edu>
Co-authored-by: Hans Johnson <hans.j.johnson@gmail.com>
Co-authored-by: Johan Andruejol <johan.andruejol@kitware.com>
Co-authored-by: Jon Haitz Legarreta <jhlegarreta@vicomtech.org>
Co-authored-by: Jon Haitz Legarreta Gorroño <jhlegarreta@vicomtech.org>
Co-authored-by: Jon Haitz Legarreta Gorroño <jon.haitz.legarreta@gmail.com>
Co-authored-by: Mathew Seng <mathewseng@gmail.com>
Co-authored-by: Matt McCormick <matt.mccormick@kitware.com>
Co-authored-by: Matt McCormick <matt@mmmccormick.com>
Co-authored-by: Tom Birdsong <tom.birdsong@kitware.com>
itk-module.cmake: pass DOCUMENTATION through to DESCRIPTION
(was overwritten by a literal string).

CMakeLists.txt: drop the cmake_policy(CMP0003) shim, the
itk_module_examples() call, and the doc-build option block —
examples and doc/ were intentionally not ingested.

LICENSE: remove the per-module copy; ITK's root LICENSE applies.
@hjmjohnson hjmjohnson force-pushed the ingest-PrincipalComponentsAnalysis branch from e2737b1 to c51445a Compare May 8, 2026 18:08
@hjmjohnson hjmjohnson marked this pull request as ready for review May 8, 2026 18:12
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR ingests the standalone remote module ITKPrincipalComponentsAnalysis into Modules/Numerics/PrincipalComponentsAnalysis/, removing the corresponding Modules/Remote/*.remote.cmake file and enabling it in CI. The module provides itk::VectorFieldPCA for PCA on scalar, vector, and mesh-vertex data, along with a GaussianDistanceKernel helper, and is marked EXCLUDE_FROM_DEFAULT.

  • The header itkVectorFieldPCA.h defines both GaussianDistanceKernel and VectorFieldPCA; GaussianDistanceKernel has uninitialized double member fields that can trigger undefined behavior if Evaluate() is called before SetKernelSigma().
  • The implementation contains a structural dead-code pattern (redundant if (m_PointSet) after an unconditional throw) and the m_VertexCount member is never assigned.
  • The test file carries a duplicate #include and an unused debugOut stream variable inherited from the upstream.

Confidence Score: 4/5

The module is marked EXCLUDE_FROM_DEFAULT so it won't affect existing builds; the main concern is an uninitialized-member issue in GaussianDistanceKernel that would only affect callers who invoke Evaluate() before SetKernelSigma().

The overall ingest is well-structured and follows the v4 guidelines. The one substantive concern is that GaussianDistanceKernel's default constructor leaves m_KernelSigma and m_OneOverMinusTwoSigmaSqr uninitialized — calling Evaluate() on a freshly-constructed kernel before SetKernelSigma() produces undefined behavior. All other findings are editorial cleanups inherited from the upstream and do not affect correctness for normal use.

Modules/Numerics/PrincipalComponentsAnalysis/include/itkVectorFieldPCA.h deserves a second look for the uninitialized members in GaussianDistanceKernel.

Important Files Changed

Filename Overview
Modules/Numerics/PrincipalComponentsAnalysis/include/itkVectorFieldPCA.h Header for GaussianDistanceKernel and VectorFieldPCA; GaussianDistanceKernel has uninitialized double members and m_VertexCount is declared but never written.
Modules/Numerics/PrincipalComponentsAnalysis/include/itkVectorFieldPCA.hxx Template implementation of VectorFieldPCA; contains a redundant always-true null check after an unconditional throw.
Modules/Numerics/PrincipalComponentsAnalysis/test/itkVectorKernelPCATest.cxx Test for kernel PCA; has a duplicate vnl_vector.h include and an unused debugOut std::ofstream variable.
Modules/Numerics/PrincipalComponentsAnalysis/itk-module.cmake Module descriptor; correctly lists ITKCommon, ITKMesh, ITKIOMesh, ITKIOImageBase as DEPENDS and sets EXCLUDE_FROM_DEFAULT.
Modules/Numerics/PrincipalComponentsAnalysis/test/CMakeLists.txt Test CMakeLists correctly registers itkVectorKernelPCATest with all 41 ExternalData .cid inputs via DATA{}.
Modules/Remote/PrincipalComponentsAnalysis.remote.cmake Remote module descriptor correctly removed now that the module is in-tree.
pyproject.toml Adds -DModule_PrincipalComponentsAnalysis:BOOL=ON to the configure-ci task so the new module is built in CI.
Modules/Numerics/PrincipalComponentsAnalysis/wrapping/itkGaussianDistanceKernel.wrap Wraps GaussianDistanceKernel for all real types via ITK's standard itk_wrap_class/itk_wrap_template pattern; looks correct.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class KernelFunctionBase~TRealValueType~ {
        <<abstract>>
        +Evaluate(u: TRealValueType) TRealValueType
    }
    class GaussianDistanceKernel~TRealValueType~ {
        -m_KernelSigma: double
        -m_OneOverMinusTwoSigmaSqr: double
        +SetKernelSigma(s: double)
        +Evaluate(u: TRealValueType) TRealValueType
    }
    class Object {
        <<ITK>>
    }
    class VectorFieldPCA~TVectorFieldElementType, TPCType~ {
        -m_PCAEigenValues: VectorType
        -m_BasisVectors: BasisSetTypePointer
        -m_VectorFieldSet: VectorFieldSetTypePointer
        -m_PointSet: InputPointSetPointer
        -m_KernelFunction: KernelFunctionPointer
        -m_ComponentCount: unsigned int
        -m_VertexCount: unsigned int
        +Compute()
        +GetAveVectorField() MatrixType
        +GetPCAEigenValues() VectorType
        +GetBasisVectors() BasisSetType
    }
    KernelFunctionBase <|-- GaussianDistanceKernel
    Object <|-- VectorFieldPCA
    VectorFieldPCA --> KernelFunctionBase : uses (optional)
Loading

Reviews (1): Last reviewed commit: "COMP: Address PrincipalComponentsAnalysi..." | Re-trigger Greptile

Comment thread Modules/Numerics/PrincipalComponentsAnalysis/include/itkVectorFieldPCA.h Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/include/itkVectorFieldPCA.hxx Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/test/itkVectorKernelPCATest.cxx Outdated
Comment thread Modules/Numerics/PrincipalComponentsAnalysis/test/itkVectorKernelPCATest.cxx Outdated
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 8, 2026
Adds five fixups that reviewers have flagged on every recent ingest
(PRs InsightSoftwareConsortium#6238 TextureFeatures, InsightSoftwareConsortium#6240 PrincipalComponentsAnalysis):

  * itk_module_examples()   — examples/ is not ingested
  * cmake_policy(CMP...)    — ITK top-level pins all policies
  * cmake_dependent_option(Module_${X}_BUILD_DOCUMENTATION ...)
    and the matching if(...) add_subdirectory(doc) endif() block
  * per-module LICENSE / COPYING — ITK's root LICENSE applies in-tree
  * ${DOCUMENTATION} kept as-is when sourced from a literal
    set(DOCUMENTATION ...) block; only the file(READ README.md ...)
    form is replaced by a static one-liner.
itkVectorFieldPCA.h: in-class-initialize the GaussianDistanceKernel
members so Evaluate() before SetKernelSigma() is well-defined; drop
the unused m_VertexCount member that was never assigned.

itkVectorFieldPCA.hxx: drop the redundant inner if(m_PointSet) check
— control reaches it only after the preceding if(!m_PointSet) throw.
Remove the m_VertexCount line from PrintSelf().

itkVectorKernelPCATest.cxx: drop the duplicated vnl_vector.h include
and the unused debugOut std::ofstream.
@hjmjohnson hjmjohnson merged commit 4c27e0b into InsightSoftwareConsortium:main May 9, 2026
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Numerics Issues affecting the Numerics module area:Python wrapping Python bindings for a class area:Remotes Issues affecting the Remote module type:Enhancement Improvement of existing methods or implementation type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants