Optimize repeated shouldIgnoreStatistics calls during footer reading in ParquetMetadataConverter

### Describe the enhancement requested

`CorruptStatistics.shouldIgnoreStatistics(String createdBy, PrimitiveTypeName columnType)` performs `VersionParser.parse()` and `SemanticVersion.parse()` on every invocation. The `createdBy` string is constant per file (from `FileMetaData.created_by`), but this method is called once per column chunk per row group during `ParquetMetadataConverter.fromParquetMetadata()`.

For a file with R row groups and C columns, the version string is parsed R×C times during file open — all yielding the same result.

**Impact**
High CPU usage during `ParquetFileReader` construction on files with many row groups/columns.

**Proposed fix**
Compute `shouldIgnoreStatistics` once per file before the row group loop in `fromParquetMetadata`, and pass the pre-computed boolean through `buildColumnChunkMetaData` → `fromParquetStatisticsInternal`.

Since `buildColumnChunkMetaData` and `fromParquetStatisticsInternal` are public/package-level API methods whose signatures are checked by japicmp-maven-plugin, we will add overloads that accept a boolean `shouldIgnoreBinaryStats` parameter rather than change the existing signatures. The existing `String createdBy` signatures remain for backward compatibility and delegate to the new overloads.
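A minimal, self-contained sketch of the hoist-plus-overload pattern described above. It is not the actual parquet-java code: `FooterStatsSketch`, `convertChunk`, `readFooterPerChunk`, `readFooterHoisted`, and the placeholder decision rule are hypothetical stand-ins, and the column-type argument is left out for brevity. The counter only exists to make the number of parses visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FooterStatsSketch {
    // Counts how often the stand-in version parse runs.
    static final AtomicInteger parseCalls = new AtomicInteger();

    // Hypothetical stand-in for CorruptStatistics.shouldIgnoreStatistics.
    // The real method parses createdBy; this placeholder rule is an assumption.
    static boolean shouldIgnoreStatistics(String createdBy) {
        parseCalls.incrementAndGet();
        return createdBy == null || createdBy.isEmpty();
    }

    // Existing-style signature: kept for compatibility, delegates to the new overload.
    static String convertChunk(String createdBy, String column) {
        return convertChunk(shouldIgnoreStatistics(createdBy), column);
    }

    // New overload: accepts the pre-computed decision instead of the raw string.
    static String convertChunk(boolean shouldIgnoreBinaryStats, String column) {
        return column + (shouldIgnoreBinaryStats ? ":stats-ignored" : ":stats-kept");
    }

    // Footer loop before the change: one parse per column chunk (R x C parses).
    static int readFooterPerChunk(String createdBy, int rowGroups, int columns) {
        parseCalls.set(0);
        for (int r = 0; r < rowGroups; r++) {
            for (int c = 0; c < columns; c++) {
                convertChunk(createdBy, "col" + c);
            }
        }
        return parseCalls.get();
    }

    // Footer loop after the change: parse once, pass the boolean through.
    static int readFooterHoisted(String createdBy, int rowGroups, int columns) {
        parseCalls.set(0);
        boolean ignore = shouldIgnoreStatistics(createdBy);
        for (int r = 0; r < rowGroups; r++) {
            for (int c = 0; c < columns; c++) {
                convertChunk(ignore, "col" + c);
            }
        }
        return parseCalls.get();
    }
}
```

In this sketch, both overloads return the same result for the same input, so callers of the old signature see no behavior change; only the number of parses per footer differs.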

The page-level path in `ParquetFileReader.Chunk.readAllPages()` also calls `shouldIgnoreStatistics` via `fromParquetStatisticsInternal`, but is not in scope for this fix — at read time the cost is masked by I/O, decompression, and decoding. The footer path during file open is where R×C calls happen in a tight loop with no I/O to amortize the cost.

Alternatively, caching on `createdBy` inside `shouldIgnoreStatistics` would also eliminate redundant parsing, with the advantage that parsing occurs only once per distinct version string across the entire process lifetime (rather than once per file open).

However, the caller-side fix is preferred because it avoids per-call hash/equals overhead on the hot path (R×C times per file), and keeps the optimization local to the scope where the redundancy occurs.
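For comparison, the caching alternative could look roughly like the sketch below. The class `CreatedByCacheSketch`, its methods, and the placeholder decision rule are hypothetical and not part of parquet-java; the sketch only illustrates that each call still pays for a map lookup even when the parse itself is skipped.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CreatedByCacheSketch {
    // Counts how often the stand-in parse actually runs.
    static final AtomicInteger parseCalls = new AtomicInteger();

    // Process-wide cache keyed by the createdBy string (unbounded in this sketch).
    private static final ConcurrentHashMap<String, Boolean> CACHE = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the expensive version parse; the rule is a placeholder.
    static boolean parseAndDecide(String createdBy) {
        parseCalls.incrementAndGet();
        return createdBy.isEmpty();
    }

    // Every call hashes and compares the key, but only the first call per key parses.
    static boolean shouldIgnoreCached(String createdBy) {
        // ConcurrentHashMap rejects null keys, so null is mapped to the empty string.
        String key = createdBy == null ? "" : createdBy;
        return CACHE.computeIfAbsent(key, CreatedByCacheSketch::parseAndDecide);
    }
}
```

The lookup cost is small but is paid R×C times per file, whereas the caller-side fix reduces the hot path to passing a boolean.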

**Profiling data**
`shouldIgnoreStatistics` accounts for ~65% of `fromParquetMetadata` CPU time across multiple samples.
<img width="2549" height="714" alt="Image" src="https://github.com/user-attachments/assets/1c5a6302-fabf-4be1-8a64-4151a4f430a0" />

### Component(s)

Core
