GDTF reading optimization for big files. #146
Open
VzhelevVector wants to merge 1 commit into master
Optimize GDTF File Reading & Reference Resolution
Summary
Performance optimization for reading and validating large GDTF files. All changes are internal — the public API is unchanged. Behavioral parity with the original code is preserved for all file sizes through a threshold mechanism that activates optimizations only for larger files.
What's New
Hash Map Indexes for Reference Lookups
Reference resolution (attributes, wheels, emitters, filters, connectors, color spaces, gamuts, DMX profiles, models, sub-physical units, geometries) now uses unordered_map with average-case O(1) lookups instead of linear scans. For files below the activation threshold, the original linear search is used unchanged.
A flat geometry index is built once by recursively traversing the geometry tree. This allows constant-time name-to-geometry lookups without repeated tree walks, while preserving the actual tree structure.
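The flat geometry index described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the Geometry struct, GeometryIndex alias, BuildGeometryIndex and FindGeometry are hypothetical names, and std::string stands in for TXString.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a GDTF geometry node (not the library's class).
struct Geometry {
    std::string name;
    std::vector<std::unique_ptr<Geometry>> children;
};

// Flat name -> node index. The tree itself is left untouched; the index
// only stores non-owning pointers into it.
using GeometryIndex = std::unordered_map<std::string, const Geometry*>;

// Walk the tree once, depth-first. emplace() does not overwrite an
// existing key, so the first geometry met with a given name wins, as it
// would in a depth-first linear search.
void BuildGeometryIndex(const Geometry& node, GeometryIndex& index) {
    index.emplace(node.name, &node);
    for (const auto& child : node.children) {
        BuildGeometryIndex(*child, index);
    }
}

// Average-case O(1) lookup; returns nullptr when the name is unknown.
const Geometry* FindGeometry(const GeometryIndex& index, const std::string& name) {
    auto it = index.find(name);
    return it == index.end() ? nullptr : it->second;
}
```

The index must be rebuilt if geometries are added or removed afterwards, since it holds raw pointers into the tree.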
Per-Mode Channel & Function Indexes
Each DMX mode builds its own channel and function indexes after channels are fully resolved. This replaces triple-nested loops (channels → logical channels → functions) with a single map lookup in macro resolution, relation resolution, and mode master resolution.
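As a rough sketch of the per-mode index, the nested traversal runs once at build time and later lookups become a single map query. All type and member names below are hypothetical simplifications, and the dot-separated "Channel.LogicalChannel.Function" key format is an assumption made for illustration.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for the DMX mode hierarchy.
struct ChannelFunction { std::string name; };
struct LogicalChannel  { std::string name; std::vector<ChannelFunction> functions; };
struct DmxChannel      { std::string name; std::vector<LogicalChannel> logicalChannels; };

struct DmxMode {
    std::vector<DmxChannel> channels;
    std::unordered_map<std::string, const ChannelFunction*> functionIndex;

    // Build once, after the channels are fully resolved. The stored
    // pointers stay valid only while the vectors above are not modified.
    void BuildFunctionIndex() {
        functionIndex.clear();
        for (const auto& ch : channels) {
            for (const auto& lc : ch.logicalChannels) {
                for (const auto& fn : lc.functions) {
                    // Assumed key format; first entry wins on duplicates.
                    functionIndex.emplace(ch.name + "." + lc.name + "." + fn.name, &fn);
                }
            }
        }
    }

    // One map lookup replaces the triple-nested loop at every call site.
    const ChannelFunction* FindFunction(const std::string& ref) const {
        auto it = functionIndex.find(ref);
        return it == functionIndex.end() ? nullptr : it->second;
    }
};
```

Building the index only after channel resolution matters in this sketch because any later change to the channel vectors would leave dangling pointers in the map.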
Set-Based Duplicate Detection
Duplicate geometry+attribute combination detection in logical channels now uses an unordered_set for O(n) detection instead of the original O(n²) nested loop approach. Under the threshold, the original nested loop is preserved.
Case-Insensitive Hash
A TXStringNoCaseHash struct is introduced for all hash containers. This is necessary because TXString::operator== uses case-insensitive comparison, but the default TXString::hash() is case-sensitive: using the default hash in unordered_map would break lookups where names differ only in case.
Return-by-Reference Optimization
Multiple getters that returned vectors and strings by value now return const&:
- GetChannelArray, GetLogicalChannelArray, GetDmxChannelFunctions, GetInternalGeometries, GetFeatureArray, GetSubPhysicalUnitArray, GetDmxRelations, GetDmxMacrosArray
- GetUnresolved* / getUnresolved* methods that return member fields
This eliminates thousands of unnecessary copies during reference resolution.
Mode Master Short-Circuit
ResolveDMXModeMasters now skips the function-level lookup (getDmxFunctionByRef) when the channel-level lookup has already resolved the mode master, avoiding a redundant search per resolved function.
Attribute Validation Optimization
CheckNodeAttributes now uses an unordered_set for attribute matching instead of repeated std::find + erase on a TXStringArray. Error reporting order is preserved by iterating the original XML attribute list.
NoFeature Attribute Handling
Early detection of the NoFeature attribute is performed once during index building, regardless of whether indexing is active. The getAttributeByRef function checks for NoFeature before searching, avoiding unnecessary scans.
Threshold Mechanism
Optimizations activate based on file complexity:
- kIndexThreshold
- kModeIndexThreshold
- kDuplicateCheckThreshold
Files below these thresholds use the original code paths: zero overhead, zero behavioral change. Files above get O(1) lookups with emplace() (first-wins semantics, matching the original linear scan behavior).
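To illustrate how a threshold, first-wins emplace() and a case-insensitive hash can fit together, here is a minimal sketch. It is not the PR's implementation: std::string stands in for TXString, NoCaseHash/NoCaseEqual/AttributeTable are hypothetical names, kExampleIndexThreshold is a placeholder value, and whether the real comparison is >= or > is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// ASCII-only lower-casing helper used by both the hash and the equality.
inline std::string ToLower(const std::string& s) {
    std::string out(s);
    for (char& c : out) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return out;
}

// Hash and equality must agree: names equal ignoring case hash identically.
struct NoCaseHash {
    std::size_t operator()(const std::string& s) const { return std::hash<std::string>{}(ToLower(s)); }
};
struct NoCaseEqual {
    bool operator()(const std::string& a, const std::string& b) const { return ToLower(a) == ToLower(b); }
};

struct Attribute { std::string name; };

// Placeholder; the actual threshold values are not stated in the description.
constexpr std::size_t kExampleIndexThreshold = 4;

class AttributeTable {
public:
    explicit AttributeTable(std::vector<Attribute> attrs) : attrs_(std::move(attrs)) {
        if (attrs_.size() >= kExampleIndexThreshold) {
            // emplace() keeps the first entry for a key, like a linear scan.
            for (const auto& a : attrs_) index_.emplace(a.name, &a);
            indexed_ = true;
        }
    }
    // The index stores pointers into attrs_, so copying is disabled.
    AttributeTable(const AttributeTable&) = delete;
    AttributeTable& operator=(const AttributeTable&) = delete;

    const Attribute* Find(const std::string& name) const {
        if (indexed_) {
            auto it = index_.find(name);
            return it == index_.end() ? nullptr : it->second;
        }
        // Below the threshold: plain linear scan, first match wins.
        for (const auto& a : attrs_) {
            if (NoCaseEqual{}(a.name, name)) return &a;
        }
        return nullptr;
    }
    bool IsIndexed() const { return indexed_; }

private:
    std::vector<Attribute> attrs_;
    std::unordered_map<std::string, const Attribute*, NoCaseHash, NoCaseEqual> index_;
    bool indexed_ = false;
};
```

In this sketch both paths return the same element for names that differ only in case, which is the parity property the threshold design relies on.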