refactor: Deduplicating CV code#988

Draft
benITo47 wants to merge 19 commits into main from @bo/deduplicateCvCode

Conversation

@benITo47
Contributor

Description

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Screenshots

Related issues

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

@benITo47 benITo47 force-pushed the @bo/deduplicateCvCode branch from 84f6e93 to 1953373 Compare March 20, 2026 15:24
@msluszniak msluszniak force-pushed the @bo/yoloObjectDetection branch 2 times, most recently from 2d33d1d to 7a0e899 Compare March 24, 2026 15:09
@msluszniak msluszniak changed the title from "First shot at deduplicating CV code" to "refactor: Deduplicating CV code" on Mar 24, 2026
Base automatically changed from @bo/yoloObjectDetection to main March 25, 2026 10:08
@benITo47 benITo47 force-pushed the @bo/deduplicateCvCode branch from 1953373 to 6a090e7 Compare April 17, 2026 08:39
benITo47 added 18 commits April 17, 2026 10:44
Add initNormalization, createInputTensor, loadImageToRGB, loadFrameRotated, and loadFrameRotatedWithSize helpers to eliminate duplication across vision models.
Remove duplicated preprocessing code from ImageEmbeddings, Classification, StyleTransfer, ObjectDetection, BaseInstanceSegmentation, and BaseSemanticSegmentation (~105 lines removed).
Add ensureMethodLoaded, getModelInputSize, and currentlyLoadedMethod_ to support models with multiple methods (e.g., forward_384, forward_512).
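A minimal sketch of how tracking the currently loaded method might work. The class name, the method table, and the load counter are all hypothetical; only `ensureMethodLoaded`, `getModelInputSize`, and `currentlyLoadedMethod_` are names from the commit message, and their real signatures may differ.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a model exposing several methods
// (e.g. forward_384, forward_512), each with its own input size.
class MultiMethodModelSketch {
public:
  using Size = std::pair<int, int>; // (height, width), an assumption

  explicit MultiMethodModelSketch(std::map<std::string, Size> methods)
      : methods_(std::move(methods)) {}

  // Loads `name` only if it is not the method that is already loaded.
  void ensureMethodLoaded(const std::string &name) {
    if (name == currentlyLoadedMethod_) {
      return; // already loaded, nothing to do
    }
    if (methods_.find(name) == methods_.end()) {
      throw std::invalid_argument("Unknown method: " + name);
    }
    currentlyLoadedMethod_ = name;
    ++loadCount_; // a real model would load the method here
  }

  // Looks up the input size for a method; throws std::out_of_range if absent.
  Size getModelInputSize(const std::string &name) const {
    return methods_.at(name);
  }

  int loadCount() const { return loadCount_; }

private:
  std::map<std::string, Size> methods_;
  std::string currentlyLoadedMethod_;
  int loadCount_ = 0;
};
```

Calling `ensureMethodLoaded("forward_384")` twice in a row performs one load in this sketch, which is the point of caching the method name.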
…sing

Add prepareAllowedClasses and validateThreshold to Processing.{h,cpp} for reuse across detection models.
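As an illustration only, the two helpers could look roughly like this. The signatures, the `[0, 1]` range, the "empty set means no filtering" rule, and the `isClassAllowed` helper are assumptions, not taken from Processing.{h,cpp}.

```cpp
#include <cstdint>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Assumed behavior: reject thresholds outside [0, 1]. NaN fails the check too.
inline void validateThreshold(double threshold, const std::string &name) {
  if (!(threshold >= 0.0 && threshold <= 1.0)) {
    throw std::invalid_argument(name + " must be in [0, 1], got " +
                                std::to_string(threshold));
  }
}

// Assumed behavior: collapse a list of class indices into a set for lookup.
inline std::set<std::int32_t>
prepareAllowedClasses(const std::vector<std::int32_t> &classIndices) {
  return std::set<std::int32_t>(classIndices.begin(), classIndices.end());
}

// Hypothetical companion: an empty set is treated as "allow every class".
inline bool isClassAllowed(const std::set<std::int32_t> &allowed,
                           std::int32_t label) {
  return allowed.empty() || allowed.count(label) > 0;
}
```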
Simplify calculateModelImageSize to use BaseModel's getModelInputSize helper (~7 lines removed).
Centralize input shape validation logic across all models. Replaces duplicated validation code in 7 models (~84 lines removed).
Add documentation explaining when subclasses should:
- Call initNormalization() for models expecting ImageNet preprocessing
- Skip it for models with built-in normalization or raw input
The documentation also notes that createInputTensor() safely handles both cases via std::optional.
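The opt-in normalization described above can be sketched as follows. This is not the PR's code: the class, the `Normalization` struct, the HWC-to-CHW layout, and the scaling to [0, 1] are assumptions. The mean and standard deviation are the commonly used ImageNet values.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical per-channel normalization parameters.
struct Normalization {
  std::array<float, 3> mean;
  std::array<float, 3> std;
};

// Hypothetical base class: subclasses opt in to normalization explicitly.
class VisionModelSketch {
public:
  // Called only by models that expect ImageNet-style preprocessing.
  void initNormalization() {
    normalization_ = Normalization{{0.485f, 0.456f, 0.406f},
                                   {0.229f, 0.224f, 0.225f}};
  }

  // Converts interleaved RGB bytes (HWC) to planar floats (CHW) in [0, 1],
  // then normalizes only if initNormalization() was called.
  std::vector<float> createInputTensor(const std::vector<std::uint8_t> &rgb,
                                       std::size_t pixelCount) const {
    if (rgb.size() != pixelCount * 3) {
      throw std::invalid_argument("Expected 3 bytes per pixel");
    }
    std::vector<float> out(pixelCount * 3);
    for (std::size_t p = 0; p < pixelCount; ++p) {
      for (std::size_t c = 0; c < 3; ++c) {
        float v = static_cast<float>(rgb[p * 3 + c]) / 255.0f;
        if (normalization_) {
          v = (v - normalization_->mean[c]) / normalization_->std[c];
        }
        out[c * pixelCount + p] = v;
      }
    }
    return out;
  }

private:
  std::optional<Normalization> normalization_; // empty means "raw input"
};
```

Because the empty `std::optional` is the default, a subclass that never calls `initNormalization()` gets unnormalized input without any extra flag.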
Create header-only utility for converting tensors to std::span.

Provides:
- toSpan<T>(Tensor&): Convert tensor to typed span
- toSpan<T>(EValue&): Extract tensor from EValue then convert

Eliminates manual pointer arithmetic and improves type safety.
Replaces 8+ manual span constructions across vision models.
Add utility function to extract bbox, score, and label from detection
model tensor outputs.

Replaces private extractDetectionData() method in BaseInstanceSegmentation.
Provides reusable data extraction for detection models.
Add template function to apply inverse rotation to bboxes in containers.

Provides convenience helper for batch operations on detection/segmentation
results, eliminating manual loops in ObjectDetection and InstanceSegmentation.
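A sketch of what such a batch helper could look like. The `Rotation` enum, the per-box function, and the convention that width and height describe the rotated frame are assumptions made for this example; only the name `inverseRotateBboxes` comes from the commit message.

```cpp
#include <stdexcept>
#include <vector>

struct BBox {
  float x1, y1, x2, y2;
};

struct Detection {
  BBox bbox;
  float score;
  int label;
};

// Clockwise rotation that was applied to the frame before inference.
enum class Rotation { None, Cw90, Cw180, Cw270 };

// Maps a box from the rotated frame (size w x h) back to the original frame.
inline BBox inverseRotateBbox(const BBox &b, Rotation applied, float w,
                              float h) {
  switch (applied) {
  case Rotation::None:
    return b;
  case Rotation::Cw90: // (x', y') -> (y', w - x')
    return {b.y1, w - b.x2, b.y2, w - b.x1};
  case Rotation::Cw180: // (x', y') -> (w - x', h - y')
    return {w - b.x2, h - b.y2, w - b.x1, h - b.y1};
  case Rotation::Cw270: // (x', y') -> (h - y', x')
    return {h - b.y2, b.x1, h - b.y1, b.x2};
  }
  throw std::invalid_argument("Unknown rotation");
}

// Works on any container whose elements expose a `bbox` member;
// every other field of each element is left untouched.
template <typename Container>
void inverseRotateBboxes(Container &items, Rotation applied, float w,
                         float h) {
  for (auto &item : items) {
    item.bbox = inverseRotateBbox(item.bbox, applied, w, h);
  }
}
```

Templating on the container rather than on a fixed result type is what would let detection and segmentation results share one helper.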
Add convenience methods to reduce error-checking boilerplate:
- forwardOrThrow(EValue): Execute forward with single input
- forwardOrThrow(vector<EValue>): Execute forward with multiple inputs
- executeOrThrow(string, vector<EValue>): Execute named method

All methods throw RnExecutorchError on failure with customizable messages.
Replaces 4+ manual error-checking patterns across vision models.
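The boilerplate these helpers remove can be sketched with stand-in types. `ResultSketch`, `ErrorSketch`, `ModelSketch`, and the error code are hypothetical; the actual helpers wrap the runtime's own result type and throw RnExecutorchError.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library's error type.
struct ErrorSketch : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for a value-or-error result.
template <typename T> struct ResultSketch {
  std::optional<T> value;
  int errorCode = 0;
  bool ok() const { return value.has_value(); }
};

// The check that used to be repeated at every call site.
template <typename T>
T unwrapOrThrow(ResultSketch<T> result, const std::string &message) {
  if (!result.ok()) {
    throw ErrorSketch(message + " (code " + std::to_string(result.errorCode) +
                      ")");
  }
  return std::move(*result.value);
}

class ModelSketch {
public:
  // Runs forward and throws with a caller-supplied message on failure.
  std::vector<float>
  forwardOrThrow(const std::vector<float> &input,
                 const std::string &message = "Forward pass failed") {
    return unwrapOrThrow(forward(input), message);
  }

private:
  // Toy forward pass: fails on empty input, otherwise echoes the input.
  ResultSketch<std::vector<float>> forward(const std::vector<float> &input) {
    if (input.empty()) {
      return {std::nullopt, 1};
    }
    return {input, 0};
  }
};
```

A call site then shrinks to a single line such as `auto out = model.forwardOrThrow(input, "Classification failed");`.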
Replace manual tensor-to-span conversion with utils::tensor::toSpan.
Replace forward error checking with forwardOrThrow helper.

Simplifies code and improves consistency with new utility patterns.
Replace manual tensor-to-span conversion with utils::tensor::toSpan.
Replace forward error checking with forwardOrThrow helper.

Simplifies code and improves consistency with new utility patterns.
Replace 3 manual tensor-to-span conversions with utils::tensor::toSpan.
Replace execute error checking with executeOrThrow helper.
Replace rotation loop with utils::inverseRotateBboxes batch helper.

Simplifies code and improves consistency with new utility patterns.
Replace execute error checking with executeOrThrow helper.
Replace private extractDetectionData with utils::computer_vision version.
Delete duplicate extractDetectionData method (now in shared utils).
Replace bbox rotation loop with utils::inverseRotateBboxes batch helper.

Mask rotation remains inline as it's instance-specific logic.
Replace forward error checking with forwardOrThrow helper.

Simplifies code and improves consistency with new utility patterns.
Replace forward error checking with forwardOrThrow helper.

Simplifies code and improves consistency with new utility patterns.
@benITo47 benITo47 force-pushed the @bo/deduplicateCvCode branch 2 times, most recently from c7703fe to 5b1a6fa Compare April 17, 2026 09:06
Add comprehensive unit tests for refactoring utilities:

- TensorHelpersTest: Test toSpan<T> for Tensor and EValue conversions
  * Float and int32 tensors
  * Multidimensional tensors
  * Empty tensors
  * Type safety and const correctness

- ComputerVisionProcessingTest: Test extractDetectionData
  * Single and multiple detections
  * Various indices and label formats
  * Edge cases (negative coords, fractional values)

- FrameTransformTest: Test inverseRotateBboxes batch helper
  * Batch rotation of multiple detections
  * Empty containers and single detection
  * Preservation of non-bbox fields

Updated CMakeLists.txt to register new test executables.
@benITo47 benITo47 force-pushed the @bo/deduplicateCvCode branch from 5b1a6fa to 10e089c Compare April 17, 2026 09:32

2 participants