Skip to content

Fix reduce_add functions#3846

Open
zephyr111 wants to merge 6 commits into
ispc:mainfrom
zephyr111:fix-3834
Open

Fix reduce_add functions#3846
zephyr111 wants to merge 6 commits into
ispc:mainfrom
zephyr111:fix-3834

Conversation

@zephyr111
Copy link
Copy Markdown
Collaborator

@zephyr111 zephyr111 commented May 23, 2026

Description

This PR fixes #3834 . More specifically it:

  • Fix reduce_add(int8) for signed (negative) values;
  • Implement unsigned versions;
  • Change the output type so it is now the same than the input type (no widening) -- most target was not computing this correctly anyway;
  • Keep using psadbw in __reduce_add_uint8 though the output type is now uint8 instead of int16 for sake of consistency with other functions.
  • Fix the tests and add new ones, especially to catch the issue in __reduce_add_int8 and check the unsigned versions.

I think it would be great to add more stdlib functions with widened output type later. Indeed, it would be more efficient on x86 CPU for uint8 when the user want a int16/uint16 result (no uint8 wrap-around). On x86, I think we could also use the instruction pmaddwd for computing reduce_add(int16) -> int32 (AFAIK not yet done previously). AFAIK, RISC and SVE (SVE2?) targets could also provide better implementations than naive ones (though SVE is not supported yet and RISC is certainly used by very few people). It is not critical since SIMD reductions are known to be a bit expensive and they rarely are on a hot path anyway.

Please note that I also reorganized the order of some function for sake of clarity. I also added a missing undef (minor).

PS: updating the IR of the 19 targets is pretty tedious and bug prone.

Related Issue

  • Linked to relevant issue(s)

Checklist

  • Code has been formatted with clang-format (e.g., clang-format -i src/ispc.cpp)
  • Git history has been squashed to meaningful commits (one commit per logical change)
  • Compiler changes are covered by lit tests
  • Language/stdlib changes include new functional tests for runtime behavior
  • Documentation updated if needed

@zephyr111 zephyr111 marked this pull request as ready for review May 23, 2026 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Several discrepancies in the implementation of reduction functions

1 participant