Parallelize the exhaustive midpoint test across hardware threads (~75x faster)#383

Open
fcostaoliveira wants to merge 1 commit into fastfloat:main from redis-performance:pr/parallel-exhaustive

@fcostaoliveira
Contributor

exhaustive32_midpoint sweeps all 2³² float bit-patterns one at a time, which takes
~30 min on a fast machine. The values are independent, so this splits the range across
std::thread::hardware_concurrency() threads, with an atomic flag for fail-fast. The
per-value work is factored into a check_word() helper; the checks and pass/fail
behavior are unchanged.
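
A minimal C++11 sketch of the splitting scheme described above, not the code from this PR. The names `check_word_sketch` and `parallel_sweep`, the `bad_word` parameter, and the contiguous-chunk layout are assumptions made for illustration; the real `check_word()` performs the midpoint checks, while the stand-in here only rejects one chosen bit pattern so the fail-fast path can be exercised.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the PR's check_word(): passes every word except
// bad_word. The real helper would parse and compare values instead.
inline bool check_word_sketch(uint32_t w, uint64_t bad_word) {
  return uint64_t(w) != bad_word;
}

// Sweeps [0, total) in contiguous chunks, one chunk per thread.
// Returns true only if every value passes.
inline bool parallel_sweep(
    uint64_t total, uint64_t bad_word,
    unsigned nthreads = std::thread::hardware_concurrency()) {
  if (nthreads == 0) {
    nthreads = 1; // hardware_concurrency() may return 0 when unknown
  }
  if (total < nthreads) {
    nthreads = total ? unsigned(total) : 1;
  }

  std::atomic<bool> failed(false);                           // fail-fast flag
  const uint64_t chunk = (total + nthreads - 1) / nthreads;  // ceiling division

  std::vector<std::thread> workers;
  for (unsigned t = 0; t < nthreads; ++t) {
    const uint64_t begin = std::min(total, uint64_t(t) * chunk);
    const uint64_t end = std::min(total, begin + chunk);
    workers.emplace_back([begin, end, bad_word, &failed]() {
      // 64-bit loop counter so a full 2^32 range does not wrap around.
      for (uint64_t w = begin; w < end; ++w) {
        if (failed.load(std::memory_order_relaxed)) {
          return; // another thread already found a failure
        }
        if (!check_word_sketch(uint32_t(w), bad_word)) {
          failed.store(true, std::memory_order_relaxed);
          return;
        }
      }
    });
  }
  for (auto &th : workers) {
    th.join(); // join() makes the workers' stores visible to this thread
  }
  return !failed.load();
}
```

Calling `parallel_sweep(uint64_t(1) << 32, bad_word)` would cover every 32-bit pattern. Relaxed ordering is enough for the flag in this sketch because it carries no data besides itself, and the final read happens after every thread has been joined.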

On a 96-core machine the runtime drops from ~1900 s to ~25 s (≈75×), with an identical result:

gcc 13  : PASS in 26s
clang 18: PASS in 22s

Notes:

  • The exhaustive tests are gated behind FASTFLOAT_EXHAUSTIVE (off by default and not
    built in CI), so this has no CI impact; it is purely a faster local/dev sweep.
  • tests/CMakeLists.txt now links Threads::Threads (with
    THREADS_PREFER_PTHREAD_FLAG).
  • C++11, compiles clean under the existing -Werror -Wall -Wextra -Weffc++ -Wconversion -Wsign-conversion -Wshadow set on gcc and clang; clang-format clean.

The same pattern applies to the sibling sweeps (exhaustive32, exhaustive32_64). I
kept this PR to one test to ease review and am happy to extend it if you'd like.
