File: src/components/fundable/descriptions/Float16SupportInXsimd.md
# FP16 Support in xsimd

#### Overview

Xsimd is a C++ header-only library that abstracts SIMD (vectorization) intrinsics behind a single, generic API.
The same code — `xsimd::batch<float>` — compiles to optimal machine code on x86 SSE/AVX, ARM NEON/SVE, RISC-V, and WebAssembly, with no runtime overhead.
When an intrinsic is missing on a given target, xsimd falls back gracefully rather than failing or leaving the developer to write platform-specific branches.
This is why projects like Mozilla Firefox, Apache Arrow, Meta Velox, KDE Krita, and Pythran have adopted it as their vectorization layer.

xsimd currently has no FP16 support, forcing its users to drop out of the generic…

We propose to add vectorized FP16 support to xsimd — native FP16 operations where hardware supports them, and correct fallbacks elsewhere.

#### Why FP16 Matters

**Memory bandwidth is a bottleneck.** Modern CPUs and GPUs are not compute-bound — they are memory-bandwidth-bound.
FP16 cuts data size in half versus FP32.

FP16 conversion and arithmetic are now widely available across all major SIMD families.
This affects NEON operations on modern smartphones and on all Apple silicon M-series chips.
Coverage extends to the server side as well, with both SVE and SVE2 supporting FP16.

#### Proposed Work

This proposal covers foundational FP16 support: native FP16 operations on platforms that provide hardware acceleration, and correct, efficient fallbacks everywhere else.

Concretely, this means:

- Support for converting from and to `batch<float>`, mapping to the optimal hardware instruction where available, and a correct SIMD algorithm elsewhere.
- Native FP16 arithmetic operations — add, multiply, FMA, min, max, and comparison — on backends that provide hardware support, with FP32-based fallbacks on those that do not.

#### Impact

Funding this development will directly open xsimd to the rapidly growing landscape of LLM and machine
learning workflows: local inference engines, model weight processing, and embedding pipelines.