Autogenerate core_arch modules #196
bc24ec6 to
a693e8d
Apologies for the churn; this should actually be ready for review now. The NEON intrinsics in stdarch are a mess to sort through.
Perhaps we should discuss (either Zulip or next office hours) what to do with the
Shnatsel
left a comment
A bunch of testing improvements have been merged into main recently. Merging main into this branch and seeing CI pass would make me much more confident in the correctness of this PR.
/// Generate the constructor for this architecture.
///
/// x86 and aarch64 require runtime feature detection, so the constructor is unsafe.
/// wasm32 has static feature detection, so the constructor is safe.
AArch64 is guaranteed to have NEON, and fearless_simd currently has no AArch64 levels other than baseline NEON. So this could actually be safe for now; making it unsafe is just future-proofing.
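To make the trade-off concrete, here is a minimal sketch of the constructor pattern under discussion. The `SimdToken` type and the `baseline_detected` helper are hypothetical names chosen for illustration, not the actual fearless_simd API: the unchecked constructor is `unsafe` because the caller vouches for the CPU feature, and a safe constructor wraps it behind runtime detection.

```rust
/// Hypothetical proof token: holding a value means the baseline SIMD
/// feature for the current architecture may be used.
#[derive(Clone, Copy, Debug)]
pub struct SimdToken {
    _private: (),
}

impl SimdToken {
    /// # Safety
    /// The caller must guarantee that the required CPU feature is available.
    pub unsafe fn new_unchecked() -> Self {
        Self { _private: () }
    }

    /// Safe constructor: returns a token only if runtime detection succeeds.
    pub fn try_new() -> Option<Self> {
        if baseline_detected() {
            // SAFETY: the feature was detected on the line above.
            Some(unsafe { Self::new_unchecked() })
        } else {
            None
        }
    }
}

// Runtime detection for the architectures mentioned in the doc comment.
#[cfg(target_arch = "x86_64")]
fn baseline_detected() -> bool {
    std::arch::is_x86_feature_detected!("sse2")
}

#[cfg(target_arch = "aarch64")]
fn baseline_detected() -> bool {
    std::arch::is_aarch64_feature_detected!("neon")
}

// Assumption for this sketch only: other targets report no support.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
fn baseline_detected() -> bool {
    false
}
```

Keeping the unchecked constructor `unsafe` on every runtime-detected architecture means that adding a non-baseline AArch64 level later would not change the signature.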
I don’t think any of the core_arch stuff is tested in CI though, no?
Only through high-level operations, I believe. So only a small subset.
Aren't we using the intrinsics directly? I don't think we are running anything from
Ah, never mind then, I must have been mistaken.
if target_features.is_empty()
    || !target_features
        .iter()
        .any(|feature| feature == &self.module_feature)
{
    return;
}
This accepts any intrinsic whose target_feature list merely contains the module feature. That pulls extra-feature AArch64 intrinsics into the plain Neon token: for example neon.rs:291 exposes safe vbcaxq_* wrappers even though rustc reports they require neon,sha3, and neon.rs:2537 includes p64 wrappers requiring aes. A Neon proof is not enough to call these safely, so this breaks the safety model. cargo check --target aarch64-unknown-linux-gnu --all-features also reports 137 mismatched target-feature warnings from this. The parser needs to reject intrinsics with unsupported extra features or generate separate feature tokens/modules.
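One possible shape for the stricter check, as a sketch only: the function name `accepts_intrinsic` and the `allowed` set are hypothetical and do not come from the PR. It assumes each intrinsic's `target_feature` string has already been split into individual feature names, and it rejects any intrinsic that needs a feature the module's token does not prove.

```rust
use std::collections::BTreeSet;

/// Returns true only if the intrinsic declares at least one target feature,
/// requires the module feature, and requires nothing outside `allowed`.
pub fn accepts_intrinsic(
    target_features: &[&str],
    module_feature: &str,
    allowed: &BTreeSet<&str>,
) -> bool {
    !target_features.is_empty()
        && target_features.contains(&module_feature)
        // Every required feature must be covered by the token's proof.
        && target_features.iter().all(|feature| allowed.contains(feature))
}
```

With `allowed` containing only `neon`, an intrinsic requiring `neon,sha3` or `neon,aes` is rejected, while a plain `neon` intrinsic is kept. Generating separate tokens or modules for the extra features would instead amount to calling this once per feature set.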
This also drops public intrinsics with no #[target_feature]. That removes existing public wrappers such as Sse2::_mm_pause, which stdarch intentionally defines without a target feature because pause is safe as a nop on older CPUs. This is a public API regression from the old core_arch, but may be acceptable depending on the exact list of intrinsics.
I'm not sure where the core_arch modules originally came from. Raph's comment says he copied them from Pulp, which doesn't seem to document how or where they were generated.
This PR replaces them with new autogenerated versions, created by parsing the stdarch crate. I've added it as a submodule under fearless_simd_gen.
I've chosen to make core_arch generation a separate subcommand in the fearless_simd_gen binary; I expect it to be run much less often, and want to allow people to regenerate the rest of the code without having to clone the (relatively heavy) stdarch repo.
I'm not sure how much utility core_arch itself provides right now. The main reason for doing this is that I've embarked on a bit of a yak-shaving expedition: