WIP: Add NEON simulator and statespace headers #1050
xxie24 wants to merge 1 commit into quantumlib:main
Code Review
This pull request introduces NEON vectorization support for the quantum circuit simulator, including a new SimulatorNEON class and a corresponding StateSpaceNEON implementation. The feedback identifies a critical type safety issue where a template parameter could lead to compilation errors and suggests using existing vectorized methods for state norm calculations to improve performance.
/**
 * Quantum circuit simulator with NEON vectorization.
 */
template <typename For, typename FP = float>
The FP template parameter in SimulatorNEON is misleading and potentially harmful. StateSpaceNEON is hardcoded to use float (line 36), and the NEON kernels are specifically implemented for single-precision arithmetic. If SimulatorNEON is instantiated with FP = double, it will cause compilation errors in fallback methods (e.g., line 815) because of a type mismatch between the float* matrix and the double fallback simulator. Since this class is strictly for float precision, the FP parameter should be removed.
- template <typename For, typename FP = float>
+ template <typename For>
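To make the mismatch concrete, here is a minimal sketch using hypothetical stand-in types (`BasicSim` and `NeonSimSketch` are illustrative names, not qsim classes). It assumes the fallback takes a matrix pointer whose element type must match its own precision, which is the situation the comment above describes.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a generic scalar simulator; not a qsim class.
template <typename FP>
struct BasicSim {
  using fp_type = FP;
  // The matrix element type must match the simulator's precision.
  FP FirstEntry(const FP* matrix) const { return matrix[0]; }
};

// Single-precision-only wrapper with no FP parameter, so the fallback's
// precision cannot drift away from the float-only kernels.
struct NeonSimSketch {
  using fp_type = float;

  float FirstEntry(const float* matrix) const {
    // With a BasicSim<double> fallback this call would not compile:
    // a const float* does not convert to a const double*.
    return fallback_.FirstEntry(matrix);
  }

  BasicSim<float> fallback_;
};
```

Fixing the precision in the class itself turns a latent instantiation error into something that cannot be expressed at all.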
using BasicStateSpace = StateSpaceBasic<For, FP>;
using BasicState = typename BasicStateSpace::State;
Following the removal of the FP template parameter, these types should be explicitly defined using float to maintain consistency with StateSpaceNEON.
- using BasicStateSpace = StateSpaceBasic<For, FP>;
- using BasicState = typename BasicStateSpace::State;
+ using BasicStateSpace = StateSpaceBasic<For, float>;
+ using BasicState = typename BasicStateSpace::State;
For for_;
BasicStateSpace basic_state_space_;
SimulatorBasic<For, FP> fallback_;

double norm = 0;
uint64_t size = MinSize(state.num_qubits()) / 8;
const fp_type* p = state.get();

for (uint64_t k = 0; k < size; ++k) {
  for (unsigned j = 0; j < 4; ++j) {
    double re = p[8 * k + j];
    double im = p[8 * k + 4 + j];
    norm += re * re + im * im;
  }
}
This scalar loop for calculating the state norm is redundant and inefficient. StateSpaceNEON already provides a vectorized Norm() method (inherited from StateSpace and implemented via PartialNorms) which should be used instead to improve performance, especially for large state vectors.
- double norm = 0;
- uint64_t size = MinSize(state.num_qubits()) / 8;
- const fp_type* p = state.get();
- for (uint64_t k = 0; k < size; ++k) {
-   for (unsigned j = 0; j < 4; ++j) {
-     double re = p[8 * k + j];
-     double im = p[8 * k + 4 + j];
-     norm += re * re + im * im;
-   }
- }
+ double norm = this->Norm(state);
+ uint64_t size = MinSize(state.num_qubits()) / 8;
+ const fp_type* p = state.get();
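For reference, the following is a rough sketch of what a vectorized norm over this layout could look like. It is not the `Norm()` or `PartialNorms` code from the PR: `NormSketch` is a hypothetical helper, and it assumes the layout implied by the scalar loop (blocks of 8 floats, 4 real parts followed by 4 imaginary parts). A scalar fallback keeps it compilable on non-ARM targets.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper, not part of qsim: sums |amplitude|^2 over a buffer laid
// out in blocks of 8 floats (4 real parts, then 4 imaginary parts).
inline double NormSketch(const float* p, uint64_t num_blocks) {
  double norm = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  for (uint64_t k = 0; k < num_blocks; ++k) {
    float32x4_t re = vld1q_f32(p + 8 * k);
    float32x4_t im = vld1q_f32(p + 8 * k + 4);
    // s = re * re + im * im, computed lane-wise in single precision.
    float32x4_t s = vmlaq_f32(vmulq_f32(re, re), im, im);
    // Widen each lane so the running total is accumulated in double.
    norm += static_cast<double>(vgetq_lane_f32(s, 0)) +
            static_cast<double>(vgetq_lane_f32(s, 1)) +
            static_cast<double>(vgetq_lane_f32(s, 2)) +
            static_cast<double>(vgetq_lane_f32(s, 3));
  }
#else
  // Scalar fallback with the same block layout.
  for (uint64_t k = 0; k < num_blocks; ++k) {
    for (unsigned j = 0; j < 4; ++j) {
      double re = p[8 * k + j];
      double im = p[8 * k + 4 + j];
      norm += re * re + im * im;
    }
  }
#endif
  return norm;
}
```

Note that the NEON path squares in single precision before widening, whereas the scalar loop squares in double, so results can differ slightly for general inputs.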
01b523d to 7b7e2e8
This PR adds initial ARM NEON integration by introducing:
- lib/simulator_neon.h
- lib/statespace_neon.h
- lib/simmux.h dispatch updates for __ARM_NEON__

Notes:
__ARM_NEON__.
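
As a rough illustration of compile-time dispatch on `__ARM_NEON__`, a sketch along these lines may help. The actual contents of `lib/simmux.h` are not shown in this thread, so the struct and alias names below are hypothetical stand-ins rather than the PR's code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical kernel tags for illustration only; not qsim classes.
struct KernelsBasicSketch {
  static const char* Name() { return "basic"; }
};

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
struct KernelsNeonSketch {
  static const char* Name() { return "neon"; }
};
// Chosen when the compiler targets ARM with NEON enabled.
using SelectedKernels = KernelsNeonSketch;
#else
// Portable default when no NEON support is advertised.
using SelectedKernels = KernelsBasicSketch;
#endif
```

Callers refer only to `SelectedKernels`, so the choice of backend stays confined to one header.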