WIP: Add NEON simulator and statespace headers #1050

Open
xxie24 wants to merge 1 commit into quantumlib:main from xxie24:add-neon

@xxie24 xxie24 commented Apr 20, 2026

This PR adds initial ARM NEON integration by introducing:

  • lib/simulator_neon.h
  • lib/statespace_neon.h
  • lib/simmux.h dispatch updates for __ARM_NEON__

Notes:

  • The NEON headers are ARM-NEON-only and use __ARM_NEON__.
  • Naming and public structure were aligned with the SSE counterparts.

@github-actions github-actions Bot added the size: XL lines changed >1000 label Apr 20, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces NEON vectorization support for the quantum circuit simulator, including a new SimulatorNEON class and a corresponding StateSpaceNEON implementation. The feedback identifies a critical type safety issue where a template parameter could lead to compilation errors and suggests using existing vectorized methods for state norm calculations to improve performance.

Comment thread lib/simulator_neon.h Outdated
/**
* Quantum circuit simulator with NEON vectorization.
*/
template <typename For, typename FP = float>

high

The FP template parameter in SimulatorNEON is misleading and potentially harmful. StateSpaceNEON is hardcoded to use float (line 36), and the NEON kernels are specifically implemented for single-precision arithmetic. If SimulatorNEON is instantiated with FP = double, it will cause compilation errors in fallback methods (e.g., line 815) because of a type mismatch between the float* matrix and the double fallback simulator. Since this class is strictly for float precision, the FP parameter should be removed.

Suggested change
- template <typename For, typename FP = float>
+ template <typename For>

Comment thread lib/simulator_neon.h Outdated
Comment on lines +39 to +40
using BasicStateSpace = StateSpaceBasic<For, FP>;
using BasicState = typename BasicStateSpace::State;

high

Following the removal of the FP template parameter, these types should be explicitly defined using float to maintain consistency with StateSpaceNEON.

Suggested change
- using BasicStateSpace = StateSpaceBasic<For, FP>;
- using BasicState = typename BasicStateSpace::State;
+ using BasicStateSpace = StateSpaceBasic<For, float>;
+ using BasicState = typename BasicStateSpace::State;

Comment thread lib/simulator_neon.h Outdated

For for_;
BasicStateSpace basic_state_space_;
SimulatorBasic<For, FP> fallback_;

high

The fallback simulator should use float to match the precision of the NEON state space.

Suggested change
- SimulatorBasic<For, FP> fallback_;
+ SimulatorBasic<For, float> fallback_;
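
Taken together, the three suggestions on lib/simulator_neon.h amount to pinning every precision-dependent type to float. The following is a minimal sketch of that shape, not the PR's code: the bodies of `StateSpaceBasic`, `SimulatorBasic`, and `StateSpaceNEON`, the `SequentialFor` policy, and the `ApplyGateFallback` method are all hypothetical stand-ins that exist only to make the type relationships checkable.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

struct SequentialFor {};  // hypothetical placeholder for the For policy

// Hypothetical stand-ins for the scalar classes, templated on precision.
template <typename For, typename FP>
struct StateSpaceBasic {
  using fp_type = FP;
  struct State {};
};

template <typename For, typename FP>
struct SimulatorBasic {
  using fp_type = FP;
  // Records the first matrix element so the delegation is observable.
  void ApplyGate(const fp_type* matrix) { last = matrix[0]; }
  fp_type last = 0;
};

// Hypothetical stand-in: the NEON state space is float-only.
template <typename For>
struct StateSpaceNEON {
  using fp_type = float;
};

// No FP parameter: every dependent type is spelled with float.
template <typename For>
class SimulatorNEON {
 public:
  using StateSpace = StateSpaceNEON<For>;
  using fp_type = typename StateSpace::fp_type;
  using BasicStateSpace = StateSpaceBasic<For, float>;
  using BasicState = typename BasicStateSpace::State;

  // Hands a float matrix to the scalar fallback. This compiles because the
  // fallback's fp_type is float too; with SimulatorBasic<For, double> the
  // const float* argument would not convert to const double*.
  fp_type ApplyGateFallback(const fp_type* matrix) {
    assert(matrix != nullptr);
    fallback_.ApplyGate(matrix);
    return fallback_.last;
  }

 private:
  SimulatorBasic<For, float> fallback_;
};

static_assert(
    std::is_same<SimulatorNEON<SequentialFor>::fp_type, float>::value,
    "the NEON simulator and its fallback must agree on float");

}  // namespace sketch
```

With the parameter gone, a mismatched instantiation can no longer be written, so the error the review describes is ruled out at the declaration rather than surfacing inside a fallback method.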

Comment thread lib/statespace_neon.h Outdated
Comment on lines +320 to +330
double norm = 0;
uint64_t size = MinSize(state.num_qubits()) / 8;
const fp_type* p = state.get();

for (uint64_t k = 0; k < size; ++k) {
  for (unsigned j = 0; j < 4; ++j) {
    double re = p[8 * k + j];
    double im = p[8 * k + 4 + j];
    norm += re * re + im * im;
  }
}

high

This scalar loop for calculating the state norm is redundant and inefficient. StateSpaceNEON already provides a vectorized Norm() method (inherited from StateSpace and implemented via PartialNorms) which should be used instead to improve performance, especially for large state vectors.

Suggested change
- double norm = 0;
- uint64_t size = MinSize(state.num_qubits()) / 8;
- const fp_type* p = state.get();
- for (uint64_t k = 0; k < size; ++k) {
-   for (unsigned j = 0; j < 4; ++j) {
-     double re = p[8 * k + j];
-     double im = p[8 * k + 4 + j];
-     norm += re * re + im * im;
-   }
- }
+ double norm = this->Norm(state);
+ uint64_t size = MinSize(state.num_qubits()) / 8;
+ const fp_type* p = state.get();
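
For readers unfamiliar with the memory layout, here is a minimal sketch of what a vectorized norm over the same layout could look like, assuming blocks of 8 floats with 4 real parts followed by 4 imaginary parts, as the indexing in the reviewed loop implies. It is not the PR's `Norm()` or `PartialNorms` implementation; `NormScalar` and `NormVectorized` are hypothetical names, and the NEON path is compiled only when the compiler defines `__ARM_NEON` or `__ARM_NEON__`, with a scalar path used otherwise.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#endif

// Scalar reference with the same indexing as the reviewed loop:
// block k holds re[0..3] at p[8k..8k+3] and im[0..3] at p[8k+4..8k+7].
inline double NormScalar(const float* p, uint64_t num_blocks) {
  assert(p != nullptr);
  double norm = 0;
  for (uint64_t k = 0; k < num_blocks; ++k) {
    for (unsigned j = 0; j < 4; ++j) {
      double re = p[8 * k + j];
      double im = p[8 * k + 4 + j];
      norm += re * re + im * im;
    }
  }
  return norm;
}

// Vectorized variant: one 4-lane load each for the real and imaginary parts.
inline double NormVectorized(const float* p, uint64_t num_blocks) {
#ifdef SKETCH_HAVE_NEON
  assert(p != nullptr);
  double norm = 0;
  for (uint64_t k = 0; k < num_blocks; ++k) {
    float32x4_t re = vld1q_f32(p + 8 * k);
    float32x4_t im = vld1q_f32(p + 8 * k + 4);
    // sq = re * re + im * im, lane by lane, in single precision.
    float32x4_t sq = vmlaq_f32(vmulq_f32(re, re), im, im);
    // Horizontal sum of the 4 lanes, then accumulate in double.
    float32x2_t half = vadd_f32(vget_low_f32(sq), vget_high_f32(sq));
    half = vpadd_f32(half, half);
    norm += vget_lane_f32(half, 0);
  }
  return norm;
#else
  // No NEON available in this build: fall back to the scalar reference.
  return NormScalar(p, num_blocks);
#endif
}
```

One design point worth noting: this sketch squares and sums each block in single precision before widening to double, whereas the scalar loop widens each element first, so the two can differ in the last bits for general inputs.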

@xxie24 xxie24 force-pushed the add-neon branch 3 times, most recently from 01b523d to 7b7e2e8 April 20, 2026 21:58
@xxie24 xxie24 changed the title from "Add NEON simulator and statespace headers" to "WIP: Add NEON simulator and statespace headers" Apr 20, 2026