Skip to content

502 ippl does consistently consider striding in indexes#503

Open
aaadelmann wants to merge 14 commits into
masterfrom
502-ippl-does-consistently-consider-striding-in-indexes
Open

502 ippl does consistently consider striding in indexes#503
aaadelmann wants to merge 14 commits into
masterfrom
502-ippl-does-consistently-consider-striding-in-indexes

Conversation

@aaadelmann
Copy link
Copy Markdown
Member

@aaadelmann aaadelmann commented May 12, 2026

Summary

This PR adds a subset of old-IPPL field indexing behavior to the Kokkos-based implementation:

  • strided Index / NDIndex access for BareField
  • rectangular BareField[NDIndex] and chained BareField[Index]... subsets
  • sparse BareField[SIndex] subsets
  • sparse predicate construction from field expressions
  • sparse subset predicate construction
  • sparse offset access through SOffset
  • SIndex<Dim> as a sparse set of selected field points.

The goal is to make the common strided and sparse BareField workflows work
without attempting to port the full old SubField / SubParticleAttrib machinery in this PR.

The main goal is that Index, NDIndex, and sparse SIndex field operations consistently respect
strides and can be used naturally in field expressions.

User-Facing Field Indexing

Dense subdomains can now be assigned directly:

ippl::NDIndex<Dim> subdomain = field.getDomain();
subdomain[0] = ippl::Index(subdomain[0].first(),
                           subdomain[0].last(),
                           2 * subdomain[0].stride());

field[subdomain] = 7.0;
lhs[subdomain] = rhs[subdomain] + 2.0;

Chained indexing is also supported:

field[I][J][K] = 7.0;
lhs[I][J][K] = rhs[I][J][K] + 2.0;

The indexed views preserve the selected global index coordinates and map them back to local Kokkos view coordinates, including ghost offsets.

Index strides are now respected when accessing a BareField through an NDIndex.

ippl::NDIndex<Dim> a = A.getDomain();
a[0] = ippl::Index(a[0].first(), a[0].last(), 2 * a[0].stride());

A[a] = 7.0;
A[a] = B[a] + 2.0;

Chained indexing is also supported:

A[i][j][k] = 7.0;
A[i][j][k] = B[i][j][k] + 2.0;

Compatible subsets with different physical strides are supported as long as
they have the same logical extent:

A[a] = B[b] + 2.0;

In this case, the left and right sides are evaluated in their own subdomain
coordinate systems.

Sparse field subsets

Sparse selected points can be assigned directly:

ippl::SIndex<Dim> s(layout);
s.addIndex(point);

A[s] = 7.0;
A[s] = B[s] + 2.0;

Compound assignment is supported for sparse field subsets:

A[s] += 1.0;
A[s] *= 1.01;
A[s] = B[s] + C[s];

Sparse predicates

Sparse indexes can be constructed from field predicates:

ippl::SIndex<Dim> s(layout);
s = A > 2.0;

B[s] = A[s];

Predicate construction can also be restricted to a strided subset:

ippl::NDIndex<Dim> subset = A.getDomain();
subset[0] = ippl::Index(subset[0].first(), subset[0].last(),
                        2 * subset[0].stride());

s[subset] = A[subset] > 2.0;
B[s] = A[s];

Sparse offsets

Sparse offset expressions are supported through SOffset:

ippl::SOffset<Dim> offset;
offset[0] = A.getDomain()[0].stride();

A[s] = B[s + offset] + 2.0;
A[s] = B[s(offset)] + 2.0;

Strided Index Semantics

Index behavior now accounts for actual grid points rather than only interval endpoints:

  • Index::contains() checks that all points in the candidate index are present in the containing index.
  • Index::touches() requires a real strided intersection.
  • Index::grow(n) grows coordinate bounds by n * stride.
    This fixes cases such as even and odd strided indexes being incorrectly treated as touching.

Sparse SIndex Access

field[si] = 7.0;
lhs[si] = rhs[si] + 2.0;

Predicate-Built Sparse Indexes

Sparse indexes can be built from field predicates, matching the main old TestSIndex.cpp workflow:

ippl::SIndex<Dim> si(layout);

si = field > threshold;
out[si] = field[si];

Subsetted predicate construction is also supported:

ippl::NDIndex<Dim> subdomain = field.getDomain();
subdomain[0] = ippl::Index(subdomain[0].first(),
                           subdomain[0].last(),
                           2 * subdomain[0].stride());

si[subdomain] = field[subdomain] > threshold;
out[si] = field[si];

A compatibility helper is provided:

si = gt(field, threshold);

Sparse Compound Assignment

Sparse field views support compound updates:

field[si] += value;
field[si] -= value;
field[si] *= value;
field[si] /= value;

field[si] += rhs[si];
field[si] -= rhs[si];
field[si] *= rhs[si];
field[si] /= rhs[si];

This covers the old benchmark pattern:

B[SI] *= 1.01;

SIndex Set Operations

The following sparse set operations are available:

si &= otherSi;
si |= otherSi;

si &= domain;
si |= domain;

si &= offset;
si |= offset;

Implementation notes

  • IndexedBareField implements rectangular BareField subset access.
  • SparseIndexedBareField implements sparse BareField[SIndex] access.
  • SIndex stores selected points and supports predicate assignment, set
    operations, and SOffset shifts needed by the sparse field paths.
  • Expression capture remains compatible with CUDA/HIP device lambdas by using
    the existing captured-expression reinterpretation path.
  • Kernels use the relevant field/expression execution space for the covered
    field and sparse-index operations.

Tests

Added coverage for:

  • strided Index::contains, touches, and grow
  • dense field[NDIndex] assignment and expression assignment
  • chained dense field[I][J]... assignment and expression assignment
  • multi-dimensional coverage from 1D through 6D for float and double
  • explicit multi-rank BareField execution
  • strided Index semantics
  • strided rectangular scalar assignment
  • strided rectangular expression assignment
  • differently-strided compatible rectangular assignment:
    A[a] = B[b] + scalar
  • sparse scalar and expression assignment
  • sparse RHS offset expressions
  • sparse predicate construction and assignment
  • sparse offset expression assignment
  • sparse set operations
  • sparse subset predicate assignment
  • sparse compound assignment

Deferred

  • ParticleAttrib[SIndex] interop from old IPPL is intentionally deferred for a follow-up PR.
  • SubParticleAttrib
  • full old SubField / SubBareField parity
  • full SubFieldTraits generic composition
  • lower-dimensional SIndex[NDIndex<Dim2>]
  • complete IndexedSIndex<Dim, Brackets> chaining/generalization
  • old int* sparse offset overloads
  • sparse particle/field expression interoperation such as PA[S] = A[S]

@biddisco
Copy link
Copy Markdown
Collaborator

biddisco commented May 13, 2026

cscs-ci run cscs-ci-gh200, cscs-ci-mi300, cscs-ci-openmp

Avoid routing SIndex predicate assignment through addIndex(), which
performs a linear duplicate scan for every selected point. Predicate
assignment clears the index and visits each local candidate point once,
so selected points can be appended directly after reserving capacity.

This removes the quadratic host-side construction cost that made the
high-rank SparseIndexSubsetPredicateAssignment tests look like a hang,
especially under MPI/GPU test runs.
aaadelmann and others added 5 commits May 18, 2026 06:09
Keep the existing reinterpret-cast capture model, but align
CapturedExpression and its byte buffer to the captured expression type.
This avoids reinterpreting byte-aligned storage as expression objects
that may contain stricter-aligned members such as Kokkos views.

Also speed up predicate-built SIndex construction by reserving selected
capacity and appending selected points directly. Predicate assignment
clears the sparse index and visits each candidate point once, so the
public addIndex duplicate scan is unnecessary there and caused
quadratic host-side cost in high-rank sparse-index tests.
Add a focused Field regression test for power-of-two 3D subdomains.
The test initializes a cubic field to one, verifies the full-field sum,
then repeatedly assigns smaller N/2, N/4, ..., 2^3 subdomains into a
zeroed target field and checks that the summed value matches the
expected subdomain volume.

The target keeps the same global layout as the source so the test is
stable under MPI-backed ctest runs while still exercising the Field
subdomain access and assignment path.
Add a BareField regression for assigning between compatible rectangular
subsets with different physical strides:

    A[a] = B[b] + scalar

The test constructs local owned subdomains where the left-hand side uses a
larger stride than the right-hand side while both subsets have the same
logical extent. It verifies that IndexedBareField evaluates each side in its
own subdomain coordinates instead of assuming identical strides.
@aaadelmann aaadelmann marked this pull request as ready for review May 19, 2026 19:47
@aaadelmann aaadelmann requested a review from srikrrish May 19, 2026 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working gitlab-mirror

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IPPL doesn't consistently consider striding in indexes

3 participants