IndexedElemwise tolerate aliased read-writes #2255

Draft
ricardoV94 wants to merge 3 commits into pymc-devs:main from ricardoV94:indexed_elemwise_tolerate_aliased

Conversation

@ricardoV94

Member
  • Stop downloading/uploading previous benchmark data
  • Subtensor rewrites: reason about advanced indices jointly when gating
  • Fuse safe read-modify-write into IndexedElemwise

@ricardoV94 ricardoV94 force-pushed the indexed_elemwise_tolerate_aliased branch from 983d104 to 8dcf261 Compare June 25, 2026 17:38
A write whose buffer is also read no longer always bails out of fusion. When
every aliasing read goes through the same variable as the write target, uses
the write's index, and that index is duplicate-free, the write is fused and
destroyhandler_tolerate_aliased is set instead.
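
The duplicate-free condition can be illustrated with a minimal pure-Python sketch. It is not the PyTensor rewrite; `unfused_update`, `fused_update` and `index_is_duplicate_free` are hypothetical names used only here. The sketch assumes "set" semantics and compares a read-everything-then-write reference against a fused loop that reads and writes the same buffer in place:

```python
def unfused_update(buf, idx, f):
    """Reference semantics: read all values into a temp, then write them back."""
    out = list(buf)
    tmp = [f(out[i]) for i in idx]  # every read happens before any write
    for i, v in zip(idx, tmp):
        out[i] = v
    return out


def fused_update(buf, idx, f):
    """Fused loop: each iteration reads and writes the aliased buffer in place."""
    out = list(buf)  # copied only so the demo leaves `buf` untouched
    for i in idx:
        out[i] = f(out[i])
    return out


def index_is_duplicate_free(idx):
    """Hypothetical stand-in for the duplicate-free gate described above."""
    return len(set(idx)) == len(idx)


def double(x):
    return 2 * x


buf = [1, 2, 3]

# Duplicate-free index: every position is read once and written once,
# so the fused loop matches the reference.
safe_idx = [0, 2]
assert index_is_duplicate_free(safe_idx)
assert fused_update(buf, safe_idx, double) == unfused_update(buf, safe_idx, double)

# Duplicated index: the second iteration reads a value the first one already
# overwrote, so the fused loop diverges from the reference.
dup_idx = [0, 0]
assert not index_is_duplicate_free(dup_idx)
assert fused_update(buf, dup_idx, double) != unfused_update(buf, dup_idx, double)
```

With a duplicated index the fused loop applies `f` twice to the same slot, which is why, in this sketch, aliasing is only tolerable when the index has no repeats.
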
Generalize the IndexedElemwise fusion to absorb basic IncSubtensor/SetSubtensor
slice writes (gh pymc-devs#2192), not just advanced indexing. An Elemwise whose result is
written into buffer[slices] now writes straight into a view of the destination,
eliminating the intermediate temp + copy. Basic and advanced writes can also be
fused into a single loop.
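
As a rough illustration of "writes straight into a view of the destination", the sketch below uses only the Python standard library (`array` and `memoryview`), not PyTensor or Numba. The function names are hypothetical; the actual fast path in this PR is the Numba dispatch.

```python
from array import array


def add_with_temp(buf, start, stop, a, b):
    """Unfused shape: compute into a temp, then copy into buf[start:stop]."""
    tmp = array("d", (x + y for x, y in zip(a, b)))  # intermediate temp
    buf[start:stop] = tmp  # separate copy into the destination
    return buf


def add_into_view(buf, start, stop, a, b):
    """Fused shape: the loop stores each result directly into a view of buf."""
    view = memoryview(buf)[start:stop]  # shares memory with buf, no copy
    for k, (x, y) in enumerate(zip(a, b)):
        view[k] = x + y
    return buf


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]

with_temp = add_with_temp(array("d", [0.0] * 5), 1, 4, a, b)
in_view = add_into_view(array("d", [0.0] * 5), 1, 4, a, b)

# Both strategies produce the same buffer; only the second avoids the temp.
assert list(with_temp) == list(in_view)
```

The NumPy analogue of the second function would be passing a slice of the destination as the `out=` argument of a ufunc.
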

Rewrite (FuseIndexedElemwise): detect single-client basic IncSubtensor writes of
an Elemwise output, with a coverage check; encode them via the inner IncSubtensor
node + destroy_map (no new spec). Slice-bound vars become inner inputs. Drop the
gate against mixing basic and advanced writes in one op.

Numba dispatch: one unified funcify for advanced-only, basic-only and mixed ops.
prepare_inputs is a FrozenFunctionGraph of Subtensor views compiled through the
normal dispatch (reusing the Subtensor slice codegen); the views are passed as
in-place core inputs to a single _vectorized call alongside any advanced index
specs. store_core_outputs gains n_inplace_buffer_inputs to accept (and drop) the
buffer inputs an inplace_pattern target requires. Pure-advanced ops are unchanged.

The op stays a portable OpFromGraph (inner Elemwise + IncSubtensor), so non-Numba
backends evaluate it correctly; only the fast slice-write path is Numba-specific.
@ricardoV94 ricardoV94 force-pushed the indexed_elemwise_tolerate_aliased branch from 8dcf261 to f4c7a0e Compare June 27, 2026 08:49