ET path: share interaction graph + analytic pair site_grads by jameskermode · Pull Request #327 · ACEsuit/ACEpotentials.jl

jameskermode · 2026-06-29T14:55:04Z

Optimises the ET (EquivariantTensors) backend force path. Found by profiling (benchmark/profile_et_forces.jl).

Originally stacked on #326 (base fix/classic-force-evaluation-regression) so that the diff showed only the ET changes. #326 has since merged, and this PR has been retargeted to main.

Changes

1. Shared interaction graph in StackedCalculator. Each stacked component (onebody, pair, many-body) previously rebuilt its own interaction graph per call. A per-cutoff cache (gcache, keyed on each calculator's own rcut) is now threaded through the stacked energy/forces/virial/efv calls, so components that share a cutoff (pair + many-body) build the graph once. Keying on each calculator's own cutoff keeps per-component cutoffs exact (no single-graph approximation); non-WrappedSiteCalculator components fall back to the plain AtomsCalculators interface. _wrapped_* gain graph-accepting methods.

2. Analytic site_grads for ETPairModel (replaces Zygote.gradient). For the pair model site_basis_jacobian returns ∂𝔹 == ∂R directly (the pair basis is a linear sum over neighbours), so contracting it with the readout weights is exactly the per-edge gradient — no jacobian blow-up, and low coupling (uses only site_basis_jacobian/rev_reshape_embedding).
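Why the analytic gradient in item 2 is exact can be seen on a toy model. The following is a minimal sketch, not the ACEpotentials or EquivariantTensors implementation: the radial basis `radial`, its derivative `dradial`, and the scalar-distance `site_energy`/`site_grads` functions are all hypothetical stand-ins, and the real model differentiates with respect to edge vectors rather than scalar distances. It only illustrates that, when the site basis is a plain sum over neighbours followed by a linear readout, contracting the per-edge basis derivative with the readout weights gives the per-edge gradient directly, which a finite-difference check confirms.

```julia
using LinearAlgebra  # for dot

# Hypothetical radial basis (not the ACEpotentials one): R_k(r) = exp(-k r), k = 1..K
radial(r, K) = [exp(-k * r) for k in 1:K]
# Its derivative with respect to r: dR_k/dr = -k exp(-k r)
dradial(r, K) = [-k * exp(-k * r) for k in 1:K]

# Site energy: linear readout W of a basis that is a plain sum over neighbours
site_energy(rs, W) = dot(W, sum(radial(r, length(W)) for r in rs))

# Analytic per-edge gradient: each edge enters the basis additively, so
# dE/dr_j is the readout weights contracted with dR/dr evaluated at edge j
site_grads(rs, W) = [dot(W, dradial(r, length(W))) for r in rs]

# Central finite-difference reference
function fd_grads(rs, W; h = 1e-6)
    map(eachindex(rs)) do j
        rp = copy(rs); rp[j] += h
        rm = copy(rs); rm[j] -= h
        (site_energy(rp, W) - site_energy(rm, W)) / (2h)
    end
end

rs = [0.9, 1.3, 2.1]     # neighbour distances (illustrative values)
W = [0.5, -1.2, 0.3]     # readout weights (illustrative values)
maxerr = maximum(abs.(site_grads(rs, W) .- fd_grads(rs, W)))
println("max |analytic - finite difference| = ", maxerr)
```

No automatic differentiation is involved: the cost is one basis-derivative evaluation and one dot product per edge.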

Not changed: the many-body ETACE.site_grads stays on Zygote. An analytic VJP via ET._ka_pullback was prototyped but only ~10–15% faster (ET's many-body kernel intermediates, not Zygote overhead, dominate the cost) and coupled too tightly to EquivariantTensors internals — not worth it.

Results (single-thread, Si/O)

full ET stacked forces	before	after (reduction)
64 atoms	6.95 ms	5.06 ms (~27%)
256 atoms	24.9 ms	20.0 ms (~19%)
800 atoms	71.4 ms	61.5 ms (~14%)

ET/classic force ratio 2.31 → 2.06. ET remains ~2× slower than the classic path on CPU — its dominant cost is ET's many-body kernels, so the larger remaining win is upstream (leaner kernels / GPU), not in ACEpotentials.

Verification

test/et_models/test_et_calculators.jl (ET↔classic forces/virial/energy <1e-6, incl. StackedCalculator), test/etmodels/test_etace.jl, test_etpair.jl — all pass.
Forces bit-consistent with the classic calculator.

🤖 Generated with Claude Code

jameskermode · 2026-06-29T15:10:25Z

Graphs are currently shared only between components whose cutoffs are equal, which is the case for the pair and many-body graphs in the performance tests above. We could also compute the graph once at max(rcut) and filter it for the components with smaller cutoffs. Open to suggestions on whether this is worth the extra book-keeping.
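To make the two options concrete, here is a minimal sketch under stated assumptions. `build_graph` is a brute-force stand-in for the real interaction-graph builder (no periodic images), and `cached_graph!`, `filter_graph`, the positions and the cutoff values are hypothetical names and numbers chosen for illustration, not the PR's code. The first part mimics a per-cutoff cache keyed on `rcut`; the second checks that filtering a graph built at the largest cutoff reproduces the graph built directly at a smaller one.

```julia
using LinearAlgebra  # for norm

# Brute-force neighbour search (O(N^2), no periodic images): a stand-in for
# the real graph builder, returning directed edges (i, j, r_ij) with r_ij < rcut.
function build_graph(X, rcut)
    edges = Tuple{Int,Int,Float64}[]
    for i in eachindex(X), j in eachindex(X)
        i == j && continue
        r = norm(X[j] - X[i])
        r < rcut && push!(edges, (i, j, r))
    end
    return edges
end

# Option A: per-cutoff cache. The graph is built only on the first request
# for a given rcut; later requests with the same rcut reuse the stored graph.
cached_graph!(gcache, X, rcut) = get!(() -> build_graph(X, rcut), gcache, rcut)

# Option B: build once at the largest cutoff and filter for smaller ones.
filter_graph(edges, rcut) = filter(e -> e[3] < rcut, edges)

X = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.5, 0.0], [3.5, 0.0, 0.0]]
rcuts = [5.5, 5.5, 3.0]   # hypothetical cutoffs: two components share 5.5

gcache = Dict{Float64,Vector{Tuple{Int,Int,Float64}}}()
graphs = [cached_graph!(gcache, X, rc) for rc in rcuts]
println("components: ", length(rcuts), ", graphs built: ", length(gcache))

gmax = build_graph(X, maximum(rcuts))
println("filtered matches direct build: ",
        filter_graph(gmax, 3.0) == build_graph(X, 3.0))
```

Option B trades one extra pass over the edge list per smaller cutoff for a single neighbour search; whether that pays off depends on how different the cutoffs are.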

Two optimisations to the ET (EquivariantTensors) backend force path, found by profiling (benchmark/profile_et_forces.jl):

1. Shared interaction graph in StackedCalculator. Previously each stacked component (onebody, pair, many-body) rebuilt its own interaction graph per force/energy call. Now a per-cutoff cache (`gcache`, keyed on each calculator's own `rcut`) is threaded through the stacked calls, so components that share a cutoff (pair + many-body) build the graph once. Keying on each calculator's own cutoff keeps per-component cutoffs exact, with no single-graph approximation. Non-WrappedSiteCalculator components fall back to the plain AtomsCalculators interface. `_wrapped_*` gain graph-accepting methods.

2. Analytic `site_grads` for ETPairModel, replacing `Zygote.gradient`. For the pair model `site_basis_jacobian` returns ∂𝔹 == ∂R directly (the pair basis is a linear sum over neighbours), so contracting it with the readout weights is exactly the per-edge gradient: no jacobian blow-up, low coupling (uses only site_basis_jacobian / rev_reshape_embedding).

The many-body ETACE `site_grads` is left on Zygote: an analytic VJP was prototyped but was only ~10–15% faster (ET's many-body kernel intermediates, not Zygote overhead, dominate) and coupled too tightly to ET internals.

Result (single-thread, Si/O): full ET stacked forces ~14–27% faster (64 atoms 6.95→5.06 ms, 800 atoms 71.4→61.5 ms); ET/classic force ratio 2.31→2.06. Forces unchanged: test/et_models/test_et_calculators.jl (ET↔classic <1e-6), test/etmodels/test_etace.jl, test_etpair.jl all pass.

Also adds benchmark/profile_et_forces.jl (ET force-path breakdown).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ETOneBody is structurally a one-body model: its energy depends only on atom species (node_data), `site_grads` returns empty edge gradients unconditionally, and forces/virial are identically zero. Building an interaction graph for it (the `convert2et_full` onebody used rcut=3.0) ran a neighbour search whose edges were then discarded.

Specialise the ETOneBodyPotential energy/forces/virial (and the StackedCalculator `_cached_*` dispatch) to build the node states directly and skip the graph entirely. node_data is rcut-independent, so the result is identical to the graph path. Also drop the now-meaningless rcut=3.0 in convert2et_full (→ 0.0, unused).

Verified: test/et_models/test_et_calculators.jl passes (onebody energy + stacked energy/forces/virial unchanged).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jameskermode mentioned this pull request Jun 29, 2026

Fix ~13x force-evaluation regression in classic ACEModel #326

Merged

jameskermode changed the base branch from fix/classic-force-evaluation-regression to main June 29, 2026 15:25

jameskermode and others added 2 commits June 29, 2026 16:27

jameskermode force-pushed the opt/et-graph-sharing branch from dd03745 to 7288ed1 Compare June 29, 2026 15:28

jameskermode wants to merge 2 commits into main from opt/et-graph-sharing
