Skip to content

Fix eigvals gradient for symmetric matrices (FluxML/Zygote#1369)#841

Open
CarloLucibello wants to merge 1 commit into
JuliaDiff:mainfrom
CarloLucibello:cl/eigvals-symmetric-1369
Open

Fix eigvals gradient for symmetric matrices (FluxML/Zygote#1369)#841
CarloLucibello wants to merge 1 commit into
JuliaDiff:mainfrom
CarloLucibello:cl/eigvals-symmetric-1369

Conversation

@CarloLucibello

Copy link
Copy Markdown
Contributor

Fixes the silent wrong gradient reported in FluxML/Zygote#1369.

Problem

rrule(eigen, A::StridedMatrix) reuses the symmetric-manifold cotangent convention whenever A happens to be Hermitian: it pushes the cotangent through _symherm_back and triu!, projecting it onto the stored (upper) triangle and zeroing the other triangle. That is the right convention for a Symmetric/Hermitian wrapper (where only one triangle is a free variable), but wrong for a plain Matrix, whose entries are all independent.

The result is a gradient that disagrees with ForwardDiff/finite differences and is discontinuous at exact symmetry:

julia> using LinearAlgebra, Zygote

julia> Zygote.jacobian(eigvals, [1 2; 2+1e-13 3])[1]   # ~symmetric → correct (split)
2×4 Matrix{Float64}:
 0.723607  -0.447214  -0.447214  0.276393
 0.276393   0.447214   0.447214  0.723607

julia> Zygote.jacobian(eigvals, [1 2; 2 3])[1]         # exactly symmetric → WRONG (zero column)
2×4 Matrix{Float64}:
 0.723607  0.0  -0.894427  0.276393
 0.276393  0.0   0.894427  0.723607

Fix

eigen_rev! already produces (via _hermitrizelike!) a cotangent whose off-diagonal eigenvalue sensitivity is split evenly across both triangles. We simply materialise that full matrix instead of collapsing it onto one triangle.

Scope: real matrices, eigenvalues only (T <: Real && ΔV isa AbstractZero), i.e. the eigvals path. The eigenvector phase convention differs between the symmetric (syev) and general (geev) algorithms, and the complex-Hermitian case can't be validated against FiniteDifferences (the eigenvalues leave the reals, so to_vec mismatches), so those paths are deliberately left unchanged. The Symmetric/Hermitian wrapper rules in symmetric.jl are untouched and still return structured tangents.

Verification

  • test_rrule(eigvals, Matrix(Hermitian(randn(n, n)))) now passes (previously failed on the symmetric input).
  • End-to-end: jacobian(eigvals, [1 2; 2 3]) now matches finite differences.
  • Updated the dense "hermitian" eigvals/eigen tests so the real eigenvalue gradient is checked against the unwrapped eigvals/eigen (general-matrix convention); the eigenvector and complex paths keep the Matrix(Hermitian(·)) reference.
  • Full factorization.jl (2977) and symmetric.jl (5499) test suites pass.

🤖 Generated with Claude Code

`rrule(eigen, A::StridedMatrix)` reused the symmetric-manifold cotangent
convention whenever `A` happened to be Hermitian: it projected the cotangent
onto the stored (upper) triangle via `_symherm_back`/`triu!`, zeroing the
other triangle. For a plain matrix whose entries are all independent free
variables this is wrong — it disagrees with ForwardDiff/finite differences and
makes the gradient discontinuous as `A` crosses exact symmetry. Concretely,
`jacobian(eigvals, A)` returned an erroneous all-zero column for an exactly
symmetric `A`, while a matrix one ULP away gave the correct split gradient.

`eigen_rev!` already produces (via `_hermitrizelike!`) a cotangent with the
off-diagonal eigenvalue sensitivity split evenly across both triangles, so we
simply materialise that full matrix instead of collapsing it onto one triangle.

Scope: real matrices, eigenvalues only (`T <: Real && ΔV isa AbstractZero`) —
i.e. the `eigvals` path. The eigenvector phase convention differs between the
symmetric and general algorithms, and the complex-Hermitian case cannot be
pinned down against FiniteDifferences (eigenvalues leave the reals), so those
paths are unchanged.

Updated the dense "hermitian" eigvals/eigen tests to check the eigenvalue
gradient of a real matrix against the *unwrapped* `eigvals`/`eigen` (the
general-matrix convention); the eigenvector and complex paths keep the
`Matrix(Hermitian(·))` reference. Verified `test_rrule(eigvals, A)` now passes
for a symmetric `A`, and end-to-end that `jacobian(eigvals, [1 2; 2 3])` matches
finite differences. Full factorization (2977) and symmetric (5499) test suites
pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
∂hermA = eigen_rev!(hermA, λ, V, Δλ, ∂V)
∂Atriu = _symherm_back(typeof(hermA), ∂hermA, Symbol(hermA.uplo))
∂A = ∂Atriu isa AbstractTriangular ? triu!(∂Atriu.data) : ∂Atriu
if T <: Real && ΔV isa AbstractZero

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just limiting this whole branch (on the outer level) to Hermitian and Symmetric{<:Real} instead of ishermitian?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants