Fix eigvals gradient for symmetric matrices (FluxML/Zygote#1369) #841
Open
CarloLucibello wants to merge 1 commit into
`rrule(eigen, A::StridedMatrix)` reused the symmetric-manifold cotangent convention whenever `A` happened to be Hermitian: it projected the cotangent onto the stored (upper) triangle via `_symherm_back`/`triu!`, zeroing the other triangle. For a plain matrix whose entries are all independent free variables this is wrong: it disagrees with ForwardDiff/finite differences and makes the gradient discontinuous as `A` crosses exact symmetry. Concretely, `jacobian(eigvals, A)` returned an erroneous all-zero column for an exactly symmetric `A`, while a matrix one ULP away gave the correct split gradient.

`eigen_rev!` already produces (via `_hermitrizelike!`) a cotangent with the off-diagonal eigenvalue sensitivity split evenly across both triangles, so we simply materialise that full matrix instead of collapsing it onto one triangle.

Scope: real matrices, eigenvalues only (`T <: Real && ΔV isa AbstractZero`), i.e. the `eigvals` path. The eigenvector phase convention differs between the symmetric and general algorithms, and the complex-Hermitian case cannot be pinned down against FiniteDifferences (eigenvalues leave the reals), so those paths are unchanged.

Updated the dense "hermitian" eigvals/eigen tests to check the eigenvalue gradient of a real matrix against the *unwrapped* `eigvals`/`eigen` (the general-matrix convention); the eigenvector and complex paths keep the `Matrix(Hermitian(·))` reference.

Verified `test_rrule(eigvals, A)` now passes for a symmetric `A`, and end-to-end that `jacobian(eigvals, [1 2; 2 3])` matches finite differences. Full factorization (2977) and symmetric (5499) test suites pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
devmotion
reviewed
Jun 15, 2026
∂hermA = eigen_rev!(hermA, λ, V, Δλ, ∂V)
∂Atriu = _symherm_back(typeof(hermA), ∂hermA, Symbol(hermA.uplo))
∂A = ∂Atriu isa AbstractTriangular ? triu!(∂Atriu.data) : ∂Atriu
if T <: Real && ΔV isa AbstractZero
Member
Why not just limit this whole branch (at the outer level) to `Hermitian` and `Symmetric{<:Real}` instead of `ishermitian`?
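To make the distinction in this comment concrete, here is a minimal sketch contrasting a value-based check with type-based dispatch. The helper `uses_structured_tangent` is hypothetical and exists only for illustration; it is not part of ChainRules, and the actual branch in the rule may be organised differently.

```julia
using LinearAlgebra

# Hypothetical helper: choose the cotangent convention from the argument's type alone.
# Wrapper types declare that only one triangle is a free variable.
uses_structured_tangent(::Union{Hermitian, Symmetric{<:Real}}) = true
# Any other matrix, including a plain Matrix that happens to be symmetric,
# has independent entries.
uses_structured_tangent(::AbstractMatrix) = false

A = [1.0 2.0; 2.0 3.0]

value_based = ishermitian(A)                       # depends on the stored values
type_based  = uses_structured_tangent(A)           # depends only on the type
wrapped     = uses_structured_tangent(Symmetric(A))
```

With type-based dispatch, a plain `Matrix` never switches convention when its values happen to become symmetric, which is the discontinuity the PR describes.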
Fixes the silently wrong gradient reported in FluxML/Zygote#1369.
Problem
`rrule(eigen, A::StridedMatrix)` reuses the symmetric-manifold cotangent convention whenever `A` happens to be Hermitian: it pushes the cotangent through `_symherm_back` and `triu!`, projecting it onto the stored (upper) triangle and zeroing the other triangle. That is the right convention for a `Symmetric`/`Hermitian` wrapper (where only one triangle is a free variable), but wrong for a plain `Matrix`, whose entries are all independent.

The result is a gradient that disagrees with ForwardDiff/finite differences and is discontinuous at exact symmetry.
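One way to see why both triangles of a plain matrix must carry sensitivity is a finite-difference probe that uses only the standard library. The sketch below perturbs a single entry at a time and compares against the standard first-order formula ∂λ_k/∂A[i, j] = V[i, k] * V[j, k], which holds at a real symmetric point with distinct eigenvalues. The helper names `fd_eigvals_entry` and `analytic` are illustrative, not part of ChainRules or Zygote.

```julia
using LinearAlgebra

# Central finite difference of eigvals with respect to the single entry A[i, j],
# treating every entry of the plain matrix as an independent variable.
function fd_eigvals_entry(A::Matrix{Float64}, i::Int, j::Int; h::Float64 = 1e-6)
    Ap = copy(A); Ap[i, j] += h
    Am = copy(A); Am[i, j] -= h
    # The perturbed matrices are nonsymmetric; real.() guards against a
    # complex-typed return from the general eigenvalue solver.
    return (real.(eigvals(Ap)) .- real.(eigvals(Am))) ./ (2h)
end

A = [1.0 2.0; 2.0 3.0]
V = eigen(Symmetric(A)).vectors

# First-order sensitivity of each eigenvalue to entry (i, j) at the symmetric point.
analytic(i, j) = [V[i, k] * V[j, k] for k in 1:size(V, 2)]

d12 = fd_eigvals_entry(A, 1, 2)   # sensitivity to the upper-triangle entry
d21 = fd_eigvals_entry(A, 2, 1)   # sensitivity to the lower-triangle entry
```

Both `d12` and `d21` are nonzero and agree with each other, so a gradient that zeroes one triangle cannot match finite differences for a plain `Matrix`.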
Fix
`eigen_rev!` already produces (via `_hermitrizelike!`) a cotangent whose off-diagonal eigenvalue sensitivity is split evenly across both triangles. We simply materialise that full matrix instead of collapsing it onto one triangle.

Scope: real matrices, eigenvalues only (`T <: Real && ΔV isa AbstractZero`), i.e. the `eigvals` path. The eigenvector phase convention differs between the symmetric (`syev`) and general (`geev`) algorithms, and the complex-Hermitian case can't be validated against FiniteDifferences (the eigenvalues leave the reals, so `to_vec` mismatches), so those paths are deliberately left unchanged. The `Symmetric`/`Hermitian` wrapper rules in `symmetric.jl` are untouched and still return structured tangents.
Verification
- `test_rrule(eigvals, Matrix(Hermitian(randn(n, n))))` now passes (previously failed on the symmetric input).
- `jacobian(eigvals, [1 2; 2 3])` now matches finite differences.
- Updated the dense "hermitian" `eigvals`/`eigen` tests so the real eigenvalue gradient is checked against the unwrapped `eigvals`/`eigen` (general-matrix convention); the eigenvector and complex paths keep the `Matrix(Hermitian(·))` reference.
- Full `factorization.jl` (2977) and `symmetric.jl` (5499) test suites pass.

🤖 Generated with Claude Code
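The finite-difference comparison listed above can be approximated with only the standard library. This is a sketch, assuming the textbook eigenvalue-only pullback `V * Diagonal(Δλ) * V'` at a real symmetric point with distinct eigenvalues. `eigvals_pullback_full` is a hypothetical helper, not the `eigen_rev!` implementation, and the values of `Δλ` and `dA` are arbitrary.

```julia
using LinearAlgebra

# Hypothetical helper: full-matrix cotangent of eigvals at a real symmetric point.
# The result is symmetric, so off-diagonal sensitivity sits in both triangles.
function eigvals_pullback_full(A::Matrix{Float64}, Δλ::Vector{Float64})
    V = eigen(Symmetric(A)).vectors
    return V * Diagonal(Δλ) * V'
end

A  = [1.0 2.0; 2.0 3.0]
Δλ = [1.0, -0.5]                      # arbitrary eigenvalue cotangent
∂A = eigvals_pullback_full(A, Δλ)

# Directional derivative along an arbitrary nonsymmetric perturbation,
# estimated by central finite differences on the unwrapped eigvals.
dA = [0.3 -1.0; 0.7 0.2]
h  = 1e-6
fd = dot(Δλ, real.(eigvals(A + h * dA)) .- real.(eigvals(A - h * dA))) / (2h)

# Reverse-mode prediction of the same quantity: <∂A, dA>.
predicted = dot(∂A, dA)
```

If the cotangent is correct for a matrix with independent entries, `predicted` and `fd` agree for any direction `dA`, not only symmetric ones.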