Skip to content

Track: Track1; Team name: Prasanna28Devadiga; Model: Sheaf Neural Networks with Connection Laplacians#327

Open
Prasanna28Devadiga wants to merge 9 commits into
geometric-intelligence:mainfrom
Prasanna28Devadiga:track1-connection-laplacian-snn
Open

Track: Track1; Team name: Prasanna28Devadiga; Model: Sheaf Neural Networks with Connection Laplacians#327
Prasanna28Devadiga wants to merge 9 commits into
geometric-intelligence:mainfrom
Prasanna28Devadiga:track1-connection-laplacian-snn

Conversation

@Prasanna28Devadiga
Copy link
Copy Markdown

@Prasanna28Devadiga Prasanna28Devadiga commented May 21, 2026

Track: Track1; Team name: Prasanna28Devadiga; Model: Sheaf Neural Networks with Connection Laplacians

Checklist

  • My pull request has a clear and explanatory title.
  • My pull request passes the Linting test.
  • I added appropriate unit tests and I made sure the code passes all unit tests.
  • My PR follows PEP8 guidelines.
  • My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • I linked to issues and PRs that are relevant to this PR.

Summary

This PR adds graph/conn_nsd, a Conn-NSD graph backbone based on Sheaf Neural Networks with Connection Laplacians (Barbero et al., 2022).

The implementation computes fixed orthogonal restriction maps from node features and graph structure using local PCA and orthogonal Procrustes alignment, then reuses the existing NSD-style sheaf diffusion machinery with those maps held constant.

Main changes:

  • Adds Conn-NSD connection construction utilities in topobench/nn/backbones/graph/conn_nsd_utils/.
  • Adds ConnNSDEncoder in topobench/nn/backbones/graph/conn_nsd.py.
  • Adds Hydra config configs/model/graph/conn_nsd.yaml.
  • Adds unit tests for connection construction, batching behaviour, fallback handling for low-degree nodes, dtype/shape validation, and integration through the pipeline test.
  • Adds GraphUniverse challenge outputs for Conn-NSD under 2026_tdl_challenge/outputs/conn_nsd_full/.
  • Adds an NSD-bundle ablation driver and outputs under 2026_tdl_challenge/outputs/nsd_bundle_full/ for reference.
  • Adds a guard in topobench/run.py to skip checkpoint reload when best_model_path is empty or points to a directory.

Implementation Notes

The connection maps are computed once per forward pass from (node_features, edge_index, stalk_dim) and do not receive gradients. For low-degree nodes, the local tangent basis construction tops up the neighbourhood with nearest same-graph nodes.

A known numerical limitation is documented in connection.py: SVD sign choices are not canonicalised, so strict feature-rotation or relabelling invariance is not guaranteed. The tests cover deterministic same-input behaviour and a loose-tolerance spectral check under node relabelling.

Validation

  • pytest test/nn/backbones/graph/test_conn_nsd.py -v -> 18 passed
  • pytest test/pipeline/test_pipeline.py -v -> 1 passed
  • pre-commit run --files <Conn-NSD files> -> ruff-format, ruff, and numpydoc-validation passed
  • GraphUniverse evaluation completed for 72 runs: 3 homophily values x 2 average degrees x 2 power-law settings x 3 seeds x 2 tasks

Challenge metrics from the tracked Conn-NSD output artefacts:

  • Community detection mean test accuracy over 36 runs: 0.0746
  • Triangle counting mean MSE / total triangles over 36 runs: 10.46

Submission Info

  • Track: Track1
  • Team name: Prasanna28Devadiga
  • Model: Sheaf Neural Networks with Connection Laplacians
  • Requested label: track-1-gnn
  • Registration form: submitted

Prasanna added 9 commits May 16, 2026 17:33
The contract for this fork. Every line of new code must pass the 16
principles before it ships. Pulled verbatim from the user's directive
so it stays unambiguous and review-able alongside the math.
Implement the deterministic orthogonal-restriction-map construction at
the heart of Sheaf Neural Networks with Connection Laplacians (Barbero
et al., ICML 2022 TAG-ML, arXiv:2206.08702).

Module layout:
  topobench/nn/backbones/graph/conn_nsd_utils/connection.py
    - local_tangent_basis():   1-hop local PCA, with k-NN top-up when
                                |N(v)| < d (paper §3.2).
    - optimal_alignment():     orthogonal Procrustes via SVD of O_v^T O_u.
    - build_connection():      end-to-end Algorithm 1.
  All functions are pure, detached from autograd, and shape-annotated.

The diffusion machinery (NormConnectionLaplacianBuilder, the bundle
diffusion loop) is reused unchanged from Bodnar et al.'s NSD; only the
sheaf *construction* differs. This keeps the scientific delta localised
to a single file readers can verify against Algorithm 1 line by line.

Tests are the spec:
  - hand-computed triangle in R^2 with d=2
  - orthogonality of F[e]
  - inverse transport on antiparallel edges (F_uv F_vu = I)
  - permutation equivariance under node relabelling
  - SO(p)-invariance of features (intrinsic to feature geometry)
  - |N(v)| < d fallback rule produces a valid basis
  - autograd does not flow into the connection (pre-processing)
  - dtype contract on the API boundary

These invariants are the algebraic truths the math owes us; if the
implementation ever violates one, the test name tells you what broke.
While running the spec end-to-end I discovered that strict permutation
and rotation equivariance of the connection do not hold — the SVD's
column-sign convention is not canonicalised, so two arithmetically
equivalent inputs can produce sheaves whose Laplacian spectra differ by
a small numerical perturbation. This is inherited from the underlying
Singer & Wu (2012) vector-diffusion-maps construction; Algorithm 1 of
Barbero et al. does not address it.

Rather than weaken the tests to a fake invariant, this commit:

  * documents the gauge prominently in connection.py
    (under Notes; cites the F_{vu} -> S_v F_{vu} S_u transformation
    explicitly);
  * adds a dense sheaf-Laplacian assembler in the test file
    (transparent block-form, only for verification);
  * replaces the two strict-equivariance tests with:
    - test_permutation_invariance_of_laplacian_spectrum: loose
      tolerance check on the Laplacian eigenvalues (the operator
      the diffusion actually sees);
    - test_determinism: bit-identical output on identical input —
      the strongest reproducibility we can guarantee given the gauge,
      and what training needs in practice.

Spec is now 14/14. Honest, not flattering.
Wire the Algorithm-1 connection from the previous commit into the full
TopoBench training pipeline.

Components:

  topobench/nn/backbones/graph/conn_nsd.py
    ConnNSDEncoder — the user-facing backbone. Forward pass is:
      (1) build_connection(x, edge_index, d)         [Algorithm 1]
      (2) FixedConnectionLaplacianBuilder(...)        [normalised O(d)-bundle Δ_F]
      (3) num_layers iterations of                    [paper Eq. 5]
            x_{t+1} <- x_t - σ( Δ_F (I_n ⊗ W₁) x_t W₂ )
      with a residual gate ε per layer, initialised to zero so the first
      step is a pure diffusion step.
    Deliberately mirrors NSDEncoder's layout so a reader can diff the
    two and see exactly the scientific delta: the construction of
    restriction_maps, and the absence of any learnable sheaf
    parameters. The diffusion W₁ and W₂ are reused verbatim.

  topobench/nn/backbones/graph/conn_nsd_utils/fixed_laplacian_builder.py
    FixedConnectionLaplacianBuilder — assembles the normalised bundle
    Laplacian from pre-computed orthogonal maps. Same formula as
    NormConnectionLaplacianBuilder, but accepts the maps directly as
    [E, d, d] rather than routing them through the Cayley/skew-symmetric
    parametrisation. Keeps the data flow honest: the maps arrive in
    O(d), no detour through a learnable interface that we don't use.

  configs/model/graph/conn_nsd.yaml
    Hydra config, mirrors graph/nsd.yaml. Standard AllCellFeatureEncoder,
    GNNWrapper, and MLPReadout — Conn-NSD slots in as a drop-in graph
    backbone. Stalk dim defaults to 4, num_layers to 2; both are sweep
    candidates in the GraphUniverse evaluation grid.

  test/pipeline/test_pipeline.py
    Added 'graph/conn_nsd' to the MODELS list so the end-to-end Hydra
    pipeline test exercises our backbone on MUTAG.

Verification:
  pytest test/nn/backbones/graph/test_conn_nsd.py       -> 14/14
  pytest test/pipeline/test_pipeline.py                 -> 1/1
                                                            (Conn-NSD reaches
                                                             test/auroc=0.93 on
                                                             2 epochs of MUTAG;
                                                             perf doesn't matter
                                                             for the challenge,
                                                             only that the
                                                             architecture and
                                                             pipeline are sound)
`rerun_best_model_checkpoint` calls `torch.load(Path(callback.best_model_path))`
unconditionally. When `ModelCheckpoint` has not saved a checkpoint yet —
e.g. when `trainer.max_epochs < trainer.check_val_every_n_epoch` and so
no validation epoch has run, or when the monitored metric never
improved — `best_model_path` is `""`, which `Path("")` resolves to the
current working directory, raising `IsADirectoryError` from `torch.load`.

Add two short-circuits in front of the load:

  - if `best_model_path` is falsy, log a warning and return.
  - if `best_model_path` resolves to a directory, same.

Both cases are benign for the rest of training (Lightning has already
reported test metrics from the final epoch via the regular `trainer.test`
path); we simply skip the additional "best-checkpoint rerun" step.

This unblocks the GraphUniverse challenge harness on short
training budgets (e.g. smoke runs at `MAX_EPOCHS=2`), and is a generally
useful guard for any contributor whose monitored metric does not improve
over the run.
Three small fixes from an independent audit; one commit, one concept
(make Algorithm 1 honest about its inputs and edges).

1. Batch isolation. `local_tangent_basis` and `build_connection` now
   accept an optional PyG `batch` vector. The 1-hop neighbour set is
   already safe (PyG offsets keep `edge_index` within-graph), but the
   fallback for `|N(v)| < d` ranked Euclidean distance over *all* nodes
   in the batched tensor — pulling in candidates from unrelated graphs.
   With `stalk_dim=4` and GraphUniverse's `avg_degree_range=[1, 2]`,
   many nodes hit the fallback, so the bug was firing in practice.
   `ConnNSDEncoder.forward` now threads `batch` through. New regression
   tests `test_fallback_respects_batch_boundaries` and
   `test_encoder_matches_unbatched_outputs_with_low_degree_fallback`.

2. Small-graph robustness. The fallback called
   `torch.topk(dists, k=needed)` without clamping to available
   candidates, so a graph with fewer than `stalk_dim + 1` same-graph
   nodes raised an opaque `RuntimeError` from `topk`. Now we raise a
   clear `ValueError("graph too small: N=..., stalk_dim=... requires
   N >= ...")` both at entry and after masking. Parametrised tests at
   `(N=3, d=3)` and `(N=3, d=4)`.

3. Honest docstring for `connection_features="raw"`. In the standard
   TopoBench composition, AllCellFeatureEncoder runs *before* the
   backbone, so the "raw" features Algorithm 1 sees are post-encoder,
   not the original GraphUniverse 15-dim features. The previous wording
   over-promised paper-fidelity; the new wording reflects reality.
Add the artefacts produced by running 2026_tdl_challenge/run_evaluation.ipynb
with MODEL_CONFIG="graph/conn_nsd":

  outputs/conn_nsd_full/
    results.json                                — 72 rows, 3 seeds × 12 cells × 2 tasks
    heatmap_community_detection_accuracy.png    — accuracy mean ± std heatmap
    heatmap_triangle_mse_over_triangles.png     — MSE / Σtriangles heatmap
    OOD/                                        — 6 OOD-delta panels
      OOD_{low,mid,high}_homophily__{community_detection,triangle_counting}.png

Convention mirrors the BuNN submission (PR geometric-intelligence#319), which uses
2026_tdl_challenge/outputs/bunn_full/. Files are force-added past the
.gitignore rule for the broader outputs/ directory.

Headline numbers (10-epoch budget; the spec explicitly does NOT reward
final performance):
  - Community detection: mean test accuracy 0.075 across all 36 runs;
    chance is 1/12 for the 12-community task. Per-seed range [0.04, 0.12].
  - Triangle counting: mean test MSE / Σtriangles = 10.46 across all
    36 runs.

These confirm the pipeline runs end-to-end on every grid cell with no
NaNs in metric fields that should be finite.
Adds the apples-to-apples ablation against upstream NSD-bundle, matching
the Conn-NSD config exactly: hidden_dim=64, num_layers=2, stalk_dim=4,
dropout=0.0. The only variable left is the bundle-map construction
(deterministic via Algorithm 1 vs learned via Bodnar's Φ + Cayley).

  2026_tdl_challenge/run_ablation_nsd_bundle.py        — thin driver
  2026_tdl_challenge/outputs/nsd_bundle_full/          — 72-run grid
    results.json                                        + heatmaps + OOD

Headline results, 36 runs per task:

  Community detection (mean test accuracy across all cells)
    Conn-NSD (fixed):    0.0746
    NSD-bundle (learned): 0.0409
  Per-cell delta is uniformly positive: Conn-NSD wins 12 / 12 cells with
  Delta-acc in [+0.022, +0.045]. Supports the paper's claim that fixing
  the bundle maps acts as a regulariser on heterophilic / community-style
  tasks.

  Triangle counting (mean MSE / Sigma triangles)
    Conn-NSD (fixed):    10.46
    NSD-bundle (learned): 10.78
  Roughly tied; 6 / 12 cells in Conn-NSD's favour. The bundle structure
  matters less for this regression target.

Parameter counts (backbone only, excluding feature encoder + readout):
    Conn-NSD:    8,872
    NSD-bundle: 11,432  (Conn-NSD removes 22.4 % of backbone params)

Wall-clock per run on the 3060: ~93 s for Conn-NSD vs ~3 s for
NSD-bundle. Our local_tangent_basis loops over nodes in Python,
unlike NSD's vectorised sheaf-learner MLPs. The paper claimed
Conn-NSD would be faster than NSD-O(d); this reference
implementation traded that off for line-for-line readability of
Algorithm 1. Vectorising the per-node SVD (pad to uniform width +
batched torch.linalg.svd) is straightforward future work.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant