refactor(alert): move AHDC track-finding from AHDCEngine to ALERTEngine #1242

Open
mathieuouillon wants to merge 7 commits into development from rgl_trackfinding_gnn

@mathieuouillon
Collaborator

AHDCEngine now only reads AHDC::adc, applies calibration via HitReader, and writes AHDC::hits. The full track-finding pipeline (preclustering, AI/CV_Distance/CV_Hough finder, DOCA refinement, helix fit) runs in ALERTEngine on top of AHDC::hits + ATOF::hits/clusters, alongside the existing projection / matching / prePID / Kalman steps.

The track finder is selected via the ALERT.Mode YAML key (was AHDC.Mode); ModelTrackFinding only loads when AI_Track_Finding is selected. The ATOF::tdc gate now fires after the AHDC pipeline so events without ATOF still get their AHDC::* banks, matching the pre-refactor behavior.

@mathieuouillon
Collaborator Author

Note: AHDC::hits.adc is stored as int (the calibrated ADC, truncated). Track.get_sum_adc rounds Hit.getADC() per hit; when track-finding ran inside AHDCEngine, that rounding was applied to the original full-precision double. Reading back from the bank here gives integer adc values, so sum_adc and dEdx in AHDC::track can drift by 0-1 per hit relative to the pre-refactor output (≈1-6 counts in sum_adc on ~1% of events). Eliminating the drift would require widening AHDC::hits.adc from I to F/D in the schema.
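To make the drift concrete, here is a minimal sketch of the two summation paths described in the note above. The class name, method names, and ADC values are made up for illustration, and the sketch assumes non-negative ADC values, truncation by a plain `(int)` cast, and rounding by `Math.round`; the real `Track.get_sum_adc` may differ in detail.

```java
/** Illustrates the sum_adc drift caused by storing calibrated ADC as int. */
final class AdcDriftDemo {

    /** Pre-refactor path (assumed): round each full-precision ADC, then sum. */
    static long sumRounded(double[] calibratedAdc) {
        long sum = 0;
        for (double adc : calibratedAdc) {
            sum += Math.round(adc);
        }
        return sum;
    }

    /** Post-refactor path (assumed): the bank holds the truncated int already. */
    static long sumFromBank(double[] calibratedAdc) {
        long sum = 0;
        for (double adc : calibratedAdc) {
            int stored = (int) adc;        // truncation when AHDC::hits.adc is written
            sum += Math.round((double) stored); // rounding an integer changes nothing
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] adc = {100.2, 250.7, 75.5, 310.49}; // hypothetical calibrated values
        long drift = sumRounded(adc) - sumFromBank(adc);
        System.out.println("drift = " + drift); // one count per hit with fraction >= 0.5
    }
}
```

In this example two of the four hits have a fractional part of at least 0.5, so the two sums differ by two counts, which is the 0-1 per hit effect described above.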

@mathieuouillon
Collaborator Author

Compared AHDC::track row counts between the two finders on the same 1000-event input (clas_021903.evio.00000, recon-util -n 1000), toggling only the ALERT.Mode YAML key.

| Metric | MLP | GNN |
| --- | --- | --- |
| Total tracks | 9 | 35 |
| Mean tracks / event | 0.009 | 0.035 |
| Events with ≥1 track | 9 | 34 |

GNN finds ~3.9× as many tracks overall (35 vs 9).


Per-event breakdown (1000 events)

| Bucket | Count |
| --- | --- |
| Both finders have tracks | 9 |
| Only MLP has tracks | 0 |
| Only GNN has tracks | 25 |
| Neither has tracks | 966 |
| MLP rows > GNN rows (per event) | 0 |
| MLP rows < GNN rows (per event) | 25 |
| MLP rows == GNN rows | 975 |
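For reference, the first four buckets of the per-event breakdown can be computed from two per-event row-count arrays as in the sketch below. This is not the script used for the numbers above; the class name, the array inputs, and the four-event toy data are assumptions for illustration.

```java
import java.util.Arrays;

/** Buckets events by which finder produced at least one AHDC::track row. */
final class FinderComparison {

    /** Returns counts in the order: both, only MLP, only GNN, neither. */
    static int[] buckets(int[] mlpRows, int[] gnnRows) {
        if (mlpRows.length != gnnRows.length) {
            throw new IllegalArgumentException("per-event arrays must have equal length");
        }
        int both = 0, onlyMlp = 0, onlyGnn = 0, neither = 0;
        for (int i = 0; i < mlpRows.length; i++) {
            boolean mlpHasTrack = mlpRows[i] > 0;
            boolean gnnHasTrack = gnnRows[i] > 0;
            if (mlpHasTrack && gnnHasTrack) both++;
            else if (mlpHasTrack) onlyMlp++;
            else if (gnnHasTrack) onlyGnn++;
            else neither++;
        }
        return new int[] {both, onlyMlp, onlyGnn, neither};
    }

    public static void main(String[] args) {
        int[] mlp = {1, 0, 0, 0}; // toy per-event row counts, not real data
        int[] gnn = {1, 2, 0, 1};
        System.out.println(Arrays.toString(buckets(mlp, gnn)));
    }
}
```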

mathieuouillon force-pushed the rgl_trackfinding_gnn branch 2 times, most recently from 26ba2cb to d09f721 (May 11, 2026 12:24)
mathieuouillon marked this pull request as ready for review (May 11, 2026 12:24)
@mathieuouillon
Collaborator Author

Tested on 100 files from run 22991: no errors or warnings with CLARA.

@@ -0,0 +1,34 @@
package org.jlab.rec.ahdc.TrackFinding;

import org.jlab.rec.ahdc.Track.Track;
Collaborator

The track finder doesn't find Tracks; it finds TrackCandidates.
A Track is something different (i.e. the result of track fitting). A candidate should have hits associated with it and can be specialized: AHDC only, AHDC+ATOF, AHDC+vertex, etc. The specialization then dictates the specifics of how the fitting is done.

Collaborator Author

@mathieuouillon May 18, 2026

Thanks for all the comments; I have started working on them.
This one in particular is a very good idea.
My first draft addressing it is in commit 28dc412.

Verified byte-identical: recon-util on clas_021903 (config_p0v9, MLP_Track_Finding, 1000 events) before/after gives 0 mismatched rows / 0 mismatched entries across AHDC::track, AHDC::kftrack, AHDC::hits, AHDC::clusters, AHDC::interclusters, AHDC::preclusters and AHDC::docaclusters

Collaborator

We should have a skeleton engine ready to go, called ALERTEventBuilderEngine, as a way of developing reconstruction code that will eventually be merged into the event builder. It will be the last engine in the processing chain until it is all merged into the event builder.

Collaborator Author

It is a good idea, but it should probably go in a separate pull request.

Collaborator

Since Cluster is an overloaded term, it would be a good idea to rename this class AHDCCluster, or to be explicit in the interfaces that use clusters.

Collaborator Author


public enum ModeTrackFinding {
AI_Track_Finding,
MLP_Track_Finding,
Collaborator

Should be TrackFindingMode, not ModeTrackFinding.

Comment on lines +4 to +7
MLP_Track_Finding,
CV_Distance,
CV_Hough,
GNN_Track_Finding,
Collaborator

These need to be documented -- add a short description of each. What does CV stand for?

Collaborator

The TrackFinding methods probably shouldn't be located in rec/ahdc, but rather in rec/alert.

Collaborator

The file should be named TrackFindingMode, and it should be located with TrackFinding, which should be in rec/alert, not rec/ahdc.

// --- ATOF nodes -------------------------------------------------------------
// Deduplicate by (sector, layer, component) — inference-time variant of the
// Python dedup which also keys on track id (only needed at training time).
if (atofHitsBank != null) {
Collaborator

This leads me to think this directory rec/ahdc/AI should not be in rec/ahdc, but in rec/alert
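For readers following the snippet under review, deduplicating ATOF hits by (sector, layer, component) could look roughly like the sketch below. The `AtofHit` class is a hypothetical stand-in for the bank rows, and keeping the first occurrence of each key is an assumption; the PR code may keep a different duplicate.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Inference-time dedup sketch: one ATOF node per (sector, layer, component). */
final class AtofDedup {

    /** Hypothetical minimal hit carrying only the three ID fields. */
    static final class AtofHit {
        final int sector;
        final int layer;
        final int component;

        AtofHit(int sector, int layer, int component) {
            this.sector = sector;
            this.layer = layer;
            this.component = component;
        }
    }

    /** Keeps the first hit seen for each key (an assumption of this sketch). */
    static List<AtofHit> dedup(List<AtofHit> hits) {
        Set<List<Integer>> seen = new HashSet<>();
        List<AtofHit> unique = new ArrayList<>();
        for (AtofHit hit : hits) {
            List<Integer> key = List.of(hit.sector, hit.layer, hit.component);
            if (seen.add(key)) { // add() returns false if the key was already present
                unique.add(hit);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<AtofHit> hits = List.of(
                new AtofHit(1, 0, 3), new AtofHit(1, 0, 3), new AtofHit(2, 1, 5));
        System.out.println(dedup(hits).size());
    }
}
```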

* Exported forward signature (see SingleGraphEdgeScorer):
* forward(x: float32[N, 10], edge_index: int64[2, E], edge_attr: float32[E, 9])
* -> float32[E] (sigmoid edge scores in [0, 1])
*/
Collaborator

Link to DJL's documentation somewhere.

Comment on lines +8 to +11
/** Track extraction from per-edge scores via union-find connected components
* at a single threshold. Ports the {@code method="cc"} branch of
* {@code track-finding/gnn/inference.py::extract_tracks}, which is the
* extractor that gnn/evaluate.py uses.
Collaborator

Where is this code?

…erface

Introduce TrackFinder { findTracks(hits) -> TrackFinderResult } with three implementations — AITrackFinder, DistanceTrackFinder, HoughTrackFinder — each owning its own preclustering, cluster building, and mode-specific logic. AITrackFinder owns ModelTrackFinding, the MAX_HITS_FOR_AI Distance fallback, and the greedy non-overlap selection; the "too many candidates" exit becomes TrackFinderResult.invalid() instead of a return-false from processDataEvent. ALERTEngine becomes a thin dispatcher: init() picks the strategy from ALERT.Mode via a switch, and processDataEvent calls findTracks(hits) once. Output is byte-identical to the prior refactor (same 9/999 sum_adc/dEdx precision drift, no new mismatches).
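The strategy layout described in this commit message can be sketched as follows. Only the names `TrackFinder`, `findTracks`, `TrackFinderResult`, and `invalid()` come from the text; the `Hit` placeholder, the `of` factory, the `valid` and `tracks` fields, and the stub finder are assumptions, and the real finders do far more than group all hits into one track.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TrackFinderSketch {

    /** Placeholder for the real AHDC hit class. */
    static final class Hit {}

    /** Result wrapper; invalid() replaces the old return-false exit. */
    static final class TrackFinderResult {
        final boolean valid;
        final List<List<Hit>> tracks;

        private TrackFinderResult(boolean valid, List<List<Hit>> tracks) {
            this.valid = valid;
            this.tracks = tracks;
        }

        static TrackFinderResult of(List<List<Hit>> tracks) {
            return new TrackFinderResult(true, tracks);
        }

        static TrackFinderResult invalid() {
            return new TrackFinderResult(false, Collections.emptyList());
        }
    }

    interface TrackFinder {
        TrackFinderResult findTracks(List<Hit> hits);
    }

    enum Mode { AI_Track_Finding, CV_Distance, CV_Hough }

    /** Stub: groups all hits into a single track. Not the real algorithm. */
    static TrackFinder stubFinder() {
        return hits -> {
            List<List<Hit>> tracks = new ArrayList<>();
            if (!hits.isEmpty()) tracks.add(new ArrayList<>(hits));
            return TrackFinderResult.of(tracks);
        };
    }

    /** What init() might do: pick the strategy once from the configured mode. */
    static TrackFinder select(Mode mode) {
        switch (mode) {
            case AI_Track_Finding: return stubFinder(); // real code: AITrackFinder
            case CV_Distance:      return stubFinder(); // real code: DistanceTrackFinder
            case CV_Hough:         return stubFinder(); // real code: HoughTrackFinder
            default: throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        TrackFinder finder = select(Mode.valueOf("CV_Hough")); // e.g. value of ALERT.Mode
        TrackFinderResult result = finder.findTracks(List.of(new Hit(), new Hit()));
        if (!result.valid) return; // skip the event, as with the "too many candidates" exit
        System.out.println(result.tracks.size());
    }
}
```

The point of the design is that the engine calls `findTracks` once and never branches on the mode again after `init()`.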
Introduce GNN_Track_Finding as a fourth track-finding mode alongside the renamed MLP_Track_Finding (was AI_Track_Finding), CV_Distance, and CV_Hough. The new path runs a GravNet edge scorer (TorchScript via DJL) on a per-event AHDC + ATOF hit graph, extracts tracks as connected components on edges with sigmoid score >= 0.1, then re-preclusters each surviving track's AHDC hits and pairs them into per-superlayer Clusters so the existing DOCA refinement + helix fit + Kalman stages consume them unchanged.
Selected via ALERT.Mode in YAML. MLP regression is bit-identical (same pre-existing AHDC::track sum_adc/dEdx precision drift); only COAT::config changes, reflecting the renamed mode.
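The extraction step (connected components over edges with score >= 0.1) can be illustrated with a small union-find sketch. This is not the code from the PR: the class and method names are made up, edges are given as an `[E][2]` array rather than the `[2, E]` tensor layout, and dropping single-node components is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Union-find connected components over edges that pass a score threshold. */
final class EdgeComponents {

    /** Finds the root of node i, halving the path as it goes. */
    static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    /**
     * @param edges  node index pairs, one row per edge
     * @param scores sigmoid score per edge, same order as edges
     * @return components with at least two nodes, ordered by smallest member
     */
    static List<List<Integer>> extract(int numNodes, int[][] edges, double[] scores,
                                       double threshold) {
        int[] parent = new int[numNodes];
        for (int i = 0; i < numNodes; i++) parent[i] = i;

        for (int e = 0; e < edges.length; e++) {
            if (scores[e] < threshold) continue; // edge rejected by the scorer
            int a = find(parent, edges[e][0]);
            int b = find(parent, edges[e][1]);
            if (a != b) parent[a] = b;
        }

        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int n = 0; n < numNodes; n++) {
            byRoot.computeIfAbsent(find(parent, n), k -> new ArrayList<>()).add(n);
        }
        List<List<Integer>> tracks = new ArrayList<>();
        for (List<Integer> component : byRoot.values()) {
            if (component.size() >= 2) tracks.add(component); // assumption: skip isolated nodes
        }
        return tracks;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {3, 4}, {2, 3}};
        double[] scores = {0.9, 0.5, 0.8, 0.05}; // toy scores; the last edge is below 0.1
        System.out.println(extract(5, edges, scores, 0.1)); // prints [[0, 1, 2], [3, 4]]
    }
}
```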
mathieuouillon force-pushed the rgl_trackfinding_gnn branch from dc61b1e to 36f990b (May 18, 2026 17:13)
…Track (fit result)

The AHDC track finders produced org.jlab.rec.ahdc.Track.Track, and the helix fit + Kalman filter then mutated that same object in place. One class was doing two jobs: a track-finder output (hits + clusters) and a fit result (vertex, momentum, chi2). A "Track" should mean the result of track fitting.

Split the conflated class into two:

- TrackCandidate: the track-finder output. Owns hits, clusters, and interclusters. Carries a CandidateType (AHDC_ONLY / AHDC_ATOF, plus a reserved AHDC_VERTEX) describing its specialization; the type is what will dictate how the candidate is fitted. The two old Track constructors (from Clusters, from Hits) move here.

- Track: the fit result, produced by fitting a TrackCandidate. Composes the candidate it was fitted from and adds the fitted vertex, momentum, chi2, path, dEdx, p_drift, sum_residuals. It stays a full facade: every accessor the old Track exposed still works, and the candidate-side ones are delegated to the underlying TrackCandidate.

All four finders (MLP / Distance / Hough / GNN) plus Distance and HoughTransform now produce TrackCandidate; TrackFinderResult wraps List<TrackCandidate>. ALERTEngine's fit stage turns each candidate into a Track, with a switch (CandidateType) dispatch seam.
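A stripped-down sketch of the split may help reviewers picture it. Only `TrackCandidate`, `Track`, `CandidateType`, and its three values are taken from the description above; the integer hit IDs, the single `chi2` field, the `fit` method, and the placeholder fit values are assumptions, not the PR's implementation.

```java
import java.util.List;

final class CandidateSplitSketch {

    enum CandidateType { AHDC_ONLY, AHDC_ATOF, AHDC_VERTEX }

    /** Track-finder output: owns the hits and records its specialization. */
    static final class TrackCandidate {
        private final List<Integer> hitIds; // placeholder for hits/clusters/interclusters
        private final CandidateType type;

        TrackCandidate(List<Integer> hitIds, CandidateType type) {
            this.hitIds = hitIds;
            this.type = type;
        }

        List<Integer> getHitIds() { return hitIds; }
        CandidateType getType() { return type; }
    }

    /** Fit result: composes the candidate and adds fitted quantities. */
    static final class Track {
        private final TrackCandidate candidate;
        private final double chi2;

        Track(TrackCandidate candidate, double chi2) {
            this.candidate = candidate;
            this.chi2 = chi2;
        }

        double getChi2() { return chi2; }
        TrackCandidate getCandidate() { return candidate; }

        /** Facade accessor, delegated to the underlying candidate. */
        List<Integer> getHitIds() { return candidate.getHitIds(); }
    }

    /** Dispatch seam: the candidate type selects how the fit is done. */
    static Track fit(TrackCandidate candidate) {
        switch (candidate.getType()) {
            case AHDC_ONLY:
                return new Track(candidate, Double.NaN); // placeholder for an AHDC-only fit
            case AHDC_ATOF:
                return new Track(candidate, Double.NaN); // placeholder for an AHDC+ATOF fit
            case AHDC_VERTEX:
                throw new UnsupportedOperationException("AHDC_VERTEX is reserved");
            default:
                throw new IllegalStateException("Unhandled type: " + candidate.getType());
        }
    }

    public static void main(String[] args) {
        TrackCandidate candidate = new TrackCandidate(List.of(1, 2, 3), CandidateType.AHDC_ONLY);
        Track track = fit(candidate);
        System.out.println(track.getHitIds()); // prints [1, 2, 3]
    }
}
```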
2 participants