refactor(alert): move AHDC track-finding from AHDCEngine to ALERTEngine #1242
mathieuouillon wants to merge 7 commits into
Note: AHDC::hits.adc is stored as an int (the calibrated ADC, truncated). Track.get_sum_adc rounds Hit.getADC() per hit, so when track-finding ran inside AHDCEngine it rounded and summed the original full-precision double. Reading the hits back from the bank here yields the already-truncated integer adc values, so sum_adc and dEdx in AHDC::track can drift by 0-1 counts per hit relative to the pre-refactor output (≈1-6 counts in sum_adc on ~1% of events). Eliminating the drift would require widening AHDC::hits.adc from I to F/D in the schema.
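To illustrate the drift described in the note, here is a minimal sketch. The class and method names are hypothetical, and it assumes the bank stores the ADC through a plain `(int)` cast (truncation) while the sum rounds each hit with `Math.round`; the real `Track.get_sum_adc` may differ in detail.

```java
/** Hypothetical illustration of rounding vs. truncation drift in sum_adc. */
final class AdcDriftSketch {
    /** Pre-refactor path (assumed): round each full-precision ADC, then sum. */
    static long sumRounded(double[] adc) {
        long sum = 0;
        for (double a : adc) {
            sum += Math.round(a);
        }
        return sum;
    }

    /** Post-refactor path (assumed): the bank holds (int) adc, which is then rounded and summed. */
    static long sumFromBank(double[] adc) {
        long sum = 0;
        for (double a : adc) {
            int stored = (int) a;          // truncation when writing AHDC::hits.adc
            sum += Math.round((double) stored);
        }
        return sum;
    }
}
```

Each hit whose fractional part is 0.5 or larger contributes one count of difference, which is consistent with the 0-1 per-hit drift quoted above.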
Compared
GNN finds ~3.9× more tracks overall. Per-event breakdown (1000 events)
26ba2cb to d09f721
Tested on 100 files from run 22991: no errors or warnings with CLARA.
187e956 to f21c966
f21c966 to dc61b1e
@@ -0,0 +1,34 @@
package org.jlab.rec.ahdc.TrackFinding;

import org.jlab.rec.ahdc.Track.Track;
The track finder doesn't find Tracks; it finds TrackCandidates.
A Track is something different (i.e., the result of track fitting). A candidate should have hits associated with it and can be specialized: AHDC only, AHDC+ATOF, AHDC+vertex, etc. The specialization then dictates the specifics of how the fitting is done.
Thanks for all the comments; I have started working on them.
This one is a very good idea.
My first draft addressing this comment is in commit 28dc412.
Verified byte-identical: recon-util on clas_021903 (config_p0v9, MLP_Track_Finding, 1000 events) before/after gives 0 mismatched rows / 0 mismatched entries across AHDC::track, AHDC::kftrack, AHDC::hits, AHDC::clusters, AHDC::interclusters, AHDC::preclusters, and AHDC::docaclusters.
We should have a skeleton engine ready to go, called ALERTEventBuilderEngine, as a way of developing reconstruction code that will eventually be merged into the event builder. It will be the last engine in the processing chain until it is all merged into the event builder.
It is a good idea, but it should probably go in a separate pull request.
Since Cluster is an overloaded term, it would be a good idea to rename this class AHDCCluster, or to be explicit with interfaces using clusters.
public enum ModeTrackFinding {
    AI_Track_Finding,
    MLP_Track_Finding,
This should be TrackFindingMode, not ModeTrackFinding.
    MLP_Track_Finding,
    CV_Distance,
    CV_Hough,
    GNN_Track_Finding,
These need to be documented -- add a short description of each. What does CV stand for?
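One way the requested documentation could look is sketched below. The name `TrackFindingMode` follows the rename suggested in this review; the descriptions are paraphrased from the commit messages in this pull request, and the `description()` accessor is a hypothetical addition. The expansion of "CV" is deliberately left open because it is not stated anywhere in this thread.

```java
/** Hypothetical documented version of the track-finding mode enum. */
enum TrackFindingMode {
    /** MLP-based finder (renamed from AI_Track_Finding), backed by ModelTrackFinding. */
    MLP_Track_Finding("MLP-based finder backed by ModelTrackFinding"),

    /** Distance-based finder. The meaning of "CV" should be filled in by the authors. */
    CV_Distance("Distance-based finder"),

    /** Hough-transform-based finder. The meaning of "CV" should be filled in by the authors. */
    CV_Hough("Hough-transform-based finder"),

    /** GravNet edge scorer on an AHDC + ATOF hit graph; tracks are connected components. */
    GNN_Track_Finding("GNN edge scorer with connected-component track extraction");

    private final String description;

    TrackFindingMode(String description) {
        this.description = description;
    }

    /** Short human-readable description, e.g. for log messages at engine init. */
    String description() {
        return description;
    }
}
```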
The TrackFinding methods probably shouldn't be located in rec/ahdc but rather in rec/alert.
The file should be named TrackFindingMode, and it should be located with TrackFinding, which should be in rec/alert, not rec/ahdc.
// --- ATOF nodes -------------------------------------------------------------
// Deduplicate by (sector, layer, component): inference-time variant of the
// Python dedup which also keys on track id (only needed at training time).
if (atofHitsBank != null) {
This leads me to think that the directory rec/ahdc/AI should not be in rec/ahdc, but in rec/alert.
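For readers following the quoted diff, the ATOF deduplication by (sector, layer, component) could be sketched roughly as follows. The `AtofHit` type, its fields, and the choice to keep the first occurrence of each key are assumptions made for illustration; the actual code reads from the ATOF hits bank and may resolve duplicates differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of inference-time ATOF hit deduplication. */
final class AtofDedupSketch {
    /** Minimal stand-in for one ATOF hit row (not the real bank layout). */
    static final class AtofHit {
        final int sector;
        final int layer;
        final int component;

        AtofHit(int sector, int layer, int component) {
            this.sector = sector;
            this.layer = layer;
            this.component = component;
        }

        /** Dedup key: track id is intentionally not part of it at inference time. */
        String key() {
            return sector + ":" + layer + ":" + component;
        }
    }

    /** Keeps the first hit seen for each (sector, layer, component) key, preserving order. */
    static List<AtofHit> dedup(List<AtofHit> hits) {
        Set<String> seen = new HashSet<>();
        List<AtofHit> unique = new ArrayList<>();
        for (AtofHit hit : hits) {
            if (seen.add(hit.key())) {
                unique.add(hit);
            }
        }
        return unique;
    }
}
```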
 * Exported forward signature (see SingleGraphEdgeScorer):
 *   forward(x: float32[N, 10], edge_index: int64[2, E], edge_attr: float32[E, 9])
 *     -> float32[E]  (sigmoid edge scores in [0, 1])
 */
Link to DJL's documentation somewhere.
/** Track extraction from per-edge scores via union-find connected components
 * at a single threshold. Ports the {@code method="cc"} branch of
 * {@code track-finding/gnn/inference.py::extract_tracks}, which is the
 * extractor that gnn/evaluate.py uses.
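The extraction described in this Javadoc can be sketched as a plain union-find over the edges that pass the threshold. This is an illustrative version, not the code in the pull request: the class and method names are hypothetical, node indices are `int` rather than `int64`, and dropping single-node components is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: connected components over edges with score >= threshold. */
final class EdgeScoreComponentsSketch {
    /** Finds the root of node i, halving the path on the way up. */
    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    /**
     * @param numNodes  number of hit nodes in the event graph
     * @param edgeIndex shape [2][E]: source and target node of each edge
     * @param scores    sigmoid edge scores in [0, 1], length E
     * @param threshold edges with score >= threshold are kept
     * @return node-index lists, one per component with at least two nodes
     */
    static List<List<Integer>> extract(int numNodes, int[][] edgeIndex,
                                       float[] scores, float threshold) {
        int[] parent = new int[numNodes];
        for (int i = 0; i < numNodes; i++) {
            parent[i] = i;
        }
        for (int e = 0; e < scores.length; e++) {
            if (scores[e] < threshold) {
                continue; // edge rejected by the single threshold
            }
            int a = find(parent, edgeIndex[0][e]);
            int b = find(parent, edgeIndex[1][e]);
            if (a != b) {
                parent[b] = a; // union the two components
            }
        }
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int n = 0; n < numNodes; n++) {
            groups.computeIfAbsent(find(parent, n), k -> new ArrayList<>()).add(n);
        }
        List<List<Integer>> tracks = new ArrayList<>();
        for (List<Integer> group : groups.values()) {
            if (group.size() >= 2) { // assumption: isolated nodes are not tracks
                tracks.add(group);
            }
        }
        return tracks;
    }
}
```

With the 0.1 threshold mentioned in the commit message, a chain of edges scored 0.9, 0.5, 0.05, 0.2 over nodes 0 to 4 splits into the two components {0, 1, 2} and {3, 4}.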
AHDCEngine now only reads AHDC::adc, applies calibration via HitReader, and writes AHDC::hits. The full track-finding pipeline (preclustering, AI/CV_Distance/CV_Hough finder, DOCA refinement, helix fit) runs in ALERTEngine on top of AHDC::hits + ATOF::hits/clusters, alongside the existing projection / matching / prePID / Kalman steps. The track finder is selected via the ALERT.Mode YAML key (was AHDC.Mode); ModelTrackFinding only loads when AI_Track_Finding is selected. The ATOF::tdc gate now fires after the AHDC pipeline so events without ATOF still get their AHDC::* banks, matching the pre-refactor behavior.
…erface
Introduce TrackFinder { findTracks(hits) -> TrackFinderResult } with three implementations — AITrackFinder, DistanceTrackFinder, HoughTrackFinder — each owning its own preclustering, cluster building, and mode-specific logic. AITrackFinder owns ModelTrackFinding, the MAX_HITS_FOR_AI Distance fallback, and the greedy non-overlap selection; the "too many candidates" exit becomes TrackFinderResult.invalid() instead of a return-false from processDataEvent. ALERTEngine becomes a thin dispatcher: init() picks the strategy from ALERT.Mode via a switch, and processDataEvent calls findTracks(hits) once. Output is byte-identical to the prior refactor (same 9/999 sum_adc/dEdx precision drift, no new mismatches).
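The strategy layout described in this commit message might look roughly like the sketch below. Only the names `TrackFinder`, `findTracks`, `TrackFinderResult`, `invalid()`, `MAX_HITS_FOR_AI`, and the three finder class names come from the message; hits are reduced to integer ids, the finder bodies are placeholders, the value of `MAX_HITS_FOR_AI` is invented, and `select` is a hypothetical stand-in for the switch in `init()`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the TrackFinder strategy and its dispatch. */
final class TrackFinderSketch {
    static final int MAX_HITS_FOR_AI = 3; // placeholder value, not the real constant

    /** Either a list of candidates (hit-id lists here) or an explicit invalid marker. */
    static final class TrackFinderResult {
        private final boolean valid;
        private final List<List<Integer>> candidates;

        private TrackFinderResult(boolean valid, List<List<Integer>> candidates) {
            this.valid = valid;
            this.candidates = candidates;
        }
        static TrackFinderResult of(List<List<Integer>> candidates) {
            return new TrackFinderResult(true, candidates);
        }
        static TrackFinderResult invalid() {
            return new TrackFinderResult(false, Collections.<List<Integer>>emptyList());
        }
        boolean isValid() { return valid; }
        List<List<Integer>> candidates() { return candidates; }
    }

    interface TrackFinder {
        TrackFinderResult findTracks(List<Integer> hits);
    }

    /** Placeholder body: groups all hits into a single candidate. */
    static final class DistanceTrackFinder implements TrackFinder {
        @Override public TrackFinderResult findTracks(List<Integer> hits) {
            List<List<Integer>> out = new ArrayList<>();
            out.add(new ArrayList<>(hits));
            return TrackFinderResult.of(out);
        }
    }

    /** Placeholder body: same grouping as the distance sketch. */
    static final class HoughTrackFinder implements TrackFinder {
        @Override public TrackFinderResult findTracks(List<Integer> hits) {
            return new DistanceTrackFinder().findTracks(hits);
        }
    }

    /** Falls back to the distance finder when the event has too many hits. */
    static final class AITrackFinder implements TrackFinder {
        private final TrackFinder fallback = new DistanceTrackFinder();

        @Override public TrackFinderResult findTracks(List<Integer> hits) {
            if (hits.size() > MAX_HITS_FOR_AI) {
                return fallback.findTracks(hits);
            }
            List<List<Integer>> out = new ArrayList<>();
            for (Integer hit : hits) {
                out.add(Collections.singletonList(hit)); // placeholder "model" output
            }
            return TrackFinderResult.of(out);
        }
    }

    /** Picks the strategy from the mode string, as init() is described to do. */
    static TrackFinder select(String mode) {
        switch (mode) {
            case "AI_Track_Finding": return new AITrackFinder();
            case "CV_Distance":      return new DistanceTrackFinder();
            case "CV_Hough":         return new HoughTrackFinder();
            default: throw new IllegalArgumentException("Unknown ALERT.Mode: " + mode);
        }
    }
}
```

The engine side then reduces to one `findTracks(hits)` call followed by an `isValid()` check, which replaces the earlier early return from the event-processing method.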
Introduce GNN_Track_Finding as a fourth track-finding mode alongside the renamed MLP_Track_Finding (was AI_Track_Finding), CV_Distance, and CV_Hough. The new path runs a GravNet edge scorer (TorchScript via DJL) on a per-event AHDC + ATOF hit graph, extracts tracks as connected components on edges with sigmoid score >= 0.1, then re-preclusters each surviving track's AHDC hits and pairs them into per-superlayer Clusters so the existing DOCA refinement + helix fit + Kalman stages consume them unchanged. Selected via ALERT.Mode in YAML. MLP regression is bit-identical (same pre-existing AHDC::track sum_adc/dEdx precision drift); only COAT::config changes, reflecting the renamed mode.
…tching prediction handling
dc61b1e to 36f990b
…Track (fit result)

The AHDC track finders produced org.jlab.rec.ahdc.Track.Track, and the helix fit + Kalman filter then mutated that same object in place. One class was doing two jobs: a track-finder output (hits + clusters) and a fit result (vertex, momentum, chi2). A "Track" should mean the result of track fitting.

Split the conflated class into two:
- TrackCandidate: the track-finder output. Owns hits, clusters, and interclusters. Carries a CandidateType (AHDC_ONLY / AHDC_ATOF, plus a reserved AHDC_VERTEX) describing its specialization; the type is what will dictate how the candidate is fitted. The two old Track constructors (from Clusters, from Hits) move here.
- Track: the fit result, produced by fitting a TrackCandidate. Composes the candidate it was fitted from and adds the fitted vertex, momentum, chi2, path, dEdx, p_drift, sum_residuals. It stays a full facade: every accessor the old Track exposed still works, and the candidate-side ones are delegated to the underlying TrackCandidate.

All four finders (MLP / Distance / Hough / GNN) plus Distance and HoughTransform now produce TrackCandidate; TrackFinderResult wraps List<TrackCandidate>. ALERTEngine's fit stage turns each candidate into a Track, with a switch (CandidateType) dispatch seam.
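A compact sketch of the split described in this commit message follows. The names `TrackCandidate`, `Track`, `CandidateType`, and its three constants come from the message; everything else (integer hit ids, the `fit` method, the placeholder chi2, and treating AHDC_VERTEX as unsupported) is assumed for illustration and is not the actual fitting code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the TrackCandidate / Track split. */
final class TrackSplitSketch {
    enum CandidateType { AHDC_ONLY, AHDC_ATOF, AHDC_VERTEX }

    /** Track-finder output: owns the hits and records its specialization. */
    static final class TrackCandidate {
        private final CandidateType type;
        private final List<Integer> hitIds;

        TrackCandidate(CandidateType type, List<Integer> hitIds) {
            this.type = type;
            this.hitIds = Collections.unmodifiableList(new ArrayList<>(hitIds));
        }
        CandidateType getType() { return type; }
        List<Integer> getHitIds() { return hitIds; }
    }

    /** Fit result: composes the candidate and adds fitted quantities. */
    static final class Track {
        private final TrackCandidate candidate;
        private final double chi2;

        Track(TrackCandidate candidate, double chi2) {
            this.candidate = candidate;
            this.chi2 = chi2;
        }
        TrackCandidate getCandidate() { return candidate; }
        double getChi2() { return chi2; }

        /** Facade accessor: delegated to the underlying candidate. */
        List<Integer> getHitIds() { return candidate.getHitIds(); }
    }

    /** Dispatch seam: the candidate type decides how the fit is done. */
    static Track fit(TrackCandidate candidate) {
        switch (candidate.getType()) {
            case AHDC_ONLY:
            case AHDC_ATOF:
                return new Track(candidate, 0.0); // placeholder fit, no real chi2
            case AHDC_VERTEX:
                throw new UnsupportedOperationException("AHDC_VERTEX is reserved");
            default:
                throw new IllegalStateException("Unhandled type: " + candidate.getType());
        }
    }
}
```

Because `Track` holds the candidate instead of extending it, code that only needs hits keeps working through the delegated accessors, while fit outputs stay off the finder-side object.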