Profile-likelihood analysis (D2D-style identifiability + confidence intervals) #446

@wshlavacek

Summary

Add profile-likelihood analysis to PyBNF for parameter identifiability and confidence intervals,
following the Data2Dynamics (D2D) methodology (Raue/Schilling/Timmer et al.). Profile likelihood is
the D2D-preferred uncertainty-quantification method and is robust to non-Gaussian, non-elliptical
confidence regions in a way the Fisher-information / covariance approximation is not.

Depends on the gradient plumbing (#385) and is a natural companion to the gradient-based optimizers
(#386); it reuses the same residual Jacobian and local-optimizer machinery.

What profile likelihood does

For each fitted parameter θ_k, fix it to a grid of values around the optimum and re-optimize all
the other parameters at each grid point, tracing the profile of the objective
χ²_PL(θ_k) = min_{θ_{j≠k}} χ²(θ). The confidence interval for θ_k is the range where the profile
stays below χ²_min + Δχ², where Δχ² is the χ² quantile with 1 dof at the chosen confidence level
(about 3.84 at 95%). Perfectly flat profiles diagnose structural non-identifiability; profiles that
have a minimum but stay below the threshold in at least one direction diagnose practical
non-identifiability.
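
As a minimal, stdlib-only sketch of the definition above (not PyBNF or D2D code), the following profiles the slope of a toy straight-line model. The data, the model, and every function name are illustrative assumptions. The nuisance parameter is re-optimized in closed form here, whereas the real driver would call a local optimizer.

```python
from statistics import NormalDist, fmean

def delta_chi2_threshold(confidence):
    """Chi-square quantile with 1 dof, computed as a squared standard-normal quantile."""
    z = NormalDist().inv_cdf(0.5 * (1.0 + confidence))
    return z * z

# Toy data for y = a + b*x with unit measurement error (illustrative values only).
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.1, 2.9, 5.2, 6.8, 9.0]
SIGMA = 1.0

def chi2(a, b):
    return sum(((y - (a + b * x)) / SIGMA) ** 2 for x, y in zip(X, Y))

def best_fit():
    """Ordinary least-squares optimum (a*, b*) of the toy model."""
    xbar, ybar = fmean(X), fmean(Y)
    sxx = sum((x - xbar) ** 2 for x in X)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
    b = sxy / sxx
    return ybar - b * xbar, b

def profile_chi2_b(b):
    """chi2_PL(b): hold b fixed and re-optimize the nuisance parameter a."""
    a_hat = fmean(y - b * x for x, y in zip(X, Y))  # closed-form inner minimum
    return chi2(a_hat, b)

a_star, b_star = best_fit()
chi2_min = chi2(a_star, b_star)
threshold = delta_chi2_threshold(0.95)

# Fixed grid around the optimum; the proposal itself calls for adaptive steps.
grid = [b_star + 0.05 * i for i in range(-30, 31)]
profile = [(b, profile_chi2_b(b) - chi2_min) for b in grid]
inside = [b for b, delta in profile if delta <= threshold]
print("95% CI for b (grid resolution):", min(inside), max(inside))
```

Because this toy profile is exactly quadratic, the grid-resolved interval is symmetric about the optimum; the profiles of interest in this issue generally are not.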

examples/becker_d2d_gradient/ already implements this end-to-end against BNGsim (one profile curve
per fitted parameter on the Becker model) and is the reference for the expected outputs and plots.

Current state

Scope

  1. Profile driver. Given a completed fit (optimum θ*, bounds, objective), sweep each requested
    parameter across a grid (adaptive step in log10 space, D2D-style) and re-optimize the
    remaining parameters at each point using a local optimizer from #386 (gradient-based local
    optimizers, D2D-style: TRF/Levenberg-Marquardt + L-BFGS-B, in the async Algorithm loop),
    warm-started from the previous grid point. Stop each direction when the profile crosses the
    Δχ² threshold or hits a bound.
  2. Confidence intervals + identifiability classification. From each profile, report the CI at a
    configurable confidence level and classify the parameter as identifiable / practically
    non-identifiable / structurally non-identifiable.
  3. Outputs + plots. Per-parameter profile curves (Δχ² vs θ_k), threshold lines, and a tabular
    CI / classification summary; serialized so a run can be resumed/extended.
  4. Config + driver integration. A .conf switch to request profiles after a fit completes
    (which parameters, confidence level, grid density, max re-opt evals per point), gated on the
    gradient path being available (BNGSIM_HAS_OUTPUT_SENS).
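
Items 1 and 2 could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed PyBNF API: `reoptimize` is a hypothetical callback standing in for the #386 local optimizer, the step is fixed rather than adaptive, the parameter is assumed to be already in its sampling (e.g. log10) coordinate, and the flatness tolerance is an arbitrary heuristic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Branch:
    points: List[Tuple[float, float]]  # (theta_k, delta_chi2) along one direction
    crossing: Optional[float]          # interpolated threshold crossing; None if open
    hit_bound: bool

def trace_branch(reoptimize, theta_star, nuisance_star, chi2_min,
                 direction, step, bound, threshold, max_points=200):
    """Walk from the optimum in one direction until the threshold or the bound.

    reoptimize(theta_k, warm_start) -> (chi2, nuisance) is a hypothetical callback.
    """
    theta, nuisance = theta_star, list(nuisance_star)
    prev = (theta_star, 0.0)
    points = [prev]
    for _ in range(max_points):
        theta_next = theta + direction * step
        hit_bound = (theta_next - bound) * direction >= 0
        if hit_bound:
            theta_next = bound
        # Warm start: reuse the nuisance values from the previous grid point.
        chi2, nuisance = reoptimize(theta_next, nuisance)
        delta = chi2 - chi2_min
        points.append((theta_next, delta))
        if delta >= threshold:
            t0, d0 = prev  # linear interpolation between the bracketing points
            crossing = t0 + (threshold - d0) * (theta_next - t0) / (delta - d0)
            return Branch(points, crossing, hit_bound)
        if hit_bound:
            return Branch(points, None, True)  # open (one-sided) interval
        prev, theta = (theta_next, delta), theta_next
    return Branch(points, None, False)

def classify(lower, upper, flat_tol=1e-6):
    deltas = [d for _, d in lower.points + upper.points]
    if max(deltas) <= flat_tol:
        return "structurally non-identifiable"
    if lower.crossing is None or upper.crossing is None:
        return "practically non-identifiable"
    return "identifiable"

def profile(reoptimize, theta_star, bounds, threshold, step=0.1, chi2_min=0.0):
    lower = trace_branch(reoptimize, theta_star, [], chi2_min, -1, step, bounds[0], threshold)
    upper = trace_branch(reoptimize, theta_star, [], chi2_min, +1, step, bounds[1], threshold)
    return (lower.crossing, upper.crossing), classify(lower, upper)

# Toy objectives with no nuisance parameters, so "re-optimization" is trivial.
THRESHOLD = 3.84
ci_a, label_a = profile(lambda t, n: (4.0 * (t - 1.0) ** 2, n), 1.0, (-3.0, 5.0), THRESHOLD)
ci_b, label_b = profile(lambda t, n: (0.0, n), 1.0, (-3.0, 5.0), THRESHOLD)
ci_c, label_c = profile(lambda t, n: (4.0 * min(t - 1.0, 0.0) ** 2, n), 1.0, (-3.0, 5.0), THRESHOLD)
print(label_a, label_b, label_c)
```

The three toy objectives exercise each class: a quadratic bowl (both crossings found), a constant (flat to both bounds), and a one-sided wall (upper interval left open and reported as `None` rather than clamped to the bound).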

Edge cases and design considerations

  • Cost. Profiling is O(n_params × grid_points) full re-optimizations; warm-starting from the
    neighboring grid point and capping per-point evals keeps it tractable. Parallelize across
    parameters (independent) through the existing scheduler.
  • Bounds. A profile that reaches a bound without crossing the threshold is an open (one-sided) CI
    — report it as such rather than clamping silently.
  • Parameter scale. Profile in the same sampling space the optimizer uses (priors/scale.py,
    ADR-0029); thresholds are on the objective, not the transformed parameter.
  • Relation to the #202-style FIM (issue #202, "Save best gdat/scan files"). The
    Fisher-information/covariance estimate (a separate, linearized approximation) can seed
    initial profile step sizes but does not replace the profile.
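
One way the FIM could seed the initial step, sketched under assumptions: in the quadratic (Gauss-Newton) approximation with FIM = JᵀJ for σ-scaled residuals, the profile satisfies Δχ² ≈ δθ_k² / C_kk with C = FIM⁻¹, so a step targeting a chosen fraction of the threshold follows directly. The function names and the 10% target fraction are hypothetical, and the 2×2 inverse is only there to keep the example dependency-free.

```python
import math

def invert_2x2(m):
    """Inverse of a 2x2 matrix; a singular FIM has no covariance estimate."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular FIM: no linearized covariance available")
    return [[d / det, -b / det], [-c / det, a / det]]

def initial_profile_step(fim, k, threshold, fraction=0.1):
    """Step in theta_k expected to raise the profile by fraction * threshold.

    Uses delta_chi2 ~ step**2 / C_kk, with C the inverse of the FIM.
    """
    cov = invert_2x2(fim)
    return math.sqrt(fraction * threshold * cov[k][k])

# FIM = J^T J for y = a + b*x sampled at x = 0..4 with unit error:
# [[n, sum(x)], [sum(x), sum(x^2)]] = [[5, 10], [10, 30]].
fim = [[5.0, 10.0], [10.0, 30.0]]
step_b = initial_profile_step(fim, k=1, threshold=3.8415)
print("initial step for b:", step_b)
```

This only seeds the first step; the profile itself still has to be traced, since the quadratic approximation is exactly what profile likelihood is meant to avoid relying on.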

Deliverables

  • Profile-likelihood driver re-optimizing non-profiled parameters per grid point via a local
    optimizer from #386 (gradient-based local optimizers, D2D-style: TRF/Levenberg-Marquardt +
    L-BFGS-B, in the async Algorithm loop), warm-started, threshold/bound-terminated.
  • CI extraction + identifiability classification per parameter.
  • Profile-curve plots + a CI/classification summary table; serialized, resumable.
  • .conf integration (parameters, confidence level, grid density, per-point eval cap), gated on
    the gradient path.
  • Parallelization across parameters through the existing scheduler.
  • Tests: recover known CIs on a small identifiable model; correctly flag a deliberately
    non-identifiable parameter; agreement with examples/becker_d2d_gradient/ on a small subset.
  • User-guide section on running and interpreting profiles.

Reference

A. Raue et al., Structural and practical identifiability analysis of partially observed dynamical
models by exploiting the profile likelihood, Bioinformatics 25(15):1923–1929 (2009).
