Feature Request: px.distribution_drift, px.model_disagreement, px.quantile_evolution — ML Monitoring Primitives #5603

@samibahig

Feature Request: px.distribution_drift, px.model_disagreement, px.quantile_evolution — ML Monitoring Visualization Primitives

Hi Plotly maintainers,

I'd like to propose three new high-level functions for plotly.express that address a gap in ML observability and data science tooling. They are general-purpose visualization primitives, not domain-specific ones, and they cover patterns that recur in production ML workflows.

I've built working prototypes with full test coverage and a live interactive demo. Happy to take these through the full contribution process.


1. px.distribution_drift(reference, current, ...)

What it does: Compares two distributions (e.g. training vs. live inference data) with overlapping normalized histograms and a scalar KL-divergence annotation. When drift exceeds a configurable threshold, the current distribution is highlighted in a warning color.

fig = px.distribution_drift(
    reference,                # array-like: baseline samples
    current,                  # array-like: current samples
    bins=50,
    divergence_threshold=0.1,
    reference_name="Reference",
    current_name="Current",
    title=None,
    template=None,
)

Gap it fills: px.histogram with barmode="overlay" gets you close, but there's no built-in divergence scoring, threshold annotation, or drift-aware color logic. The proposed function turns a very common ML monitoring pattern into a one-liner.
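To make the pattern concrete, here is a minimal sketch of what such a function might compute. It is not the prototype's code. The helper names, the KL direction (current relative to reference), the `eps` smoothing, and the colors are all assumptions for illustration. It builds a plain figure dictionary, which `go.Figure` accepts, so it runs without plotly installed.

```python
import math


def shared_bins(reference, current, bins):
    """Return (start, width) of equal-width bins spanning both samples."""
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))
    width = (hi - lo) / bins
    return lo, (width if width > 0 else 1.0)


def normalized_hist(samples, start, width, bins):
    """Fraction of samples falling in each bin (sums to 1)."""
    counts = [0] * bins
    for v in samples:
        idx = min(int((v - start) / width), bins - 1)  # clamp the maximum value
        counts[idx] += 1
    return [c / len(samples) for c in counts]


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) in nats; eps keeps empty q bins finite (an assumption)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def distribution_drift_spec(reference, current, bins=50, divergence_threshold=0.1,
                            reference_name="Reference", current_name="Current"):
    """Hypothetical stand-in for the proposed function; returns a figure dict."""
    start, width = shared_bins(reference, current, bins)
    p = normalized_hist(current, start, width, bins)
    q = normalized_hist(reference, start, width, bins)
    kl = kl_divergence(p, q)
    drifted = kl > divergence_threshold
    xbins = {"start": start, "end": start + width * bins, "size": width}

    def trace(samples, name, color):
        return {"type": "histogram", "x": list(samples), "name": name,
                "histnorm": "probability", "xbins": xbins,
                "opacity": 0.6, "marker": {"color": color}}

    return {
        "data": [
            trace(reference, reference_name, "steelblue"),
            # Warning color only when the divergence crosses the threshold.
            trace(current, current_name, "crimson" if drifted else "seagreen"),
        ],
        "layout": {
            "barmode": "overlay",
            "annotations": [{"text": f"KL divergence: {kl:.3f}",
                             "xref": "paper", "yref": "paper",
                             "x": 0.99, "y": 0.99, "showarrow": False}],
        },
    }
```

Passing the returned dictionary to `plotly.graph_objects.Figure(...)` should render the overlay. Both histograms share the same `xbins` so the bars line up with the bins used for the divergence score.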


2. px.model_disagreement(x, y, predictions, ...)

What it does: Scatter plot of samples in a 2-D reduced feature space (UMAP/t-SNE/PCA), colored by ensemble variance. Samples above a disagreement threshold are fully opaque; others are dimmed. Includes a marginal histogram of variance scores.

fig = px.model_disagreement(
    x,                        # dim 1 of reduced space (per sample)
    y,                        # dim 2 of reduced space
    predictions,              # shape (n_samples, n_models) — ensemble preds
    threshold=0.05,
    colorscale="Viridis",
    title=None,
    template=None,
)

Gap it fills: px.scatter with color= handles the spatial encoding, but computing ensemble variance, dimming low-uncertainty samples, and combining with a marginal histogram requires significant boilerplate. This surfaces a critical active-learning and model-audit pattern as a single call.
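As a rough illustration of the boilerplate involved, the sketch below computes a per-sample variance across models and maps it to marker color and opacity. It is an assumption-laden stand-in, not the prototype: the function name, the use of population variance, and the `dim_opacity` value of 0.2 are hypothetical, and the marginal histogram is left out for brevity.

```python
from statistics import pvariance


def model_disagreement_spec(x, y, predictions, threshold=0.05,
                            colorscale="Viridis", dim_opacity=0.2):
    """Hypothetical stand-in for the proposed function; returns a figure dict."""
    # One score per sample: variance across that sample's model predictions.
    variance = [pvariance(row) for row in predictions]
    # Samples above the threshold stay fully opaque; the rest are dimmed.
    opacity = [1.0 if v > threshold else dim_opacity for v in variance]
    return {
        "data": [{
            "type": "scatter",
            "mode": "markers",
            "x": list(x),
            "y": list(y),
            "marker": {
                "color": variance,
                "colorscale": colorscale,
                "opacity": opacity,
                "showscale": True,
                "colorbar": {"title": {"text": "Ensemble variance"}},
            },
        }],
        "layout": {},
    }
```

Here `predictions` is a sequence of rows, one row per sample and one value per model, matching the `(n_samples, n_models)` shape in the proposed signature.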


3. px.quantile_evolution(timestamps, p50, *, p10, p90, p25, p75, ...)

What it does: Layered ribbon/band chart tracking P10–P90 and P25–P75 bands with P50 as a bold center line, over time or cohorts. Automatically annotates timestamps where the spread exceeds a volatility threshold.

fig = px.quantile_evolution(
    timestamps,               # x-axis: ISO strings, labels, or numbers
    p50=median_values,        # required: median (center line)
    p10=p10_values,           # optional: outer lower band
    p90=p90_values,           # optional: outer upper band
    p25=p25_values,           # optional: IQR lower bound
    p75=p75_values,           # optional: IQR upper bound
    mean=mean_values,
    show_mean=False,
    volatility_multiplier=1.5,
    title=None,
    template=None,
)

Gap it fills: Building fill-between ribbon plots requires chaining 5+ go.Scatter traces with careful fill="tonexty" sequencing. That sequencing is error-prone and hard to discover. A single px call makes this pattern accessible to the full data science audience.
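The trace sequencing in question can be sketched as follows. Each band is a pair of traces: an invisible lower line, then an upper line with `fill="tonexty"`, which fills down to the trace added just before it. The function name, the colors, and the volatility rule (flag points whose P10 to P90 spread exceeds `volatility_multiplier` times the median spread) are assumptions for illustration; the proposal does not spell out how the threshold is derived.

```python
from statistics import median


def quantile_evolution_spec(timestamps, p50, *, p10=None, p90=None,
                            p25=None, p75=None, volatility_multiplier=1.5):
    """Hypothetical stand-in for the proposed function; returns a figure dict."""
    x = list(timestamps)
    traces = []

    def band(lower, upper, name, fillcolor):
        # Lower edge: invisible line that only serves as the fill target.
        traces.append({"type": "scatter", "mode": "lines", "x": x,
                       "y": list(lower), "line": {"width": 0},
                       "showlegend": False, "hoverinfo": "skip"})
        # Upper edge: fills down to the trace added immediately before it.
        traces.append({"type": "scatter", "mode": "lines", "x": x,
                       "y": list(upper), "line": {"width": 0},
                       "fill": "tonexty", "fillcolor": fillcolor, "name": name})

    if p10 is not None and p90 is not None:
        band(p10, p90, "P10-P90", "rgba(31, 119, 180, 0.15)")
    if p25 is not None and p75 is not None:
        band(p25, p75, "P25-P75", "rgba(31, 119, 180, 0.30)")

    # Median goes last so it is drawn on top of both bands.
    traces.append({"type": "scatter", "mode": "lines", "x": x, "y": list(p50),
                   "name": "P50", "line": {"width": 3, "color": "rgb(31, 119, 180)"}})

    annotations = []
    if p10 is not None and p90 is not None:
        spreads = [hi - lo for lo, hi in zip(p10, p90)]
        limit = volatility_multiplier * median(spreads)  # assumed threshold rule
        annotations = [{"x": t, "y": hi, "text": "high spread", "showarrow": True}
                       for t, lo, hi in zip(x, p10, p90) if hi - lo > limit]

    return {"data": traces, "layout": {"annotations": annotations}}
```

Order matters here: swapping the two traces inside a band, or inserting another trace between them, changes what `tonexty` fills against.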


Why these belong in plotly.express

  • Each returns a go.Figure — composable and customizable after the fact
  • They accept lists, numpy arrays, and pandas Series via a thin _to_list wrapper
  • They use only existing trace types (go.Histogram, go.Scatter) — no new trace types needed
  • They follow the existing px design philosophy: one call, sensible defaults, everything overridable

Working prototype

Zero dependencies beyond plotly itself. 12-test suite passes on Python 3.9–3.12.

I'm ready to adapt the implementation to Plotly's internal conventions, add API docs, write tests in the tests/test_core/test_px/ format, and address any design feedback on the signatures.

Thank you for considering it!
