degree optimization by mj3cheun · Pull Request #1641 · graphistry/pygraphistry

mj3cheun · 2026-05-25T20:13:46Z

Summary

Refactor get_degrees / get_indegrees / get_outdegrees to reduce wide-nodes-frame passes and eliminate the get_outdegrees rename-trick. Targeted at large workloads where peak memory and distributed shuffle counts dominate, but improvements hold at all scales. Made using learnings from graphistry efforts.

Motivation

get_degrees previously composed get_indegrees().get_outdegrees(), each of which:

- did a groupby on edges to produce a small per-node aggregate, and
- merged that aggregate into the wide nodes frame.

That's two wide merges per call. get_outdegrees additionally built a throwaway Plottable over a renamed-columns edges copy.

At billion-row scale, the wide merge is the dominant cost (memory and distributed shuffle). Cutting two wide merges to one is the structural win.

Approach

Introduce _degree_agg(edges, key_col, out_name, node_id) returning a small (node_id, count) frame. Then:

get_indegrees(col) and get_outdegrees(col) route through a shared _single_direction_degree(key_col, col) helper. get_outdegrees no longer renames+rebuilds a Plottable; it groups by _source directly.
get_degrees calls _degree_agg twice to produce |V|-row aggregates, outer-merges them small × small, computes the degree column on the narrow frame, then performs a single merge into the nodes frame.

Net change per get_degrees call: two wide merges → one. On dask_cudf, distributed shuffles 2 → 1.
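The single-merge flow described above can be sketched in plain pandas. This is an illustration under stated assumptions, not the code in ComputeMixin.py: `degree_agg` and `get_degrees_sketch` are hypothetical stand-ins for `_degree_agg` and `get_degrees`, the column names (`id`, `src`, `dst`) are made up, and the real implementation routes merges through engine-aware helpers that are not reproduced here.

```python
import pandas as pd


def degree_agg(edges: pd.DataFrame, key_col: str, out_name: str, node_id: str) -> pd.DataFrame:
    """Return a narrow (node_id, out_name) frame with one row per endpoint value."""
    counts = edges[key_col].value_counts()  # rows per distinct non-null key
    return counts.rename_axis(node_id).reset_index(name=out_name)


def get_degrees_sketch(nodes, edges, node_id="id", src="src", dst="dst",
                       col="degree", degree_in="degree_in", degree_out="degree_out"):
    # Two small aggregates, each at most |V| rows.
    in_df = degree_agg(edges, dst, degree_in, node_id)
    out_df = degree_agg(edges, src, degree_out, node_id)

    # Small x small outer merge; nodes seen in only one direction get 0.
    narrow = in_df.merge(out_df, on=node_id, how="outer")
    narrow[[degree_in, degree_out]] = narrow[[degree_in, degree_out]].fillna(0).astype("int64")
    narrow[col] = narrow[degree_in] + narrow[degree_out]

    # The only merge that touches the wide nodes frame.
    out = nodes.merge(narrow, on=node_id, how="left")
    cols = [degree_in, degree_out, col]
    out[cols] = out[cols].fillna(0).astype("int64")  # isolated nodes
    return out


nodes = pd.DataFrame({"id": ["a", "b", "c", "d"], "label": ["w", "x", "y", "z"]})
edges = pd.DataFrame({"src": ["a", "a", "b", "a"], "dst": ["b", "c", "c", "c"]})
result = get_degrees_sketch(nodes, edges)
```

The degree column is computed on the narrow frame, so the wide frame is joined exactly once regardless of how many node columns it carries.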

Behavior changes

Two intentional, both narrow:

1. Null-endpoint counting (bug fix). Old get_indegrees used .agg({src: "count"}) (non-null count); now uses .size() (row count). An edge (null → b) now contributes 1 to b.degree_in (was 0). Symmetric for get_outdegrees. This aligns with the graph-theoretic definition of degree and fixes a latent inconsistency: b could appear as a materialized node yet report degree_in == 0 despite having a row in the edges table.
   - Only affects datasets with rows where exactly one of src/dst is null and the other is valid.
2. get_outdegrees row order. Now returns nodes in natural materialize_nodes order (consistent with get_indegrees and get_degrees). Master returned them in a reversed order that was an artifact of the rename trick. No documented contract pinned the old order.
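The null-endpoint difference is easy to reproduce in pandas. A minimal sketch, assuming hypothetical `src`/`dst` column names and a two-edge frame where one edge has a null source:

```python
import pandas as pd

# Edge 0 is (null -> b); edge 1 is (a -> c).
edges = pd.DataFrame({"src": [None, "a"], "dst": ["b", "c"]})

# Old style: count non-null src values per dst, so the null-source edge is ignored.
old_in = edges.groupby("dst").agg({"src": "count"})["src"]

# New style: count rows per dst, so every edge landing on a node is counted.
new_in = edges.groupby("dst").size()

print(old_in.to_dict())
print(new_in.to_dict())
```

Node b is the only one whose in-degree differs between the two aggregations; c has a valid source and is counted the same either way.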

Files

graphistry/compute/ComputeMixin.py — refactor (+~60 / -38 net)
graphistry/tests/test_compute.py — update test_degrees_out to natural ordering; add test_degrees_with_null_endpoint regression test for the null-endpoint counting change
CHANGELOG.md — entry under Development → Changed

Risk / caveats

Standalone get_indegrees / get_outdegrees can regress modestly in a narrow regime: very sparse graphs (|V| ≫ |E|) with very few node columns. get_degrees itself stays faster in that regime; the standalone-helper regression is bounded constant-factor and disappears as soon as edges or node columns scale up.
dask_cudf numbers unmeasured. Local benchmark is pandas single-node. The shuffle-reduction win should be the largest gain in production but is not measured here.
Null-endpoint behavior change is a fix, not a configurable flag. Datasets with single-null-endpoint edges will see degree numbers shift upward. Likely surfaces a latent bug rather than breaking an intentional contract.

Test plan

graphistry/tests/test_compute.py — passes (including updated test_degrees_out and new test_degrees_with_null_endpoint)
graphistry/tests/compute/test_get_degrees_cudf.py — passes (cuDF tests skipped locally; need TEST_CUDF=1 in CI to verify GPU path)
graphistry/tests/compute/test_id_column_restriction.py — passes (custom column name coverage)
dask_cudf integration — relies on existing safe_merge routing; no path-specific tests
Manual: re-run benchmark on a cuDF / dask_cudf billion-row dataset before claiming production magnitudes

mj3cheun · 2026-05-25T20:15:11Z

Benchmark results -- updated

Methodology. Two engines: pandas (single-node) and cuDF (single GPU, RAPIDS 26.02). 5 iterations per shape after warmup, median reported. Memory via tracemalloc (pandas) and rmm.statistics.peak_bytes (cuDF). Synthetic graphs with uniform random edges. Output equivalence verified (modulo row order) before timing every shape.
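For the pandas side, the measurement loop described above could look roughly like the following. The actual harness is not part of this PR page, so `bench` and its parameters are hypothetical; the sketch only uses the standard library (`time.perf_counter`, `tracemalloc`, `statistics.median`) and measures memory in a separate pass so tracing overhead does not distort the timings.

```python
import statistics
import time
import tracemalloc


def bench(fn, iterations=5, warmup=1):
    """Return (median seconds, peak traced bytes) for a zero-argument callable."""
    for _ in range(warmup):
        fn()  # warm caches and lazy imports before measuring

    times = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)

    # Separate pass for memory: tracemalloc slows allocation-heavy code.
    tracemalloc.start()
    try:
        fn()
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    return statistics.median(times), peak_bytes


# Placeholder workload; a real run would call g.get_degrees() on each shape.
median_s, peak_bytes = bench(lambda: [0] * 100_000)
```

On cuDF the memory pass would instead read the RMM statistics mentioned above, which this sketch does not attempt.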

Pandas get_degrees

Shape	Time (new / old)	Peak memory (new / old)
100K V × 1M E, 4 cols	1.94x faster	3.84x lower
100K V × 1M E, 20 cols (wide nodes)	1.76x faster	1.70x lower
500K V × 2M E, 4 cols	1.12–1.23x faster	1.20x lower
1M V × 100K E, 4 cols (sparse)	1.04x faster	1.11x lower
100K V × 5M E, 4 cols (edge-heavy)	2.17x faster	1.79x lower

cuDF get_degrees (10M+ scale, RTX-class GPU)

Shape	Time (new / old)	Peak memory (new / old)
1M V × 10M E, 4 cols	0.96x (par)	1.78x lower
1M V × 10M E, 20 cols	1.01x	2.07x lower
5M V × 25M E, 20 cols	1.03x	1.78x lower
10M V × 50M E, 4 cols	1.01x	1.77x lower
10M V × 50M E, 20 cols	1.03x	1.78x lower
10M V × 100M E, 4 cols	1.02x	1.78x lower

Standalone helpers

get_outdegrees improves consistently on both engines (the rename-trick Plottable rebuild is gone): pandas 1.20–2.39x faster + 1.27–3.17x lower memory; cuDF parity time + 1.77x lower memory.

get_indegrees is essentially equivalent to master on cuDF (within ±5% time, memory at parity). On pandas it regresses ~25–30% in the sparse narrow regime (1M V × 100K E), bounded constant-factor — get_degrees itself stays faster on the same shape.

Pattern

Time gains correlate with edge count. Edge-heavy shapes (5M edges) hit 2x+ on pandas. Sparse shapes track at parity.
Memory gains correlate with node-frame width. Wide nodes (20 cols) reach 1.70x–2.07x lower across both engines.
cuDF time is at parity because GPU groupby/merge are bandwidth-bound, not allocation-bound; the structural fix shows up as memory savings rather than wall-clock.
value_counts() is more efficient than groupby().size() on both engines for this per-direction aggregation — using it widened the wins across the board and eliminated a cuDF-specific get_indegrees memory regression seen with .size().
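For the per-direction aggregation, the two spellings produce the same counts, which is why one can be swapped for the other. A small pandas check, with assumed `src`/`dst` column names; it shows only equivalence of results, not the performance difference reported above:

```python
import pandas as pd

edges = pd.DataFrame({
    "src": ["a", "a", "b", None],
    "dst": ["b", "c", "c", "c"],
})

# Rows per destination, computed two ways.
by_size = edges.groupby("dst").size().sort_index()
by_value_counts = edges["dst"].value_counts().sort_index()

same = by_size.to_dict() == by_value_counts.to_dict()
print(same)
```

Both forms count the row with a null `src`, because they key only on `dst`.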

lmeyerov · 2026-05-25T21:19:45Z

+
+        if _safe_len(g._edges) == 0:
+            nodes_df = g_nodes._nodes
+            for c in (degree_in, degree_out, col):


Can collapse the assign, moving loop to inside

This helps in turn cut the layers of DFs that pandas makes

thanks for suggestion, done
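A sketch of what collapsing the assign might look like, assuming the original loop body called `assign` once per column and filled zeros for the empty-edges case (neither is visible in the diff excerpt above, and the column names are placeholders):

```python
import pandas as pd

nodes_df = pd.DataFrame({"id": ["a", "b"]})
degree_in, degree_out, col = "degree_in", "degree_out", "degree"

# Looped form (one intermediate DataFrame per iteration):
#   for c in (degree_in, degree_out, col):
#       nodes_df = nodes_df.assign(**{c: 0})

# Collapsed form: the loop moves inside a single assign call.
zero_degrees = nodes_df.assign(**{c: 0 for c in (degree_in, degree_out, col)})
```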

lmeyerov · 2026-05-25T21:21:31Z

@mj3cheun for typical case we care about, it would be cudf to benchmark, not pd, right?

mj3cheun · 2026-05-26T00:00:51Z

yup cudf

mj3cheun · 2026-05-26T01:01:51Z

thanks for pushing back. although the structural changes were good, there was a performance regression in cudf (and also in pandas, though not noticeable) due to the choice of method used in _degree_agg

this has been fixed and the benchmark results above are updated. both the pandas and cudf results are greatly improved

lmeyerov

Awesome, tx

Approved

If I was going to be paranoid, I might ask Claude to do test amplification around alias issues and indexes

mj3cheun added 2 commits May 25, 2026 15:35

optimize get degree calls

fd30b6c

update changelog

c04e6c3

mj3cheun changed the title ~~Dev/degree optimization~~ degree optimization May 25, 2026

run lint

9dc8569

mj3cheun requested a review from lmeyerov May 25, 2026 20:47

lmeyerov reviewed May 25, 2026

clean up assign loop + more tests

e02af13

fix _degree_agg performance regression

0269d98

lmeyerov approved these changes May 26, 2026

Merge branch 'master' into dev/degree-optimization

30f14d2

mj3cheun wants to merge 6 commits into master from dev/degree-optimization
