perf: Speed up vertex/edge sequence construction (~10× for large result sets)#2696
perf: Speed up vertex/edge sequence construction (~10× for large result sets)#2696schochastics wants to merge 12 commits into
Conversation
Vertex/edge sequences eagerly materialized their `names` (and edge `vnames`) character vectors at construction time. For functions that return many sequences (e.g. max_cliques returning tens of thousands), this allocates a named character vector per object even when the names are never read. Add an `igraph_lazy_names` ALTREP string class (src/rinterface_extra.c) that wraps the graph's full name vector plus a 1-based index, and only materializes the strings when an element is touched (printing, named indexing, as_ids()). Subsetting stays lazy via an Extract_subset method that composes indices. V() and E() now attach names through the lazy_index_names() helper instead of subsetting eagerly; downstream constructors (create_vs/simple_vs_index/...) inherit laziness through the Extract_subset path with no changes. Names resolve identically to before; vertex/edge sequence printing, named indexing and as_ids() are unaffected. Note: this removes the name-materialization penalty (named sequences now cost about the same as unnamed), but it is not by itself sufficient to close the gap to the numeric (return.vs.es = FALSE) path -- that is dominated by per-object graph-reference attachment in add_vses_graph_ref(), addressed separately. cpp11::cpp_register() also drops two dead registrations (Rx_igraph_weak_ref_run_finalizer, UUID_gen) that no R code .Call()s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a benchmark group exercising the sequence-construction path on named graphs, where building the `names`/`vnames` attribute and attaching the graph reference dominate: max_cliques (tens of thousands of vertex sequences), head_of over every edge, and V()/E() on a large named graph. These guard the lazy-names work and any future construction-path changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Constructing a vertex/edge sequence attached a graph reference per object via add_vses_graph_ref(), which calls .Call(Rx_igraph_copy_env), .Call(Rx_igraph_make_weak_ref) and .Call(Rx_igraph_get_graph_id) for every object. For functions returning many sequences (e.g. max_cliques returning tens of thousands) this dominated construction time -- profiling showed it was ~75% of the cost, far more than name building. The weak reference's key is the graph's environment, which is identical for every sequence of a graph (Rf_duplicate() is a no-op on an environment, so get_vs_ref() returns the same env each call). A single shared weak reference is therefore semantically identical to one per object: while the graph is alive the reference resolves, and once the graph is released the (weak) reference reports it gone -- verified that get_vs_graph() still returns NULL after rm(graph); gc(). simple_es_index() already propagates env/graph from its input, so edge construction only needed the redundant per-object add_vses_graph_ref() call removed. simple_vs_index() now propagates env/graph the same way, and unsafe_create_vs()/unsafe_create_es() rely on that propagation. The existing `lapply(res, unsafe_create_vs, graph = graph, verts = V(graph))` call sites then share the single weak reference built by V()/E() with no call-site or codegen changes. max_cliques(sample_gnp(500, 0.15)) on a named graph: ~338ms -> ~200ms. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two construction-cost reductions on top of the shared weak reference: * simple_vs_index() now sets names/class/env/graph in a single `attributes<-` call instead of separate `attr<-`/`class<-` assignments. Each incremental assignment shallow-copies the vector, and that copying dominated when building many sequences. * unsafe_create_vs() no longer routes through simple_vs_index(). Its `idx` are vertex IDs from C and `verts` is always the full V(graph), so `verts[idx]` just reproduces `idx`; we now use the IDs directly as the (integer) payload and take names lazily from `verts` via the ALTREP's O(1) subsetting, avoiding a full copy of V(graph) per object. Payload type (integer), names, NA handling and graph recovery are unchanged. max_cliques(sample_gnp(500, 0.15)) on a named graph: ~200ms -> ~140ms (was ~338ms before this branch). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Functions returning many vertex sequences used `lapply(res, unsafe_create_vs, graph = graph, verts = V(graph))`, which re-read the graph reference, graph id and name source from `verts` on every object. Add `create_vs_list(graph, idx_list)` which hoists all of that per-graph work out of the loop and builds each sequence with a single `attributes<-` and a directly-constructed lazy-names ALTREP, so per-object cost drops to ~an integer coercion plus attribute set. The `VERTEXSET_LIST` OUTCONV template in tools/stimulus/types-RR.yaml now emits `create_vs_list(graph, res)`; the generated R/aaa-*.R files are updated to match (27 sites), along with the hand-written call sites in cliques.R, cohesive.blocks.R, components.R, conversion.R, interface.R, paths.R and structural-properties.R (10 sites). `unsafe_create_vs()` stays as the single-object form. Edge sequences are left as-is: simple_es_index() already propagates the shared weak reference from a single E(graph), so unsafe_create_es() does not pay the per-object cost. A lazy `vnames` and an es batch form remain as follow-ups. max_cliques(sample_gnp(500, 0.15)) on a named graph: ~140ms -> ~100ms; ~338ms -> ~100ms over the whole branch (numeric floor ~25-30ms). Output, names, NA handling and graph recovery are unchanged; 0 test failures across the vertex-sequence-returning suites and aaa-auto. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a benchmark group exercising the three distinct wins on named graphs: * ego_order2_named -- batch construction of one vertex sequence per node (create_vs_list via neighborhood()). * max_cliques_sizes_named -- build many cliques but only read their sizes; with lazy names the name vectors are never materialized. * all_simple_paths_named -- another high-volume vertex-sequence-list path. * vs_subset_positional -- O(1) lazy subsetting of a large named vertex sequence (ALTREP Extract_subset) instead of copying a name subset. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
create_vs_list() drove the per-element work (lapply closure, as.integer, .Call to build the lazy-names ALTREP, and attributes<-) from R, repeated once per sequence -- the dominant cost when functions like max_cliques() return tens of thousands of igraph.vs objects. Move that loop into Rx_igraph_vs_list() in rinterface_extra.c, which reuses the file-static igraph_lazy_names ALTREP class directly. R now only builds V(graph) once (to mint the shared weak reference and graph id) and hands the pieces to C. Each element gets a fresh integer payload (guarded against coerceVector aliasing so the caller's vectors are never mutated), an optional lazy-names attribute, the shared env/graph attributes, and the igraph.vs class. max_cliques(sample_gnp(500, 0.15)) on a named graph (medians, 12 iters): vs: ~100ms -> 34.5ms (numeric floor ~30ms) Correctness verified: integer payloads with identical values/names, no names on unnamed graphs, NA/out-of-range IDs map to NA_STRING, inputs unmutated, and the shared weakref still releases the graph after rm()+gc(). Full suite passes (iterators, cliques, paths, components, flow, conversion, interface, structural-properties, cycles, attributes, aaa-auto); clean under gctorture. cpp11::cpp_register() added only the Rx_igraph_vs_list registration; it left the previously-noted unused entries untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 21f3d4b is merged into main:
|
R CMD check on R 4.5 flagged a non-API call to `DATAPTR`, used by the `igraph_lazy_names` ALTREP Dataptr method. Switch to `DATAPTR_RO` (and keep `DATAPTR_OR_NULL`), both of which are part of the API and are the sanctioned replacements; `DATAPTR` is on tools:::nonAPI, these are not. The materialized name cache is read-only for our purposes (as.vector(), coercion, printing), so a read-only data pointer is sufficient. Verified the compiled object no longer references the `DATAPTR` entry point and that names/as_ids/as.vector/subsetting/printing still work. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 79fadf6 is merged into main:
|
| consistent API. | ||
| } | ||
| \section{Related documentation in the C library}{ | ||
| \href{https://igraph.org/c/html/0.10.17/igraph-Structural.html#igraph_biconnected_components}{\code{biconnected_components()}}, \href{https://igraph.org/c/html/0.10.17/igraph-Basic.html#igraph_vcount}{\code{vcount()}}, \href{https://igraph.org/c/html/0.10.17/igraph-Basic.html#igraph_edges}{\code{edges()}}, \href{https://igraph.org/c/html/0.10.17/igraph-Basic.html#igraph_get_eids}{\code{get_eids()}}, \href{https://igraph.org/c/html/0.10.17/igraph-Basic.html#igraph_ecount}{\code{ecount()}} |
There was a problem hiding this comment.
should this be part of this diff?
There was a problem hiding this comment.
not sure why this is a diff here
There was a problem hiding this comment.
maybe it disappears when rerunning document
| benchmark_run( | ||
| expr_before_benchmark = { | ||
| library(igraph) | ||
| set.seed(42) |
There was a problem hiding this comment.
why don't we use withr here? I'm genuinely curious.
There was a problem hiding this comment.
Claude not a fan, maybe? I nudge to it 😆
There was a problem hiding this comment.
Ah I think because these scripts are supposed to be quite dependency free? All that is needed is touchstone and igraph
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 07598b9 is merged into main:
|
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if bd02062 is merged into main:
|
|
Conflicting now. |
part of #2695. I went down the ALTREP hole and while it is possible to get near the
return.vs.es = FALSEresult, it does add quite some complexity to the code. Lets discuss if we want this or not. maybe some stuff has to go to the C core?Summary
Functions that return many vertex sequences (
max_cliques(),cliques(),ego(), shortest-pathvpaths,all_simple_paths(), separators, cohesive blocks) were slow on named graphs because each returned sequence paid a fixed per-object tax. This PR removes that tax.max_cliques(sample_gnp(500, 0.15))on a named graph (37k cliques) locally:mainreturn.vs.es = FALSE)The default (vs objects) is now within ~4 ms of returning bare integer indices.
The finding (and how it differs from the issue)
The motivation in #1614 / #1652 assumed the cost was
create_vs()building thenamesattribute, and the fix was lazy/ALTREP names. Profiling says otherwise: lazy names was the smallest of the three contributors (~10%). The real costs were.Calls per sequence — ~75% of the time), andThe weak-reference turned out to be the same for every sequence of a graph (
Rf_duplicate()is a no-op on an environment), so one shared reference replaces one-per-object with no change in lifetime semantics. The rest came from building the payload directly, setting attributes in one pass, and finally constructing the whole list in a single C call.This reframes #1652. The performance gap that blocked deprecating
return.vs.es(the worst case stibu81 called "unusable" was an 8× gap) is now ~1.15×. For typical workloads it's negligible. The option's main remaining justification is gone, so it can be deprecated and left inert rather than urgently removed.What changed
names/edge-names are a lazy ALTREP string vector (igraph_lazy_names) that materializes only when read, and subsets in O(1).create_vs_list(), backed by a C constructor (Rx_igraph_vs_list) that builds the whole list in one pass.VERTEXSET_LISTstimulus template emitscreate_vs_list(); the generatedaaa-*.Rfiles were regenerated to match (verified by re-running stimulus).max_cliques_named,head_of_named,V_named/E_named) and the specific wins (ego_order2_named,max_cliques_sizes_named,all_simple_paths_named,vs_subset_positional).Correctness
Output is unchanged: integer payloads with the same values and names, no
namesattribute on unnamed graphs, NA/out-of-range IDs map toNA_STRING, and inputs are never mutated. The shared weak reference still releases the graph:get_vs_graph(seq)returnsNULLafterrm(graph); gc(). The full suite passes (iterators, cliques, paths, components, flow, conversion, interface, structural-properties, cycles, attributes, aaa-auto), and the C code is clean undergctorture(TRUE).Scope and follow-ups
Vertex sequences only. Edge sequences already share the weak reference via
simple_es_index(), but theirvnames("tail|head") are still built eagerly; making those lazy and adding a Ccreate_es_list()is left as a follow-up. Thereturn.vs.esoption and its codegen guards are untouched here — its deprecation is a separate decision, now much better supported.Relates to #1652; complements the consistency fix in #1614 / #2439.