Split generated R/aaa-auto.R into per-category R/aaa-<cat>.R files #2621

Open
schochastics wants to merge 2 commits into main from categories

@schochastics
Contributor

Summary

Stimulus generates a single ~14,800-line R/aaa-auto.R containing every C igraph wrapper. This PR introduces a categorization layer that splits that monolithic output into 26 per-category files (R/aaa-basicigraph.R, R/aaa-cliques.R, …, R/aaa-visitors.R), so the layout of the generated wrappers mirrors how igraph groups functions in its reference manual. Subcategories appear inside each file as banner comments.

Why do this? Today a developer grepping for bfs_impl lands in the middle of a 14.8k-line file with no navigational cues; afterwards they land at the top of R/aaa-visitors.R under the # ==== breadth-first-search ==== banner.

The split happens as a post-processing step on the stimulus output; stimulus itself is unchanged (it doesn't support multi-file output natively). A new tools/aaa-categories.yaml is the single source of truth for which function goes where, and two new tools keep everything reconciled.

What's in the diff

| Change | Purpose |
|---|---|
| tools/aaa-categories.yaml (new) | Authoritative map: category → subcategory → list of igraph_* C functions. 491 entries across 26 categories, covering every R_igraph_* symbol .Call()'d in the generated wrappers. |
| tools/rebuild-cats.R (new) | Reconciles the YAML against whatever R/aaa-*.R files are present. Idempotent; fails loudly if an ungrouped function appears in the generated wrappers. |
| tools/split-aaa-auto.R (new) | Parses the stimulus output, looks up each _impl wrapper's category, and writes one file per category with subcategory banners. Preserves each wrapper's source byte-for-byte. |
| Makefile-cigraph | Stimulus now writes to .build/aaa-auto.R (ignored), and the split script produces the in-repo R/aaa-<cat>.R files. New phony target r_wrappers covers the full pipeline. |
| R/aaa-auto.R → R/aaa-<cat>.R × 26 | The actual split output. All existing .Call() semantics are unchanged; it is a purely organizational change. |
| .gitignore / .Rbuildignore | Ignore .build/. |
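To illustrate the split step, here is a minimal Python sketch of the logic described for tools/split-aaa-auto.R. It is not the script itself (that is written in R and not reproduced here). The two-entry category map, the tiny stand-in for the stimulus output, and all helper names are assumptions made for the example; real wrappers also contain more than one .Call(), which this sketch ignores by reading only the first one.

```python
import re
from collections import defaultdict

# Hypothetical miniature of the category map: category -> subcategory -> C functions.
CATEGORIES = {
    "visitors": {"breadth-first-search": ["igraph_bfs"]},
    "cliques": {"cliques": ["igraph_cliques"]},
}

# Hypothetical stand-in for the stimulus output: two generated wrappers.
GENERATED = '''bfs_impl <- function(graph) {
  .Call(R_igraph_bfs, graph)
}

cliques_impl <- function(graph) {
  .Call(R_igraph_cliques, graph)
}
'''

def split_wrappers(source):
    """Cut the source at each `<name>_impl <- function` line, keeping the text verbatim."""
    starts = [m.start() for m in re.finditer(r"^\w+_impl <- function", source, flags=re.M)]
    bounds = starts + [len(source)]
    return [source[a:b] for a, b in zip(bounds, bounds[1:])]

def locate(wrapper, categories):
    """Return (category, subcategory) for the first C function the wrapper calls."""
    cfun = re.search(r"\.Call\(R_(igraph_\w+)", wrapper).group(1)
    for cat, subcats in categories.items():
        for sub, funs in subcats.items():
            if cfun in funs:
                return cat, sub
    raise KeyError(f"uncategorized function: {cfun}")

def render_files(source, categories):
    """Group wrappers by category and emit one file body per category with banners."""
    grouped = defaultdict(lambda: defaultdict(list))
    for wrapper in split_wrappers(source):
        cat, sub = locate(wrapper, categories)
        grouped[cat][sub].append(wrapper)
    files = {}
    for cat, subs in grouped.items():
        parts = []
        for sub, wrappers in subs.items():
            parts.append(f"# ==== {sub} ====\n\n")  # subcategory banner
            parts.extend(wrappers)                   # wrapper text is untouched
        files[f"R/aaa-{cat}.R"] = "".join(parts)
    return files

files = render_files(GENERATED, CATEGORIES)
```

Because the wrappers are sliced out of the source rather than re-printed, concatenating the slices reproduces the input exactly, which is one way to get the byte-for-byte property mentioned in the table.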

The closure-normalization rule

Nine .Call() targets in the generated wrappers end in _closure (e.g. R_igraph_bfs_closure). These are R-binding helpers defined in src/rcallback.c that wrap an underlying C function with SEXP-callback support; they are not standalone C library functions. rebuild-cats.R encodes the 9-entry whitelist and maps them back to their semantic names (e.g. igraph_bfs_closure → igraph_bfs) so each wrapper lands where a reader would expect. R_igraph_transitive_closure is not affected: there, "closure" is a graph-theory term, not a wrapper suffix.
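The rule can be sketched in a few lines of Python. This is an illustration of the idea, not the code in rebuild-cats.R; the function name is hypothetical, and the whitelist below holds only the one entry named in this description rather than all nine.

```python
# Only igraph_bfs_closure is named in this PR description; the real whitelist
# has nine entries, which are not reproduced here.
CLOSURE_WHITELIST = {"igraph_bfs_closure"}

SUFFIX = "_closure"

def normalize(symbol):
    """Map an R_igraph_* .Call() target to the igraph_* name used for categorization."""
    name = symbol[2:] if symbol.startswith("R_") else symbol
    if name in CLOSURE_WHITELIST:
        return name[: -len(SUFFIX)]  # strip the wrapper suffix
    return name                      # everything else keeps its name

print(normalize("R_igraph_bfs_closure"))         # igraph_bfs
print(normalize("R_igraph_transitive_closure"))  # igraph_transitive_closure
```

Using an explicit whitelist instead of stripping every trailing _closure is what keeps igraph_transitive_closure intact.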

Categorization highlights

The initial YAML layout mirrored igraph's legacy docbook sections. Several cleanups were applied:

  • Retired the undocumented category — all 8 entries moved to real homes:
    • igraph_residual_graph, igraph_reverse_residual_graph → flows/maximum-flows
    • igraph_hrg_sample_many → hrg/hrg-sampling
    • igraph_has_attribute_table, igraph_finalizer → nongraph/internal
    • igraph_eigen_adjacency → structural/spectral-properties
    • igraph_eigen_matrix, igraph_eigen_matrix_symmetric, igraph_solve_lsap → nongraph/linear-algebra (new subcategory)
  • Typo/case fixes: regular-structre-generators → regular-structure-generators; Sparsifiers → sparsifiers; motifs/uncategorized → motifs/graph-census.
  • Semantic relocations: igraph_transitive_closure and igraph_transitive_closure_dag moved from structural/graph-components → operators/miscellaneous-operators (they produce a derived graph, not component analysis).
  • Split oversized buckets:
    • structural/shortest-path-related-functions (34 entries) → distances-and-metrics (22) + shortest-paths (12).
    • structural/other-operations (11) → matrix-representations (5) + mutual-edges (3) + summary-statistics (3).

Developer workflow

After a stimulus upgrade, or when a new igraph C function lands upstream:

```sh
make -f Makefile-cigraph r_wrappers   # regenerates the split files
Rscript tools/rebuild-cats.R          # validates/updates the categories YAML
```

The second step fails loudly with the exact names that need adding if aaa-categories.yaml drifts from the generated wrappers.
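The drift check itself amounts to a set difference. The Python sketch below shows the idea under stated assumptions: the category map is already loaded into a nested dict, the called function names are already normalized, and the function name and error wording are hypothetical rather than taken from rebuild-cats.R.

```python
def check_coverage(categories, called):
    """Raise with the exact missing names if any called function is uncategorized.

    categories: dict of category -> subcategory -> list of igraph_* names
    called:     set of igraph_* names found in the generated wrappers
    """
    categorized = {
        fn
        for subcats in categories.values()
        for funs in subcats.values()
        for fn in funs
    }
    missing = sorted(called - categorized)
    if missing:
        raise ValueError("not in aaa-categories.yaml: " + ", ".join(missing))

# Hypothetical one-entry map; passes silently because everything is covered.
cats = {"visitors": {"breadth-first-search": ["igraph_bfs"]}}
check_coverage(cats, {"igraph_bfs"})
```

Sorting the missing names keeps the error message stable between runs, which makes it easier to paste into the YAML.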

Validation performed

  • All 26 R/aaa-*.R files parse cleanly.
  • 490 _impl wrappers distributed across the files, zero duplicates.
  • 491 unique R_igraph_* symbols preserved (the 491st being R_igraph_finalizer, which appears in every impl's on.exit but has no wrapper of its own).
  • tools/rebuild-cats.R produces byte-identical output on re-run (idempotent).
  • tools/split-aaa-auto.R produces byte-identical output on re-run from the same source (idempotent).
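The wrapper and symbol counts above can be reproduced with a small audit. The following Python sketch is one way to do it, assuming wrappers are defined as `<name>_impl <- function` at the start of a line; the two file bodies are hypothetical stand-ins, not excerpts from the real output.

```python
import re
from collections import Counter

# Hypothetical stand-ins for two of the split files.
FILES = {
    "R/aaa-visitors.R": '''bfs_impl <- function(graph) {
  on.exit(.Call(R_igraph_finalizer))
  .Call(R_igraph_bfs_closure, graph)
}
''',
    "R/aaa-cliques.R": '''cliques_impl <- function(graph) {
  on.exit(.Call(R_igraph_finalizer))
  .Call(R_igraph_cliques, graph)
}
''',
}

def audit(files):
    """Return (number of distinct _impl wrappers, duplicated names, number of unique symbols)."""
    impls = Counter()
    symbols = set()
    for text in files.values():
        impls.update(re.findall(r"^(\w+_impl) <- function", text, flags=re.M))
        symbols.update(re.findall(r"\bR_igraph_\w+", text))
    duplicates = sorted(name for name, count in impls.items() if count > 1)
    return len(impls), duplicates, len(symbols)

n_impl, duplicates, n_symbols = audit(FILES)
print(n_impl, duplicates, n_symbols)  # 2 [] 3
```

In this toy input the symbol count exceeds the wrapper count by one for the same reason as in the real numbers: R_igraph_finalizer is called from every wrapper but has no wrapper of its own.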

Test plan

  • devtools::load_all(".") succeeds
  • R CMD check / CI passes
  • make -f Makefile-cigraph r_wrappers round-trips cleanly on a machine with the stimulus venv
  • Spot-check that at least one wrapper from each of the 26 category files still behaves correctly. The existing testthat suite covers igraph_* wrappers broadly, so a green test run is the main check.

cc @maelle for review — this is purely an organizational/tooling change; no behavior should change, but the restructuring is substantial so a second pair of eyes on the categorization choices would be welcome.

🤖 Generated with Claude Code

schochastics and others added 2 commits April 23, 2026 14:34
Stimulus generates one monolithic R/aaa-auto.R (~14.8k lines) covering
every C igraph wrapper. This commit introduces a categorization layer that
splits the generated output into 26 per-category files matching how the
functions are grouped in the igraph reference manual, with subcategory
banner comments inside each file.

- tools/aaa-categories.yaml: authoritative category -> subcategory -> fn
  mapping, reconciled against every R_igraph_* symbol .Call()'d from the
  generated wrappers (491 entries; 8 closure wrappers mapped back to their
  underlying C functions via the src/rcallback.c whitelist)
- tools/rebuild-cats.R: idempotent reconciliation tool; fails loudly if
  new functions appear in the generated wrappers without a categorization
- tools/split-aaa-auto.R: post-processes stimulus output into R/aaa-<cat>.R
- Makefile-cigraph: stimulus now writes to .build/aaa-auto.R (ignored), the
  split script produces the in-repo R/ files. Phony target r_wrappers
  covers the full pipeline