perf: hash-adjacency overhaul #2421

Open
jrgemignani wants to merge 1 commit into apache:master from jrgemignani:perf_VLE_hash_adjacency

@jrgemignani
Contributor

Replaces the per-graph adjacency map with a Robin Hood open-addressing
hashtable (agehash) and an embedded flat-array edge list, removing the
hottest dynahash path on IC1 and shrinking the largest hashtable AGE
keeps. Stages land as one commit:

  S1  MurmurHash3 fmix64 for graphid hashtables (replaces tag_hash)
  S2  Precompute graphid hash; share across paired DFS lookups
  S3  Replace ListGraphId adjacency with embedded flat-array
      VertexEdgeArray (single palloc, contiguous iteration)
  S4  Batched MLP lookup pipeline in add_valid_vertex_edges
  S5/C1  agehash library: INLINE Robin Hood hashtable with
         _with_hash API, freeze, iter, and a regress-only selftest
  S5/C2  Wire global graph edge_hashtable through agehash;
         drop edge_id from edge_entry (key lives in slot header);
         AGEHASH_MAX_LOAD=0.85; MemoryContextAllocHuge for SF10+
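
Stages S1 and S2 can be illustrated with a minimal sketch. The `fmix64` body below is the standard MurmurHash3 64-bit finalizer; everything else (`graphid_sketch`, `graphid_hash_sketch`, `slot_for_hash`) is a hypothetical stand-in and not the code in this commit. It assumes a graphid is a 64-bit integer and that table sizes are powers of two, so one precomputed hash can index two tables of different sizes without rehashing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AGE's graphid (assumed to be a 64-bit integer). */
typedef int64_t graphid_sketch;

/* MurmurHash3 64-bit finalizer (fmix64): mixes all input bits into the output. */
static inline uint64_t fmix64(uint64_t k)
{
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    k *= 0xc4ceb9fe1a85ec53ULL;
    k ^= k >> 33;
    return k;
}

/* S2 idea: hash the graphid once and let the caller keep the value. */
static inline uint64_t graphid_hash_sketch(graphid_sketch id)
{
    return fmix64((uint64_t) id);
}

/*
 * Map a precomputed hash to a slot index. nslots must be a power of two
 * (an assumption of this sketch), so the mask replaces a modulo.
 */
static inline uint64_t slot_for_hash(uint64_t hash, uint64_t nslots)
{
    assert(nslots != 0 && (nslots & (nslots - 1)) == 0);
    return hash & (nslots - 1);
}
```

A paired lookup would call `graphid_hash_sketch()` once and pass the result to both tables, for example `slot_for_hash(h, vertex_slots)` and `slot_for_hash(h, edge_slots)`, where both slot counts are illustrative names.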

Performance (SF3 LDBC SNB, 5 runs/3 warmup, vs clean master baseline_v2):

  IC1   8,625 → 7,117 ms   −17.49 %   (the headline; hashtable-bound)
  IU1      40 →    35 ms   −11.86 %   (heaviest update; lookup-bound)
  IC sum     198,958 → 197,367 ms     −0.80 %   (suite-level noise)
  IS sum       1,009 →   1,028 ms     +1.86 %   (IS3 jitter; sub-ms)
  IU sum          77 →      72 ms     −6.64 %
  IC2/3/4/5/6/7/8/9/10/11/12: parity (within ±3.3 %, mostly ±1.5 %)

The VLE-DFS-heavy queries (IC3/5/6/9/11) sit at parity: with
hash_search_with_hash_value at ≤1 % inclusive on their baseline
flames, no hashtable swap can recover meaningful wall-time on them.
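
The parity argument is an instance of Amdahl's law: if a fraction p of wall time is spent in the lookup and the lookup becomes s times faster, the remaining time is (1 - p) + p / s, which can never drop below 1 - p. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the values in the usage note are illustrative apart from the 1 % share quoted above.

```c
#include <assert.h>

/*
 * Fraction of the original wall time that remains after a component
 * taking fraction p of the time is made s times faster (Amdahl's law).
 */
static double remaining_time_fraction(double p, double s)
{
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return (1.0 - p) + p / s;
}
```

With p = 0.01, doubling lookup speed (s = 2) leaves about 99.5 % of the original time, and even an arbitrarily large s cannot go below 99 %.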

Memory: removing edge_id from edge_entry saves ~416 MB on SF3 and
~1.4 GB on SF10 for the global graph's edge_hashtable. Slot capacity
uses MemoryContextAllocHuge so SF10+ edge tables can be built.
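
How dropping a field from the payload shrinks the table can be sketched with the slot-size formula quoted in the review below (key offset plus key size plus payload size, rounded up to the maximum alignment). The macro and function names here are hypothetical, the alignment of 8 bytes is an assumption, and the sizes used in the usage note are illustrative rather than the real `edge_entry` layout.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed maximum alignment, as on common 64-bit builds. */
#define SKETCH_ALIGNOF 8
#define SKETCH_MAXALIGN(len) \
    (((size_t) (len) + (SKETCH_ALIGNOF - 1)) & ~((size_t) (SKETCH_ALIGNOF - 1)))

/* Slot size when the key sits after a header and the payload follows the key. */
static size_t slot_size_sketch(size_t key_offset, size_t key_size,
                               size_t payload_size)
{
    size_t payload_offset = key_offset + key_size;

    return SKETCH_MAXALIGN(payload_offset + payload_size);
}

/* Bytes saved across the whole table when every slot shrinks. */
static size_t table_bytes_saved(size_t slot_before, size_t slot_after,
                                size_t nslots)
{
    assert(slot_before >= slot_after);
    return (slot_before - slot_after) * nslots;
}
```

For example, with a hypothetical 8-byte header, an 8-byte key, and a payload going from 40 to 32 bytes, each slot shrinks from 56 to 48 bytes. The table-level saving is that difference times the slot capacity; the slot counts behind the SF3 and SF10 figures are not stated in this description.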

Adds:
  src/backend/utils/cache/agehash.c, src/include/utils/agehash.h
  regress/sql/agehash.sql + expected/agehash.out (boundary selftest)
  _agehash_self_test() in both fresh-install and upgrade SQL

Tested on PostgreSQL 18.3 (REL_18_STABLE): all 35 regression tests
pass (installcheck), warning-free build.

Co-authored-by: Claude <noreply@anthropic.com>

modified:   Makefile
modified:   age--1.7.0--y.y.y.sql
new file:   regress/expected/agehash.out
new file:   regress/sql/agehash.sql
modified:   sql/age_main.sql
modified:   src/backend/utils/adt/age_global_graph.c
modified:   src/backend/utils/adt/age_vle.c
new file:   src/backend/utils/cache/agehash.c
modified:   src/include/utils/age_global_graph.h
new file:   src/include/utils/agehash.h

Copilot AI left a comment

Pull request overview

This PR introduces a new high-performance open-addressing hashtable (agehash) and rewires AGE’s global-graph edge lookup + VLE traversal hot paths to reduce dynahash overhead and improve cache locality.
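
For readers unfamiliar with the technique, here is a minimal sketch of Robin Hood open addressing with caller-supplied hashes. It is not the agehash implementation: the `rh_*` names, the fixed 64-slot capacity, the `UINT16_MAX` empty sentinel, and the entry limit of 54 (roughly 0.85 load) are all assumptions made for illustration, and there is no resize, delete, freeze, or iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RH_CAPACITY    64u          /* power of two; fixed for this sketch */
#define RH_MAX_ENTRIES 54u          /* about 0.85 load; always leaves free slots */
#define RH_EMPTY       UINT16_MAX   /* hypothetical "slot is free" sentinel */

typedef struct
{
    uint16_t dist;                  /* probe distance from home slot, or RH_EMPTY */
    uint64_t key;                   /* key stored inline in the slot */
    uint64_t value;                 /* payload stored inline in the slot */
} rh_slot;

typedef struct
{
    rh_slot  slots[RH_CAPACITY];
    uint32_t count;
} rh_table;

static void rh_init(rh_table *t)
{
    for (uint32_t i = 0; i < RH_CAPACITY; i++)
        t->slots[i].dist = RH_EMPTY;
    t->count = 0;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static uint64_t *rh_lookup_with_hash(rh_table *t, uint64_t key, uint64_t hash)
{
    uint32_t idx = (uint32_t) (hash & (RH_CAPACITY - 1));
    uint16_t dist = 0;

    for (;;)
    {
        rh_slot *s = &t->slots[idx];

        /* A free slot, or a resident closer to home than us, means "absent". */
        if (s->dist == RH_EMPTY || s->dist < dist)
            return NULL;
        if (s->key == key)
            return &s->value;
        idx = (idx + 1) & (RH_CAPACITY - 1);
        dist++;
    }
}

/* Inserts or overwrites. Returns false when the sketch's entry limit is hit. */
static bool rh_insert_with_hash(rh_table *t, uint64_t key, uint64_t hash,
                                uint64_t value)
{
    rh_slot  cur = { 0, key, value };
    uint32_t idx = (uint32_t) (hash & (RH_CAPACITY - 1));
    bool     swapped = false;

    if (t->count >= RH_MAX_ENTRIES && rh_lookup_with_hash(t, key, hash) == NULL)
        return false;

    for (;;)
    {
        rh_slot *s = &t->slots[idx];

        if (s->dist == RH_EMPTY)
        {
            *s = cur;               /* place the carried entry in the free slot */
            t->count++;
            return true;
        }
        if (!swapped && s->key == key)
        {
            s->value = value;       /* key already present: overwrite in place */
            return true;
        }
        if (s->dist < cur.dist)
        {
            rh_slot tmp = *s;       /* "rob the rich": evict the closer entry */

            *s = cur;
            cur = tmp;
            swapped = true;
        }
        idx = (idx + 1) & (RH_CAPACITY - 1);
        cur.dist++;
    }
}
```

Because each slot records its own probe distance, a displaced entry never needs rehashing, and a failed lookup stops as soon as it meets a resident that is closer to its home slot than the probe is.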

Changes:

  • Add agehash (INLINE Robin Hood hashtable) plus SQL/regress self-test coverage.
  • Replace global graph edge_hashtable (dynahash) with edge_table (agehash) and remove edge_id from the edge payload (key lives in the slot header).
  • Replace per-vertex linked-list adjacency with embedded flat arrays and batch VLE edge lookups using precomputed graphid_hash.
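
The flat-array adjacency in the last bullet can be sketched as a header followed directly by the edge ids in one allocation, so iteration walks contiguous memory instead of chasing list pointers. This is an illustration only: `edge_array_sketch` and its functions are hypothetical, `malloc` stands in for `palloc`, and the real `VertexEdgeArray` layout and growth policy are not shown in this review.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Header and edge ids live in a single allocation (flexible array member). */
typedef struct
{
    uint32_t count;
    uint32_t capacity;
    int64_t  edges[];
} edge_array_sketch;

/* One allocation sized for the header plus `capacity` edge ids. */
static edge_array_sketch *edge_array_create(uint32_t capacity)
{
    edge_array_sketch *a =
        malloc(sizeof(*a) + (size_t) capacity * sizeof(int64_t));

    if (a == NULL)
        return NULL;
    a->count = 0;
    a->capacity = capacity;
    return a;
}

/* Appends an edge id; returns 0 when the fixed capacity is exhausted. */
static int edge_array_append(edge_array_sketch *a, int64_t edge_id)
{
    if (a->count == a->capacity)
        return 0;
    a->edges[a->count++] = edge_id;
    return 1;
}

/* Contiguous scan over the adjacency; no pointer chasing per element. */
static int edge_array_contains(const edge_array_sketch *a, int64_t edge_id)
{
    for (uint32_t i = 0; i < a->count; i++)
    {
        if (a->edges[i] == edge_id)
            return 1;
    }
    return 0;
}
```

A caller that knows a vertex's degree up front can size the array exactly with `edge_array_create(degree)` and release it with a single `free()`.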

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/include/utils/agehash.h | Defines the agehash public/internal API contract, slot layout, and helper macros. |
| src/backend/utils/cache/agehash.c | Implements INLINE Robin Hood hashtable + iterator + SQL-callable self-test. |
| src/include/utils/age_global_graph.h | Introduces VertexEdgeArray, adds get_edge_entry_with_hash, and exposes graphid_hash/graphid_keyeq. |
| src/backend/utils/adt/age_global_graph.c | Switches global edge storage to agehash, embeds vertex adjacency arrays, implements graphid_hash, and updates accessors/freeing. |
| src/backend/utils/adt/age_vle.c | Reuses precomputed graphid hashes across paired lookups and batches adjacency processing for better MLP. |
| sql/age_main.sql | Adds SQL declaration for _agehash_self_test() for fresh installs. |
| age--1.7.0--y.y.y.sql | Adds _agehash_self_test() to the extension upgrade script. |
| regress/sql/agehash.sql | Adds regression test invoking the agehash self-test. |
| regress/expected/agehash.out | Expected output for the new agehash regression test. |
| Makefile | Builds agehash.o and adds agehash to the regression test list. |

Comment on lines +141 to +143

    t->payload_size = (uint32) payload_size;
    t->payload_offset = AGEHASH_SLOT_KEY_OFFSET + (uint32) key_size;
    t->slot_size = MAXALIGN(t->payload_offset + (uint32) payload_size);

Comment on lines +905 to +907

     * value. The dynahash table keyed on graphid is shared with edge_hashtable
     * elsewhere, so callers can compute graphid_hash() once and reuse it for
     * lookups in both tables.

    char carry_payload[4096];
    void *result_payload = NULL;
    bool placed_caller = false;

    Assert(payload_size > 0 && payload_size <= 4096);
    Assert(hash_fn != NULL);
    Assert(keyeq_fn != NULL);

Comment on lines +305 to +308

     * Probe distance overflow guard. With AGEHASH_MAX_LOAD = 0.7 and a
     * non-degenerate hash function, max probe is empirically <= 32.
     * The 0xFE00 ceiling reserves headroom while leaving probe_dist
     * well clear of the AGEHASH_EMPTY sentinel.