Attempt to add ExecuteDFS to main (Generic depth first search) #737


Open · wants to merge 41 commits into base: main

Conversation

saffronmciver
Contributor

This pull request will build on #505 and #499 and attempt to add ExecuteDFS to the main branch again.

My aim is to add back the fixed versions of functions that use the generic DFS, add more tests to ensure they are correct, and look at improving the performance of these functions.

I will also look at improving the performance of ExecuteDFS itself.

@saffronmciver
Contributor Author

So far I have refactored the iterative implementation of ExecuteDFS into a recursive one. The iterative stack can grow very large, since all successors are pushed to the stack if they have not been visited. If current is the node being visited and successors := [1, 2] is the list of successors of current, and both are unvisited, vertex 2 is pushed to the stack even if it would be visited in the branch where 1 is visited. The iterative solution deals with this by checking whether the current node has already been visited when it is popped, and continuing the loop if so.
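An illustrative Python sketch (not the PR's C code) of the iterative scheme described above: every unvisited successor is pushed, so a vertex may sit on the stack more than once, and the pop-time visited check is what skips the duplicates.

```python
def iterative_dfs(adj, start):
    visited = set()
    order = []       # order in which vertices are first visited
    max_stack = 0    # how large the stack grows
    stack = [start]
    while stack:
        max_stack = max(max_stack, len(stack))
        current = stack.pop()
        if current in visited:  # duplicate entry on the stack: skip it
            continue
        visited.add(current)
        order.append(current)
        for succ in reversed(adj[current]):
            if succ not in visited:
                stack.append(succ)  # may already be on the stack elsewhere
    return order, max_stack
```

On dense inputs such as complete digraphs, most of the stack traffic is exactly these duplicate entries.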

However, this has implications for AncestorFunc and CrossFunc. If (current, 2) is a back edge or cross edge, the relevant function is called before any calls to PreorderFunc, PostorderFunc, etc. are made while visiting the branch of 1. With recursion, AncestorFunc or CrossFunc is naturally called after the recursive call exploring 1 is complete.

I believe removing this behaviour and using recursion would also allow forward edges to be recognised during the DFS, if that is useful at all.
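A hedged sketch of how recursion makes the full edge classification, including forward edges, fall out naturally (callback names such as AncestorFunc are from the discussion above; the implementation details are illustrative): a successor already in preorder but not yet in postorder is an ancestor, and a finished successor is a forward or cross edge depending on preorder times.

```python
def classify_edges(adj, start):
    preorder, postorder = {}, {}
    clock = [0]
    edges = {"tree": [], "back": [], "forward": [], "cross": []}

    def visit(u):
        preorder[u] = clock[0]; clock[0] += 1
        for v in adj[u]:
            if v not in preorder:
                edges["tree"].append((u, v))
                visit(v)                       # explore the branch first
            elif v not in postorder:
                edges["back"].append((u, v))   # ancestor: AncestorFunc here
            elif preorder[v] > preorder[u]:
                edges["forward"].append((u, v))
            else:
                edges["cross"].append((u, v))  # CrossFunc here
        postorder[u] = clock[0]; clock[0] += 1

    visit(start)
    return edges
```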

@saffronmciver
Contributor Author

I have found that using the HashMap from the datastructures package does not improve performance (and in many cases worsens it). Below are my findings from timing various functions across three different implementations:

  1. Using HashMaps without reserving memory
  2. Using plain lists with a BitArray to store bits for visited / backtracked nodes
  3. Using plain lists without the BitArray (using preorder and postorder to determine visited / backtracked nodes)
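Implementation 3 can be sketched as follows (illustrative Python, not the PR's code): the preorder and postorder lists double as the visited and backtracked flags, so no separate BitArray is needed. A vertex is "visited" iff its preorder entry is set, and "backtracked" iff its postorder entry is set.

```python
UNSET = -1  # sentinel for "not yet assigned"

def dfs_no_bitarray(adj, start, n):
    preorder = [UNSET] * (n + 1)   # 1-indexed, like GAP plain lists
    postorder = [UNSET] * (n + 1)
    clock = [0]

    def visit(u):
        preorder[u] = clock[0]; clock[0] += 1
        for v in adj[u]:
            if preorder[v] == UNSET:   # the "visited" check
                visit(v)
        postorder[u] = clock[0]; clock[0] += 1   # marks "backtracked"

    visit(start)
    return preorder, postorder
```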

ArticulationPoints for CompleteDigraph(10000)

| Implementation      | Min Time (ms) | Memory Allocated (Bytes) |
| ------------------- | ------------- | ------------------------ |
| HashMaps            | 13,597        | 925,563,519              |
| PList with BitArray | 10,255        | 923,069,485              |
| PList no BitArray   | 10,205        | 923,070,766              |

There is not much difference in memory, but this is because the current implementation uses much more memory in the non-DFS part of the procedure. The HashMap implementation uses 2,099,704 bytes for regular DFS on the same graph, while the list implementation uses 323,142 bytes. The worse time for HashMaps appears to be caused mainly by lookups: when I reduced the number of lookups I noticed significant improvements.

DigraphTopologicalSort for BinaryTree(21)

| Implementation      | Min Time (ms) | Memory Allocated (Bytes) |
| ------------------- | ------------- | ------------------------ |
| HashMaps            | 1365          | 469,969,923              |
| PList with BitArray | 363           | 107,881,074              |
| PList no BitArray   | 365           | 107,882,284              |

All connected components are explored, so every node has information stored in the hashmaps. I think the large memory increase is due to how each HashMap keeps plain lists for both keys and values, and there are four of these HashMaps in the DFS record. DigraphTopologicalSort uses an AncestorFunc, meaning extra lookups are needed to check whether a neighbour has been backtracked, to determine whether to call the AncestorFunc. The time increase for HashMaps is worse for this function as a result.
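To illustrate the extra lookups, here is a hedged sketch that counts hash lookups during a DFS with and without an AncestorFunc-style callback (CountingDict and the overall structure are purely illustrative): the callback requires an extra "backtracked?" lookup for every already-visited neighbour.

```python
class CountingDict(dict):
    def __init__(self):
        super().__init__()
        self.lookups = 0

    def __contains__(self, key):  # every `in` / `not in` counts as a lookup
        self.lookups += 1
        return super().__contains__(key)

def dfs_lookup_count(adj, start, with_ancestor_func):
    visited, backtracked = CountingDict(), CountingDict()

    def visit(u):
        visited[u] = True
        for v in adj[u]:
            if v not in visited:
                visit(v)
            elif with_ancestor_func and v not in backtracked:
                pass  # AncestorFunc would fire here (back edge)
        backtracked[u] = True

    visit(start)
    return visited.lookups + backtracked.lookups

complete4 = {1: [2, 3, 4], 2: [1, 3, 4], 3: [1, 2, 4], 4: [1, 2, 3]}
```

On this small complete digraph the callback adds one extra lookup per already-visited neighbour on top of the per-edge visited checks.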

DominatorTree(d, 1) for CompleteDigraph(5000)

| Implementation      | Min Time (ms) | Memory Allocated (Bytes) |
| ------------------- | ------------- | ------------------------ |
| HashMaps            | 23,642        | 1,463,632,881            |
| PList with BitArray | 12,203        | 1,461,462,329            |
| PList no BitArray   | 12,005        | 1,461,462,065            |

This is another case of the DFS memory being dominated by the memory used by the procedure itself (which I may look into to see whether it can be improved).

ExecuteDFS (including the record creation)

I also tested only running ExecuteDFS to remove the interference of the work each function does itself.

On BinaryTree(21)

| Implementation      | Min Time (ms) | Memory Allocated (Bytes) |
| ------------------- | ------------- | ------------------------ |
| HashMaps            | 0             | 1692                     |
| PList with BitArray | 122           | 67,111,802               |
| PList no BitArray   | 122           | 67,111,802               |

This example shows the advantage of HashMaps. Since only one node is explored in the DFS, the time taken is dominated by the initialisation of the data structures rather than by the recursive procedure. The whole plain lists are initialised up front, which is not the case for the key and value lists of the HashMap.
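The trade-off can be sketched like this (illustrative Python): the list-based record pays an O(n) initialisation no matter how little of the graph is explored, while the hash map only ever holds entries for vertices the search actually reaches.

```python
def search_counts(n, start, adj):
    visited_map = {}              # grows with the explored region only
    visited_list = [False] * n    # allocated and zero-filled up front
    stack = [start]
    while stack:
        u = stack.pop()
        if u in visited_map:
            continue
        visited_map[u] = True
        visited_list[u] = True
        stack.extend(adj.get(u, []))
    # entries actually stored vs. slots allocated regardless
    return len(visited_map), len(visited_list)
```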

On ChainDigraph(20000)

| Implementation      | Memory Allocated (Bytes) |
| ------------------- | ------------------------ |
| HashMaps            | 4,197,140                |
| PList with BitArray | 642,987                  |

The memory usage is worse for the HashMaps due to the two plain lists each HashMap stores. I have not figured out why it is so much more than 2x worse, though.

On BinaryTree(21) exploring all connected components (record.config.forest := true)

| Implementation      | Min Time (ms) | Memory Allocated (Bytes) |
| ------------------- | ------------- | ------------------------ |
| HashMaps            | 1198          | 402,655,646              |
| PList with BitArray | 200           | 67,111,959               |

As opposed to the case without forest := true, this does force all nodes to be explored (the worst case for HashMaps).

Thoughts on which data structure to choose

I think that, generally, HashMaps are not preferable to plain lists for this algorithm. The worst case for plain lists, where the search starts from a node with no children, is probably not a common one, and the worst case for a HashMap seems much worse. There was only a 122 ms difference for the setup of the PList versions on a graph with 2,097,151 vertices (ExecuteDFS for BinaryTree(21)). The memory usage was much higher than for HashMaps, but it only grows with the number of vertices with no further multiplier (the memory usage for HashMaps grows faster due to the two PLists per HashMap, though it seems to be more than 2x worse).

I will keep the BitArrays for now, in case I remove the use of some record fields under certain configuration options to reduce memory usage when they are not needed. The memory usage is barely different, since only 1 bit is stored per vertex for each of the backtracking and visited arrays.
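For reference, the 1-bit-per-vertex idea can be sketched as follows (illustrative Python, not the PR's C BitArray): a bytearray packs 8 flags per byte, so visited/backtracked flags for n vertices cost roughly n/8 bytes each instead of a full list entry per vertex.

```python
class BitArray:
    def __init__(self, n):
        # one bit per vertex, rounded up to whole bytes
        self.bits = bytearray((n + 7) // 8)

    def set(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

    def get(self, i):
        return bool(self.bits[i // 8] & (1 << (i % 8)))
```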

@james-d-mitchell james-d-mitchell added the performance A label for issues or PR related to performance label Apr 7, 2025
…ingForest, add options to turn off specific record fields when not necessary
@saffronmciver saffronmciver marked this pull request as ready for review April 30, 2025 12:50
@saffronmciver
Contributor Author

saffronmciver commented Apr 30, 2025

Below is an overview of the current state of generic DFS performance for selected examples. All benchmarking was done as an average of 100 runs.

General ExecuteDFS Performance

Performance of ExecuteDFS on CompleteDigraphs

[figure: AverageExecuteDFS]

The above graph shows the performance of ExecuteDFS on CompleteDigraphs of increasing size. Benchmarks include the creation of the DFSRecord, since this is always necessary in practice (and is what speeds up most when removing fields from the record).

The "all fields" lines refer to a DFSRecord with all of use_preorder, use_postorder, use_edge and use_parents set to true, and "minimum fields" refers to as few of these being enabled as possible (for iterative, none; for recursive, only use_parents and use_edge). Iterative with all fields enabled clearly performs worst, and the penalty of the "all fields" configurations appears to grow for larger graphs. It is not completely clear whether iterative or recursive is better for "min fields". The recursive procedure is limited by needing to store the edge and parents fields, although recursive DFS should have a smaller memory footprint (the stack for iterative DFS is unbounded).

Examples for CompleteDigraphs

A CompleteDigraph is a particularly bad case for both procedures, since in iterative DFS all successors that have not been visited are pushed to the stack, and recursive DFS calls its function for each successor. DominatorTree and VerticesReachableFrom appear to show a consistent slowdown as the graph grows (more runs may be needed to investigate DominatorTree on larger graphs, given the jump at size 1500). The slowdown of DominatorTree and VerticesReachableFrom compared to the currently implemented non-generic DFS algorithms is shown below.

Slowdown of DominatorTree and VerticesReachableFrom for CompleteDigraphs

[figures: DominatorTree, VerticesReachableFrom]

UndirectedSpanningForest has a high slowdown factor for CompleteDigraphs. I am not sure whether replacing this algorithm with generic DFS is a good idea, but I would like to investigate further why this occurs, since it was not the case for DominatorTree or VerticesReachableFrom on CompleteDigraph inputs. I should also test with larger CompleteDigraphs, but the runtimes were becoming impractically large.

Slowdown of UndirectedSpanningForest for CompleteDigraphs

[figure: UndirectedSpanningForestComplete]

Slowdown of UndirectedSpanningForest for BinaryTrees

[figure: BinaryTree]

The slowdown is much less significant for BinaryTrees, which exhibit less of the behaviour where a very large number of calls are made trying to visit vertices; for iterative DFS (currently used for this procedure), there are fewer occurrences of large quantities of repeated nodes being pushed to the stack.

Examples for BinaryTrees

The slowdown factors for all of DigraphTopologicalSort, IsAntiSymmetricGraph and IsAcyclicGraph are not significant when called with BinaryTree, as shown below.

Slowdown of UndirectedSpanningForest, DigraphTopologicalSort, IsAntiSymmetricGraph and IsAcyclicGraph for BinaryTrees

[figure: BinaryTree]

Examples for Modified ChainGraphs

For the following examples, I modified a ChainDigraph by concatenating 10 copies of it into a single graph (so each graph consists of 10 ChainDigraphs, all connected to a root node 1). This gives an acyclic example that is not completely trivial.
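A hypothetical construction of such a graph, assuming "concatenate" means ten disjoint chains all attached to a root vertex 1 (the labelling scheme here is illustrative, not the PR's):

```python
def ten_chains(chain_len):
    adj = {1: []}
    next_label = 2
    for _ in range(10):
        adj[1].append(next_label)        # root reaches each chain head
        for _ in range(chain_len - 1):
            adj[next_label] = [next_label + 1]
            next_label += 1
        adj[next_label] = []             # tail of this chain
        next_label += 1
    return adj
```

Each graph has 1 + 10 × chain_len vertices and no cycles.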

Slowdown of DigraphPath and LongestDistanceFromVertex for modified ChainDigraphs

[figure: PathLongestDistance]

The slowdown seems relatively consistent for DigraphPath and LongestDistanceFromVertex as the size of this particular graph type increases.

@saffronmciver
Contributor Author

saffronmciver commented Apr 30, 2025

Some other notes on the state of this procedure

  • Further investigation into whether the recursive or iterative procedure is better might be useful (I think this can simply depend on the type of graph, but I would like to understand why CompleteDigraphs are so much worse for UndirectedSpanningForest).
  • I would like to add more tests to some implementations. I have checked through each one and tried to think of ways it could be incorrect, but I think some further tests would be beneficial.
  • More tests should be added to ensure the behaviour of DFSRecord configurations is correct (especially combinations of them, and to check that ExecuteDFS behaviour is consistent across them where it should be). These options were added recently, so I have not yet got round to writing these tests.
  • Make sure no other existing procedures currently use a non-generic DFS where generic DFS could be substituted without performing too much worse.
  • Pushing backtrack nodes to the iterative stack is not necessary if no PostOrderFunc exists and record.config.use_postorder is false.
  • If the code can be made tidier while keeping the configuration options, that would also be good to change so it is more maintainable.
  • StrongOrientation is currently not implemented for symmetric digraphs (it uses generic DFS; the version in main is also not implemented for these graphs).
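The backtrack-push point above can be sketched as follows (illustrative Python, not the PR's code): a vertex is normally pushed a second time as a backtrack marker so a postorder callback can fire when that marker is popped; when no postorder callback is wanted, that second push can be skipped entirely.

```python
def iterative_dfs(adj, start, postorder_func=None):
    visited = set()
    pushes = 0                       # count pushes after the initial one
    stack = [(start, False)]
    while stack:
        u, backtracking = stack.pop()
        if backtracking:
            postorder_func(u)        # fires once u's subtree is done
            continue
        if u in visited:
            continue
        visited.add(u)
        if postorder_func is not None:
            stack.append((u, True))  # backtrack marker, skipped otherwise
            pushes += 1
        for v in adj[u]:
            if v not in visited:
                stack.append((v, False))
                pushes += 1
    return pushes
```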

Overview of what has been implemented

  • Refactored the original dfs.c file with both an iterative and a recursive procedure. The recursive procedure uses tail call optimisation so that it does not overflow the stack when doing a deep graph search (e.g. ChainDigraph(2000000)).
  • The DFSRecord still uses plain lists, since I found that HashMaps had too much hashing overhead for lookups and assignments to justify the reduction in memory usage for searches where the graph is only partially traversed.
  • Added back each implementation, refactored for the new procedure. Added a DFS-specific version of VerticesReachableFrom for a digraph and a list of vertices (in addition to the version for a single vertex) that uses record.config.forest_specific.
  • Added the config field to DFSRecord to specify which record fields to use, and therefore assign, when calling NewDFSRecord with a specific configuration. Options also exist to visit all components (forest), and to make sure specified vertices are visited (forest_specific). Iterative or recursive can be chosen via config.iterative, and config.revisit means vertices can be revisited. The forest configuration options reduce the overhead of calling ExecuteDFS multiple times.
  • Each implementation uses a DFSRecord configuration (record.config) that ensures the fewest record fields are assigned when they are not needed.
  • Documentation was updated to reflect the new usage of ExecuteDFS and its related functions (NewDFSFlags and NewDFSRecord).
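The deep-search point in the first bullet can be illustrated in Python, which has no tail-call optimisation, by comparing naive recursion with an explicit stack (the explicit stack stands in for the tail calls the C code relies on): the naive version overflows on a deep chain, the stack-based one does not.

```python
def recursive_depth(adj, start):
    # naive recursion: one frame per vertex, so a deep chain overflows
    def visit(u, d):
        best = d
        for v in adj[u]:
            best = max(best, visit(v, d + 1))
        return best
    return visit(start, 0)

def stack_depth(adj, start):
    # explicit stack: constant Python-stack usage regardless of depth
    best, stack = 0, [(start, 0)]
    while stack:
        u, d = stack.pop()
        best = max(best, d)
        for v in adj[u]:
            stack.append((v, d + 1))
    return best

n = 100_000
chain = {i: [i + 1] for i in range(1, n)}
chain[n] = []
```

Calling recursive_depth(chain, 1) raises RecursionError at Python's default recursion limit, while stack_depth(chain, 1) completes.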
