-
Notifications
You must be signed in to change notification settings - Fork 2
FAQ
Common questions and answers about GraphBrew.
GraphBrew is a graph processing benchmark framework that combines:
- 17 algorithm IDs (0-16): 15 reordering strategies + MAP (file loader) + AdaptiveOrder (ML selector)
- 8 benchmarks (PageRank, PageRank SpMV, BFS, CC, CC_SV, SSSP, BC, TC)
- ML-powered algorithm selection via AdaptiveOrder (7 selection modes: perceptron, decision tree, hybrid DT+perceptron, database/kNN, fastest-reorder, best-endtoend, best-amortization)
- Leiden community detection integration
- Researchers studying graph algorithms and cache optimization
- Engineers optimizing graph processing pipelines
- Students learning about graph algorithms
- Data scientists working with network data
- Comprehensive: 16 reordering algorithms in one framework
- ML-powered: AdaptiveOrder learns which algorithm works best
- Modern: Leiden community detection integration
- Practical: Based on GAP Benchmark Suite standards
- Linux or macOS
- GCC 7+ with C++17 support
- Make
- Python 3.8+ (optional, for scripts - no pip dependencies required)
- At least 4GB RAM (more for large graphs)
sudo apt-get update
sudo apt-get install build-essential git
git clone https://github.com/UVA-LavaLab/GraphBrew.git
cd GraphBrew
make allxcode-select --install
brew install gcc
git clone https://github.com/UVA-LavaLab/GraphBrew.git
cd GraphBrew
make all CXX=g++-13- GCC version:
g++ --version(need 7+) - C++17 support:
g++ -std=c++17 --version - OpenMP:
echo '#include <omp.h>' | g++ -fopenmp -x c++ - -c
See Installation for detailed troubleshooting.
./bench/bin/pr -f graph.el -s -n 3| Option | Meaning |
|---|---|
-f graph.el |
Input file |
-s |
Make undirected (symmetrize) |
-n 3 |
Run 3 trials |
-o 7 |
Use algorithm 7 (HUBCLUSTERDBG) |
| Situation | Recommendation |
|---|---|
| Don't know |
-o 14 (AdaptiveOrder) |
| Social network |
-o 12 (GraphBrewOrder) |
| General purpose |
-o 7 (HUBCLUSTERDBG) |
| Large graph |
-o 12 (GraphBrewOrder) |
| Baseline |
-o 0 (no reordering) |
Run multiple algorithms and compare:
for algo in 0 7 12 14 15; do
echo "=== Algorithm $algo ==="
./bench/bin/pr -f graph.el -s -o $algo -n 3
doneOr use AdaptiveOrder (-o 14) to auto-select.
| Format | Extension | Example |
|---|---|---|
| Edge list | .el |
0 1 |
| Weighted | .wel |
0 1 2.5 |
| Matrix Market | .mtx |
Standard MTX |
| DIMACS | .gr |
Road networks |
See Supported-Graph-Formats for details.
Reordering is preprocessing that pays off over multiple algorithm runs. The reordering step adds upfront cost, but subsequent benchmark runs are faster due to improved cache locality.
For repeated analyses, reorder once, save, reuse.
Speedups depend on graph topology, algorithm, and benchmark. High-modularity graphs (social, web) typically benefit more than low-modularity graphs (road networks). Run the full pipeline on your target graphs to measure actual improvements.
-
Large file: Use binary format
./bench/bin/converter -f graph.el -s -b graph.sg ./bench/bin/pr -f graph.sg -n 3
-
Text parsing: MTX/EL requires parsing; binary is instant
-
Memory: Ensure sufficient RAM
- Use binary graphs for repeated runs
-
Tune thread count:
export OMP_NUM_THREADS=8 -
Use NUMA binding:
numactl --cpunodebind=0 -
Reduce trials during development:
-n 1
Leiden is a community detection algorithm that finds densely connected groups of vertices. It improves on the popular Louvain algorithm with:
- Guaranteed connected communities
- Better quality partitions
- Faster convergence
GraphBrew uses Leiden to guide reordering decisions.
RabbitOrder (algorithm 8) has two variants:
| Variant | Command | Description |
|---|---|---|
csr |
-o 8 or -o 8:csr
|
Native CSR implementation (default, recommended) |
boost |
-o 8:boost |
Original Boost-based implementation (reference only) |
The CSR variant includes three correctness fixes over the original CSR code, making it match the Boost reference semantics, plus an auto-adaptive resolution parameter that further improves cache locality.
Resolution parameter: The CSR variant auto-tunes the Louvain resolution γ based on average
degree: γ = clamp(14 / avg_degree, 0.5, 1.0). Dense graphs (high avg degree) get a lower γ,
which prevents over-merging of communities. Override with the environment variable:
RABBIT_RESOLUTION=0.5 ./bench/bin/pr -f graph.el -s -o 8:csr -n 3Directed graphs: Both variants read only out-edges and use the same undirected modularity approximation from the original paper. The fixes are valid for both symmetric and directed inputs.
Recommendations:
- Use
csr(default) — faster, no external dependencies, better cache locality - Use
boostonly as a reference baseline for validation - The
boostvariant requires Boost 1.58.0:--install-boost
Computes graph features (15 linear + 3 quadratic), uses ML to select best algorithm for the graph with safety checks (OOD guardrail + ORIGINAL margin fallback). Supports 7 selection modes: perceptron (default), decision tree, hybrid DT+perceptron, database/kNN, fastest-reorder, best-endtoend, best-amortization. Default is full-graph mode (single algorithm for entire graph). See AdaptiveOrder-ML.
4-stage process: multi-restart perceptrons → variant-level weight saving → regret-aware grid search → save. Validate with python3 scripts/graphbrew_experiment.py --eval-weights. See Perceptron-Weights.
GraphBrewOrder is a strong general-purpose choice for most graph types. Recommended: -o 12.
w_dv_x_hub (power-law), w_mod_x_logn (large modular graphs), w_pf_x_wsr (uniform+cache). See AdaptiveOrder-ML#features-used.
Cache simulation was skipped, features weren't computed, or no benchmark data. Fix: --train --size small (full pipeline). See Perceptron-Weights#troubleshooting.
python3 scripts/graphbrew_experiment.py --eval-weights # Simulates C++ scoring, reports accuracy/regretSee Python-Scripts#-eval_weightspy---weight-evaluation--c-scoring-simulation.
All trained models (perceptron weights, decision trees, hybrid parameters) are stored in results/data/adaptive_models.json. The C++ runtime loads perceptron weights from this file via LoadPerceptronWeightsFromDB(). If the file is missing, hardcoded defaults are used. See Perceptron-Weights#weight-file-location.
- LeidenOrder (15): Baseline reference using GVE-Leiden external library (requires CSR→DiGraph conversion)
- GraphBrewOrder (12): Production Leiden + per-community reordering (e.g., RabbitOrder within each community), CSR-native, best quality
GraphBrewOrder uses Leiden community detection natively on CSR, then applies configurable per-community ordering for the best cache locality.
| Algorithm | Best For |
|---|---|
| DBG | Power-law graphs with clear hot/cold separation |
| HUBCLUSTER | Graphs where hubs connect to each other |
| HUBCLUSTERDBG | Combines both - good general choice |
See Troubleshooting for detailed solutions. Quick answers:
- Segfault: Check file exists, format correct (vertices start at 0), sufficient RAM
-
Results vary: Normal — use
-n 10, disable frequency scaling, usenumactl - No speedup: Not all graphs benefit — try AdaptiveOrder or different algorithm
- Python issues: Need Python 3.8+ (no pip required for core scripts)
See Adding-New-Algorithms for a complete guide:
- Add enum value in
reorder_types.h - Implement reorder function
- Add switch case
- (Optional) Add perceptron weights
See Adding-New-Benchmarks for a complete guide:
- Create
bench/src/my_algo.cc - Implement algorithm
- Add to Makefile
- Test
- Fork the repository
- Create a feature branch
- Make changes with tests
- Submit a pull request
See CONTRIBUTING.md for guidelines.
GraphBrew is released under the MIT License. See LICENSE.
| Source | URL | Formats |
|---|---|---|
| SNAP | snap.stanford.edu/data | Edge list |
| SuiteSparse | sparse.tamu.edu | MTX |
| Network Repository | networkrepository.com | Various |
| KONECT | konect.cc | Various |
# CSV to edge list
cat graph.csv | tr ',' ' ' | tail -n +2 > graph.el
# MTX to edge list (1-indexed to 0-indexed)
grep -v "^%" graph.mtx | tail -n +2 | awk '{print $1-1, $2-1}' > graph.elLimited by RAM:
- ~16 bytes per edge
- 1B edges ≈ 16GB RAM
- Larger graphs need out-of-core processing (not supported)
- Check Home for documentation overview
- Review Troubleshooting for common issues
- Open a GitHub issue for bugs
- Start a discussion for questions