Benchmarking
The Traverse Bench tool (traverse-bench) downloads datasets, imports data, and runs standardized benchmarks against any Bolt-compatible graph database.
Quick Start
# Download the medium Pokec dataset
traverse-bench --download medium
# Import into a running server
traverse-bench --import medium --port 7690
# Run benchmarks
traverse-bench --variant medium --port 7690 --duration 10 --warmup 3
Dataset Variants
| Variant | Nodes | Edges | Use Case |
|---|---|---|---|
| small | ~10K | ~100K | Quick smoke tests |
| medium | ~100K | ~1M | Standard benchmarks |
| large | ~1.6M | ~30M | Production-scale tests |
All variants use the Pokec social network dataset with :User nodes and :Friend edges.
CLI Options
| Flag | Description | Default |
|---|---|---|
| --variant <NAME> | Dataset variant: small, medium, large | small |
| --duration <SECS> | Measurement time per query (seconds) | 10 |
| --warmup <N> | Warmup iterations before measurement | 3 |
| --concurrency <N> | Number of parallel workers | 1 |
| --groups <LIST> | Comma-separated query groups or individual query names | All groups |
| --list-queries | List all available queries and groups, then exit | — |
| --host <ADDR> | Bolt server hostname | 127.0.0.1 |
| --port <PORT> | Bolt port | 7690 |
| --auth <USER:PASS> | Authentication credentials | — |
| --server-pid <PID> | Server PID for resource profiling (auto-detected if omitted) | — |
| --output, -o <FILE> | JSON output file path | — |
| --import <VARIANT\|FILE> | Import data and exit (variant name or .cypher file path) | — |
| --download <VARIANT> | Download dataset and exit | — |
| --format <FMT> | Download format: cypher or csv | cypher |
| --download-dir <DIR> | Output directory for downloads | benchmarks/data/dataset_cache |
Download
# Download as Cypher (for pipelined Bolt import)
traverse-bench --download medium
# Download as CSV (for Neo4j native import)
traverse-bench --download large --format csv
# Custom output directory
traverse-bench --download medium --download-dir /tmp/datasets
Import
Import a downloaded dataset (or a custom .cypher file) into a running server via pipelined Bolt (512 statements per batch):
# Import a variant
traverse-bench --import medium --port 7690
# Import a custom file
traverse-bench --import /path/to/data.cypher --port 7690
Indexes are created automatically before data import.
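The batching described above can be pictured with a short Python sketch. This is not the tool's implementation: the helper name `batch_statements` and the one-statement-per-line layout are assumptions made for illustration, and only the batch size of 512 comes from the text.

```python
from itertools import islice
from typing import Iterable, Iterator, List

BATCH_SIZE = 512  # statements per pipelined batch, as stated above


def batch_statements(
    statements: Iterable[str], size: int = BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` non-empty statements.

    Hypothetical helper: assumes one statement per input line and
    skips blank lines.
    """
    stripped = (s.strip() for s in statements)
    non_empty = (s for s in stripped if s)
    while True:
        batch = list(islice(non_empty, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # 1,200 illustrative statements split into batches of 512, 512, and 176.
    demo = [f"CREATE (:User {{id: {i}}});" for i in range(1200)]
    print([len(b) for b in batch_statements(demo)])
```

Each yielded batch would correspond to one group of statements sent to the server before waiting for its responses, which is what makes a pipelined import cheaper than one round trip per statement.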
Importing into Neo4j
Against Neo4j, the pipelined Bolt import is slow on larger datasets and can take hours for the large variant. Download the dataset as CSV instead and bulk-load it with neo4j-admin database import, which is much faster:
# Download as CSV files prepared for neo4j-admin
traverse-bench --download large --format csv
# Use neo4j-admin for fast import
neo4j-admin database import full \
--nodes=User=benchmarks/data/dataset_cache/pokec_large_nodes.csv \
--relationships=Friend=benchmarks/data/dataset_cache/pokec_large_edges.csv \
neo4j
The CSV files include the headers required by neo4j-admin (:ID, :START_ID, :END_ID, etc.).
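To make the header convention concrete, here is a minimal Python sketch that writes tiny node and edge files in the shape neo4j-admin expects. The column name `id` and the sample rows are hypothetical. The files produced by traverse-bench may carry different or additional columns, so inspect the first line of the downloaded files rather than relying on this layout.

```python
import csv
import io
from typing import Iterable, Tuple


def nodes_csv(user_ids: Iterable[int]) -> str:
    """Render a node file whose header marks the identifier column with :ID."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id:ID"])  # "id" is an assumed property name
    for uid in user_ids:
        writer.writerow([uid])
    return buf.getvalue()


def edges_csv(pairs: Iterable[Tuple[int, int]]) -> str:
    """Render an edge file with :START_ID and :END_ID header columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([":START_ID", ":END_ID"])
    for start, end in pairs:
        writer.writerow([start, end])
    return buf.getvalue()


if __name__ == "__main__":
    print(nodes_csv([1, 2, 3]), end="")
    print(edges_csv([(1, 2), (2, 3)]), end="")
```

The values in :START_ID and :END_ID must match values in the node file's :ID column, which is how the bulk importer links each edge to its endpoints.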
Query Groups
Queries are organized into groups. Use --list-queries to see all available queries and groups.
| Group | Queries | Description |
|---|---|---|
| read | 5 | Point lookups, property reads, short pattern matches |
| expansion | 13 | Multi-hop traversals (1–4 hops) with and without filters |
| scan | 8 | Full scans, scan+expand, scan+aggregate |
| aggregate | 4 | count, min/max/avg, group by |
| shortest_path | 2 | shortestPath and allShortestPaths |
| update | 1 | SET property on matched node |
| write | 2 | CREATE node and CREATE edge |
# Run only read and expansion groups
traverse-bench --variant medium --groups read,expansion
# Run a single query by name
traverse-bench --variant medium --groups single_vertex_read
Output Metrics
Each query produces:
- Iterations — total queries executed during measurement
- Errors — failed queries
- Throughput — queries per second (QPS)
- Latency — avg, p50, p95, p99, max (milliseconds)
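As a rough illustration of how the metrics listed above relate to one another, the following Python sketch derives them from a list of per-query latencies. It rests on assumptions: the nearest-rank percentile method, the function names, and the choice to pass the error count separately are illustrative, and the tool may compute its figures differently.

```python
import math
from statistics import mean
from typing import Dict, List


def percentile(sorted_ms: List[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending, non-empty list (assumed method)."""
    rank = max(1, math.ceil(pct * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def summarize(
    latencies_ms: List[float], errors: int, duration_s: float
) -> Dict[str, float]:
    """Summarize one query's measurement window.

    `latencies_ms` holds one latency per executed query and must be
    non-empty; `duration_s` is the measurement time (see --duration).
    """
    ordered = sorted(latencies_ms)
    return {
        "iterations": len(ordered),
        "errors": errors,
        "throughput_qps": len(ordered) / duration_s,
        "avg_ms": mean(ordered),
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "max_ms": ordered[-1],
    }


if __name__ == "__main__":
    # Synthetic latencies of 1..100 ms measured over a 10-second window.
    sample = [float(i) for i in range(1, 101)]
    for key, value in summarize(sample, errors=0, duration_s=10.0).items():
        print(f"{key}: {value}")
```

Reporting p95 and p99 next to the average matters because a handful of slow queries can leave the average nearly unchanged while dominating the tail.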
When --server-pid is provided (or auto-detected), resource profiling tracks CPU and memory usage per query.
Use --output to save results as JSON for comparison:
traverse-bench --variant medium --output results.json
Cross-Database Comparison
Because traverse-bench uses the Bolt protocol, it can benchmark any Bolt-compatible database. Run the same benchmark against different servers by changing --port:
# Benchmark Traverse (port 7690)
traverse-bench --variant medium --port 7690 -o traverse.json
# Benchmark Neo4j (port 7687)
traverse-bench --variant medium --port 7687 -o neo4j.json
# Benchmark Memgraph (port 7689)
traverse-bench --variant medium --port 7689 -o memgraph.json
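When the same run is repeated against several servers, a small driver script keeps the invocations consistent. The Python sketch below only builds and prints the commands shown above. The `TARGETS` mapping and the `bench_command` helper are illustrative names, and the ports are the ones used in the examples, so adjust them to your deployment.

```python
import shlex
from typing import Dict, List

# Ports taken from the examples above; these are not fixed by the tool.
TARGETS: Dict[str, int] = {
    "traverse": 7690,
    "neo4j": 7687,
    "memgraph": 7689,
}


def bench_command(name: str, port: int, variant: str = "medium") -> List[str]:
    """Build the argument list for one run, writing results to <name>.json."""
    return [
        "traverse-bench",
        "--variant", variant,
        "--port", str(port),
        "-o", f"{name}.json",
    ]


if __name__ == "__main__":
    for name, port in TARGETS.items():
        # Printed rather than executed; to run a command, pass the list to
        # subprocess.run(cmd, check=True).
        print(shlex.join(bench_command(name, port)))
```

Building each command as an argument list rather than a single string avoids quoting problems if a file name or credential ever contains spaces.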