Benchmarking

The Traverse Bench tool (traverse-bench) downloads datasets, imports data, and runs standardized benchmarks against any Bolt-compatible graph database.

Quick Start

# Download the medium pokec dataset
traverse-bench --download medium

# Import into a running server
traverse-bench --import medium --port 7690

# Run benchmarks
traverse-bench --variant medium --port 7690 --duration 10 --warmup 3
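
The three steps above can be chained in a small wrapper so that a failed download or import stops the run. This is only a sketch: run_pipeline and the BENCH variable are hypothetical names, not part of traverse-bench, and it assumes the tool exits with a non-zero status on failure, which this page does not state.

```shell
# Hypothetical helper: download, import, and benchmark one variant.
# BENCH can be overridden, for example BENCH=echo for a dry run.
BENCH="${BENCH:-traverse-bench}"

run_pipeline() {
  local variant="${1:-small}"   # small, medium, or large
  local port="${2:-7690}"       # Bolt port of the running server

  # Assumption: each step returns non-zero on failure, so && stops the chain.
  "$BENCH" --download "$variant" &&
    "$BENCH" --import "$variant" --port "$port" &&
    "$BENCH" --variant "$variant" --port "$port" --duration 10 --warmup 3
}
```

For example, run_pipeline medium 7690 runs all three steps against a local server. With BENCH set to echo, the function prints the commands instead of running them.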

Dataset Variants

Variant  Nodes  Edges  Use Case
small    ~10K   ~100K  Quick smoke tests
medium   ~100K  ~1M    Standard benchmarks
large    ~1.6M  ~30M   Production-scale tests

All variants use the Pokec social network dataset with :User nodes and :Friend edges.

CLI Options

Flag                     Description                                                   Default
--variant <NAME>         Dataset variant: small, medium, large                         small
--duration <SECS>        Measurement time per query (seconds)                          10
--warmup <N>             Warmup iterations before measurement                          3
--concurrency <N>        Number of parallel workers                                    1
--groups <LIST>          Comma-separated query groups or individual query names        All groups
--list-queries           List all available queries and groups, then exit
--host <ADDR>            Bolt server hostname                                          127.0.0.1
--port <PORT>            Bolt port                                                     7690
--auth <USER:PASS>       Authentication credentials
--server-pid <PID>       Server PID for resource profiling (auto-detected if omitted)
--output, -o <FILE>      JSON output file path
--import <VARIANT|FILE>  Import data and exit (variant name or .cypher file path)
--download <VARIANT>     Download dataset and exit
--format <FMT>           Download format: cypher or csv                                cypher
--download-dir <DIR>     Output directory for downloads                                benchmarks/data/dataset_cache

Download

# Download as Cypher (for pipelined Bolt import)
traverse-bench --download medium

# Download as CSV (for Neo4j native import)
traverse-bench --download large --format csv

# Custom output directory
traverse-bench --download medium --download-dir /tmp/datasets

Import

Import a downloaded dataset (or a custom .cypher file) into a running server via pipelined Bolt (512 statements per batch):

# Import a variant
traverse-bench --import medium --port 7690

# Import a custom file
traverse-bench --import /path/to/data.cypher --port 7690

Indexes are created automatically before data import.
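
To get a feel for the batching described above, the number of 512-statement batches a file needs can be estimated with ceiling division. The sketch below is illustrative only: batch_count is a hypothetical helper, and it assumes the .cypher file holds one statement per non-empty line, which this page does not specify.

```shell
# Hypothetical helper: estimate how many 512-statement batches an import needs.
# Assumption: one Cypher statement per non-empty line of the file.
BATCH_SIZE=512

batch_count() {
  local file="$1"
  local statements
  # Count non-blank lines; grep exits 1 when the count is 0, hence "|| true".
  statements=$(grep -c -v '^[[:space:]]*$' "$file" || true)
  # Integer ceiling division: (n + size - 1) / size
  echo $(( (statements + BATCH_SIZE - 1) / BATCH_SIZE ))
}
```

Under that assumption, a file with 1025 statements needs three batches: two full batches of 512 and one batch holding the last statement.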

Importing into Neo4j

For Neo4j, the pipelined Bolt import can be very slow on larger datasets (hours for the large variant). Instead, download the CSV format and use neo4j-admin database import for a much faster bulk load:

# Download as CSV files prepared for neo4j-admin
traverse-bench --download large --format csv

# Use neo4j-admin for fast import
neo4j-admin database import full \
  --nodes=User=benchmarks/data/dataset_cache/pokec_large_nodes.csv \
  --relationships=Friend=benchmarks/data/dataset_cache/pokec_large_edges.csv \
  neo4j

The CSV files include the headers required by neo4j-admin (:ID, :START_ID, :END_ID, etc.).

Query Groups

Queries are organized into groups. Use --list-queries to see all available queries.

Group          Queries  Description
read           5        Point lookups, property reads, short pattern matches
expansion      13       Multi-hop traversals (1–4 hops) with and without filters
scan           8        Full scans, scan+expand, scan+aggregate
aggregate      4        count, min/max/avg, group by
shortest_path  2        shortestPath and allShortestPaths
update         1        SET property on matched node
write          2        CREATE node and CREATE edge

# Run only read and expansion groups
traverse-bench --variant medium --groups read,expansion

# Run a single query by name
traverse-bench --variant medium --groups single_vertex_read
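
Running each group as its own invocation keeps the results in separate JSON files, which can make later comparison easier. The helper below is a sketch: run_groups, the BENCH variable, and the results_<group>.json naming are hypothetical choices, and only flags listed under CLI Options are used.

```shell
# Hypothetical helper: run each query group separately, one JSON file per group.
BENCH="${BENCH:-traverse-bench}"

run_groups() {
  local variant="$1"; shift     # first argument: dataset variant
  local group
  for group in "$@"; do         # remaining arguments: group or query names
    "$BENCH" --variant "$variant" --groups "$group" --output "results_${group}.json"
  done
}
```

For example, run_groups medium read expansion writes results_read.json and results_expansion.json.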

Output Metrics

Each query produces:

  • Iterations — total queries executed during measurement
  • Errors — failed queries
  • Throughput — queries per second (QPS)
  • Latency — avg, p50, p95, p99, max (milliseconds)
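
As a rough illustration of how such figures relate to raw measurements, the sketch below derives throughput and latency percentiles from a list of per-query latencies. It uses the nearest-rank percentile method as an assumption; this page does not say which method traverse-bench uses, and both function names are hypothetical.

```shell
# Nearest-rank percentile over latency samples (one value in ms per line on stdin).
# Assumption: illustrative only; traverse-bench may compute percentiles differently.
percentile() {
  local p="$1"                  # integer percentile, for example 50, 95, or 99
  sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p / 100 * NR) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Throughput in queries per second: iterations divided by measurement seconds.
throughput() {
  awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", n / d }'
}
```

With ten samples of 11, 12, 13, 14, 15, 16, 17, 18, 40, and 90 ms, this method gives a p50 of 15 ms and a p95 of 90 ms, which shows how a couple of slow queries dominate the tail percentiles while leaving the median almost untouched.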

When --server-pid is provided (or auto-detected), resource profiling tracks CPU and memory usage per query.

Use --output to save results as JSON for comparison:

traverse-bench --variant medium --output results.json

Cross-Database Comparison

Because traverse-bench uses the Bolt protocol, it can benchmark any Bolt-compatible database. Import the same dataset variant into each server, then run the same benchmark against each one by changing --port (and --host, if the server is not local):

# Benchmark Traverse (port 7690)
traverse-bench --variant medium --port 7690 -o traverse.json

# Benchmark Neo4j (port 7687)
traverse-bench --variant medium --port 7687 -o neo4j.json

# Benchmark Memgraph (port 7689)
traverse-bench --variant medium --port 7689 -o memgraph.json
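
The three runs above can also be written as a loop over name:port pairs. This is a sketch under stated assumptions: compare_servers and the BENCH variable are hypothetical names, all servers are assumed to be local, and each one is assumed to already hold the chosen variant.

```shell
# Hypothetical helper: benchmark several local Bolt servers with the same variant.
BENCH="${BENCH:-traverse-bench}"

compare_servers() {
  local variant="$1"; shift     # first argument: dataset variant
  local target name port
  for target in "$@"; do        # remaining arguments: name:port pairs
    name="${target%%:*}"        # text before the colon
    port="${target##*:}"        # text after the colon
    "$BENCH" --variant "$variant" --port "$port" -o "${name}.json"
  done
}
```

For example, compare_servers medium traverse:7690 neo4j:7687 memgraph:7689 produces traverse.json, neo4j.json, and memgraph.json.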