Graph Data Science

Traverse ships with a built-in library of 35+ graph algorithms covering centrality, community detection, pathfinding, similarity, link prediction, and node embeddings. Every algorithm is callable from Cypher with a single CALL statement.

How to call an algorithm

CALL traverse.<algorithm>.<mode>(<config>) YIELD ...

Three pieces:

  • <algorithm> – the algorithm name (camelCase), e.g. pageRank, dijkstra, louvain.
  • <mode> – how the result is delivered (see Modes).
  • <config> – a map of options the algorithm understands.

Worked example: PageRank

CALL traverse.pageRank.stream({maxIterations: 20})
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
LIMIT 10

Returns one row per node with its computed score. The ORDER BY ... LIMIT is a regular Cypher tail; the algorithm just produces the stream.

Modes

ModeReturnsUse when
streamOne row per result entity (node, pair, path).You want to inspect or post-process individual scores.
statsA single summary row.You want aggregate measures (mean, max, community count) without materializing every score.
writeA single summary row, plus a write-back of the result to a node property.You want the algorithm output to be queryable as part of the graph afterwards (e.g. n.pagerank).

Not every algorithm supports every mode. The catalog tables below show which are available per algorithm.

Stream

CALL traverse.pageRank.stream({maxIterations: 20})
YIELD nodeId, score
RETURN nodeId, score LIMIT 10

Stats

CALL traverse.pageRank.stats({maxIterations: 20})
YIELD nodeCount, min, max, mean, computeMillis
RETURN nodeCount, min, max, mean, computeMillis

Write

CALL traverse.pageRank.write({maxIterations: 20, writeProperty: 'pagerank'})
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten

After running the write mode, the score is queryable like any other property:

MATCH (n:User) RETURN n.name, n.pagerank ORDER BY n.pagerank DESC LIMIT 10

Common config

Most algorithms accept this set of generic options in addition to their algorithm-specific config:

OptionTypeDescription
nodeLabelslist of stringsRestrict the algorithm to nodes carrying any of these labels. Empty / omitted = all nodes.
relationshipTypeslist of stringsRestrict the algorithm to edges of these types. Empty / omitted = all edges.
orientationNATURAL / REVERSE / UNDIRECTEDHow edge direction is treated. Default NATURAL.
writePropertystringRequired in write mode. Name of the property to set on each affected node.

Algorithm catalog

Centrality

Score nodes by how "important" they are in the graph.

NameDisplayWhat it computesModes
pageRankPageRankRank by inbound influence following a damped random walk.stream / stats / write
articleRankArticle RankPageRank variant penalizing high-out-degree neighbors.stream / stats / write
eigenvectorEigenvector CentralityPower-iteration on the adjacency matrix.stream / stats / write
closenessCloseness CentralityInverse of mean shortest-path distance to all other nodes.stream / stats / write
harmonicClosenessHarmonic ClosenessCloseness variant that handles disconnected graphs.stream / stats / write
betweennessBetweenness CentralityFraction of shortest paths each node lies on (Brandes).stream / stats / write
hitsHITS (Hubs and Authorities)Two scores per node: hub and authority.stream / stats / write
degreeDegree CentralityCount of incident edges per node.stream / stats / write
celfCELF Influence MaximizationGreedy seed-set selection under the linear-threshold model.stream / stats

Community detection

Group nodes into communities based on connectivity.

NameDisplayWhat it computesModes
louvainLouvainModularity-maximizing community detection (multilevel).stream / stats / write
leidenLeidenLouvain refinement that guarantees well-connected communities.stream / stats / write
labelPropagationLabel PropagationIterative neighbor-majority label assignment.stream / stats / write
wccWeakly Connected ComponentsComponent id ignoring edge direction.stream / stats / write
sccStrongly Connected ComponentsComponent id respecting edge direction (Tarjan).stream / stats / write
kCoreK-Core DecompositionPer-node k-core number.stream / stats / write
triangleCountTriangle Count + LCCTriangle count per node and local clustering coefficient.stream / stats / write
conductanceConductancePer-community conductance, given an existing community assignment.stats
modularityModularity ScoreSingle modularity number, given an existing community assignment.stats

Paths and traversal

Find shortest paths, multi-hop traversals, and tree structures.

NameDisplayWhat it computesModes
dijkstraDijkstra Shortest PathSingle-source shortest paths with non-negative weights.stream / stats
aStarA* Shortest PathHeuristic-guided shortest path between two nodes.stream / stats
bellmanFordBellman-FordSingle-source shortest paths supporting negative weights.stream / stats
yensYen's K-Shortest PathsThe k loop-less shortest paths between two nodes.stream / stats
allShortestPathsAll-Pairs Shortest PathsShortest path between every reachable pair of source-set nodes.stream / stats
deltaSteppingΔ-SteppingParallel single-source shortest paths.stream / stats
bfsBreadth-First SearchBFS traversal from a source node.stream / stats
dfsDepth-First SearchDFS traversal from a source node.stream / stats
randomWalkRandom WalkSample weighted random walks of fixed length from each source.stream / stats
spanningTreeSpanning Tree (Kruskal)Minimum spanning tree of the graph.stream / stats
steinerTreeSteiner TreeMinimum-weight tree connecting a set of terminal nodes (2-approx).stream / stats

DAG

Algorithms that require a directed acyclic graph.

NameDisplayWhat it computesModes
topologicalSortTopological SortLinear ordering of nodes respecting edge direction.stream / stats / write
longestPathLongest Path (DAG)Per-node longest path length from any source.stream / stats / write

Similarity

Score how similar two nodes are.

NameDisplayWhat it computesModes
nodeSimilarityNode Similarity (Jaccard)Pairs of nodes ranked by Jaccard over their neighborhoods.stream / stats
cosineSimilarityCosine SimilarityPairs ranked by cosine over neighbor-membership vectors. |N(u)∩N(v)| / sqrt(|N(u)|·|N(v)|).stream / stats
knnk-Nearest Neighbors (structural)Top-k Jaccard-similar nodes per source, evaluated over the two-hop candidate set.stream / stats
knnPropertyk-Nearest Neighbors (by node properties)Top-k by Euclidean similarity over numeric node-property vectors (e.g. embeddings). NN-Descent random sampling, O(n·K²·d) per iteration. Matches Neo4j's gds.knn semantics.stream / stats

Link prediction

Score node pairs by how likely an edge between them is.

NameDisplayWhat it computesModes
commonNeighborsCommon NeighborsCount of shared neighbors.stream
totalNeighborsTotal NeighborsUnion of neighborhoods.stream
preferentialAttachmentPreferential AttachmentProduct of node degrees.stream
adamicAdarAdamic-AdarCommon Neighbors weighted by inverse log-degree.stream
resourceAllocationResource AllocationCommon Neighbors weighted by inverse degree.stream
sameCommunitySame Community1 if both nodes share a community assignment, else 0.stream

Embeddings

Produce a dense or sparse vector per node, useful as input to ML models or downstream similarity / kNN.

NameDisplayWhat it computesModes
fastRPFastRPRandom-projection embeddings (very fast, surprisingly strong).stream / stats / write
node2vecNode2VecBiased random walks + skip-gram (Grover-Leskovec).stream / stats / write
hashGnnHashGNNHash-based message passing producing binary embeddings.stream / stats / write
graphSageGraphSAGEInductive message-passing embedding (mean aggregator).stream / stats / write

Worked examples

1. Top-10 most central airports

Using the OpenFlights sample (airports + routes):

CALL traverse.pageRank.stream({maxIterations: 30})
YIELD nodeId, score
MATCH (a:Airport) WHERE id(a) = nodeId
RETURN a.iata_code, a.name, score
ORDER BY score DESC LIMIT 10

2. Shortest flight path from JFK to LAX

MATCH (src:Airport {iata_code:'JFK'}), (dst:Airport {iata_code:'LAX'})
CALL traverse.dijkstra.stream({
    sourceNodeId: id(src),
    targetNodeId: id(dst)
})
YIELD nodes, costs
RETURN nodes, costs

3. Detect communities in a social graph and write them back

Using the Pokec sample:

CALL traverse.louvain.write({
    writeProperty: 'community',
    maxIterations: 10
})
YIELD nodeCount, communityCount, largestCommunitySize, computeMillis
RETURN nodeCount, communityCount, largestCommunitySize, computeMillis

Afterwards the result is just a node property:

MATCH (u:User) RETURN u.community, count(*) AS members
ORDER BY members DESC LIMIT 10

4. Generate FastRP embeddings as a node property

CALL traverse.fastRP.write({
    writeProperty: 'embedding',
    embeddingDimension: 128
})
YIELD nodeCount, dimension, nodePropertiesWritten
RETURN nodeCount, dimension, nodePropertiesWritten

Afterwards the vector is just another node property:

MATCH (n) RETURN id(n), n.embedding LIMIT 5

5. Find structurally-similar users via kNN

The structural knn ranks each node's neighbors by Jaccard overlap of their two-hop neighborhoods — useful for "people who follow similar accounts" style recommendations:

CALL traverse.knn.stream({topK: 5, similarityCutoff: 0.1})
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC LIMIT 20

6. Embedding-space recommendations via FastRP + knnProperty

Step 1 writes a 128-dim FastRP embedding to each node, step 2 finds the top-5 nearest neighbors in that vector space. This matches the canonical Neo4j GDS feature-similarity pipeline:

// 1. Generate 128-dim embeddings
CALL traverse.fastRP.write({
    writeProperty: 'embedding',
    embeddingDimension: 128
})
YIELD nodeCount, dimension, nodePropertiesWritten
RETURN nodeCount, dimension, nodePropertiesWritten;

// 2. Top-5 similar nodes in embedding space
CALL traverse.knnProperty.stream({
    nodeProperties: ['embedding'],
    topK: 5,
    similarityCutoff: 0.7
})
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC LIMIT 20

For tabular features (no embedding step) pass the raw numeric properties instead: nodeProperties: ['age', 'tenure_months', 'avg_session_minutes'].

Stability

Each algorithm carries a stability tag visible in Studio's GDS drawer:

  • stable – API and output format are settled; safe to depend on.
  • beta – works correctly but the parameter set or output shape may evolve.
  • experimental – subject to change without notice.

Studio

The GDS drawer in Studio exposes the same catalog with a form-driven config builder. It writes the same CALL traverse.<algo>.<mode>(...) Cypher you'd write by hand, so anything you do in the drawer is portable to scripts and applications.

See also