Graph Data Science

Traverse ships with a built-in library of 36+ graph algorithms covering centrality, community detection, pathfinding, similarity, link prediction, and node embeddings. Every algorithm is callable from Cypher with a single CALL statement.

How to call an algorithm

CALL traverse.<algorithm>.<mode>(<config>) YIELD ...

Three pieces:

<algorithm> - the algorithm name (camelCase), e.g. pageRank, dijkstra, louvain.
<mode> - how the result is delivered (see Modes).
<config> - a map of options the algorithm understands.

Worked example: PageRank

CALL traverse.pageRank.stream({maxIterations: 20})
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
LIMIT 10

Returns one row per node with its computed score. The ORDER BY ... LIMIT is a regular Cypher tail; the algorithm just produces the stream.

Modes

Mode	Returns	Use when
`stream`	One row per result entity (node, pair, path).	You want to inspect or post-process individual scores.
`stats`	A single summary row.	You want aggregate measures (mean, max, community count) without materializing every score.
`write`	A single summary row, plus a write-back of the result to a node property.	You want the algorithm output to be queryable as part of the graph afterwards (e.g. `n.pagerank`).

Not every algorithm supports every mode. The catalog tables below show which are available per algorithm.

Stream

CALL traverse.pageRank.stream({maxIterations: 20})
YIELD nodeId, score
RETURN nodeId, score LIMIT 10

Stats

CALL traverse.pageRank.stats({maxIterations: 20})
YIELD nodeCount, min, max, mean, computeMillis
RETURN nodeCount, min, max, mean, computeMillis

Write

CALL traverse.pageRank.write({maxIterations: 20, writeProperty: 'pagerank'})
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten

After running the write mode, the score is queryable like any other property:

MATCH (n:User) RETURN n.name, n.pagerank ORDER BY n.pagerank DESC LIMIT 10

Common config

Most algorithms accept this set of generic options in addition to their algorithm-specific config:

Option	Type	Description
`nodeLabels`	list of strings	Restrict the algorithm to nodes carrying any of these labels. Empty / omitted = all nodes.
`relationshipTypes`	list of strings	Restrict the algorithm to edges of these types. Empty / omitted = all edges.
`orientation`	`NATURAL` / `REVERSE` / `UNDIRECTED`	How edge direction is treated. Default `NATURAL`.
`writeProperty`	string	Required in `write` mode. Name of the property to set on each affected node.

Algorithm catalog

Centrality

Score nodes by how "important" they are in the graph.

Name	Display	What it computes	Modes
`pageRank`	PageRank	Rank by inbound influence following a damped random walk.	stream / stats / write
`articleRank`	Article Rank	PageRank variant penalizing high-out-degree neighbors.	stream / stats / write
`eigenvector`	Eigenvector Centrality	Power-iteration on the adjacency matrix.	stream / stats / write
`closeness`	Closeness Centrality	Inverse of mean shortest-path distance to all other nodes.	stream / stats / write
`harmonicCloseness`	Harmonic Closeness	Closeness variant that handles disconnected graphs.	stream / stats / write
`betweenness`	Betweenness Centrality	Fraction of shortest paths each node lies on (Brandes).	stream / stats / write
`hits`	HITS (Hubs and Authorities)	Two scores per node: hub and authority.	stream / stats / write
`degree`	Degree Centrality	Count of incident edges per node.	stream / stats / write
`celf`	CELF Influence Maximization	Greedy seed-set selection under the linear-threshold model.	stream / stats

Community detection

Group nodes into communities based on connectivity.

Name	Display	What it computes	Modes
`louvain`	Louvain	Modularity-maximizing community detection (multilevel).	stream / stats / write
`leiden`	Leiden	Louvain refinement that guarantees well-connected communities.	stream / stats / write
`labelPropagation`	Label Propagation	Iterative neighbor-majority label assignment.	stream / stats / write
`wcc`	Weakly Connected Components	Component id ignoring edge direction.	stream / stats / write
`scc`	Strongly Connected Components	Component id respecting edge direction (Tarjan).	stream / stats / write
`kCore`	K-Core Decomposition	Per-node k-core number.	stream / stats / write
`triangleCount`	Triangle Count + LCC	Triangle count per node and local clustering coefficient.	stream / stats / write
`conductance`	Conductance	Per-community conductance, given an existing community assignment.	stats
`modularity`	Modularity Score	Single modularity number, given an existing community assignment.	stats

Paths and traversal

Find shortest paths, multi-hop traversals, and tree structures.

Name	Display	What it computes	Modes
`dijkstra`	Dijkstra Shortest Path	Single-source shortest paths with non-negative weights.	stream / stats
`aStar`	A* Shortest Path	Heuristic-guided shortest path between two nodes.	stream / stats
`bellmanFord`	Bellman-Ford	Single-source shortest paths supporting negative weights.	stream / stats
`yens`	Yen's K-Shortest Paths	The k loop-less shortest paths between two nodes.	stream / stats
`allShortestPaths`	All-Pairs Shortest Paths	Shortest path between every reachable pair of source-set nodes.	stream / stats
`deltaStepping`	Δ-Stepping	Parallel single-source shortest paths.	stream / stats
`bfs`	Breadth-First Search	BFS traversal from a source node.	stream / stats
`dfs`	Depth-First Search	DFS traversal from a source node.	stream / stats
`randomWalk`	Random Walk	Sample weighted random walks of fixed length from each source.	stream / stats
`spanningTree`	Spanning Tree (Kruskal)	Minimum spanning tree of the graph.	stream / stats
`steinerTree`	Steiner Tree	Minimum-weight tree connecting a set of terminal nodes (2-approx).	stream / stats

DAG

Algorithms that require a directed acyclic graph.

Name	Display	What it computes	Modes
`topologicalSort`	Topological Sort	Linear ordering of nodes respecting edge direction.	stream / stats / write
`longestPath`	Longest Path (DAG)	Per-node longest path length from any source.	stream / stats / write

Similarity

Score how similar two nodes are.

Name	Display	What it computes	Modes
`nodeSimilarity`	Node Similarity (Jaccard)	Pairs of nodes ranked by Jaccard over their neighborhoods.	stream / stats
`cosineSimilarity`	Cosine Similarity	Pairs ranked by cosine over neighbor-membership vectors. \|N(u)∩N(v)\| / sqrt(\|N(u)\|·\|N(v)\|).	stream / stats
`knn`	k-Nearest Neighbors (structural)	Top-k Jaccard-similar nodes per source, evaluated over the two-hop candidate set.	stream / stats
`knnProperty`	k-Nearest Neighbors (by node properties)	Top-k by Euclidean similarity over numeric node-property vectors (e.g. embeddings). NN-Descent random sampling, O(n·K²·d) per iteration. Matches Neo4j's `gds.knn` semantics.	stream / stats

Link prediction

Score node pairs by how likely an edge between them is.

Name	Display	What it computes	Modes
`commonNeighbors`	Common Neighbors	Count of shared neighbors.	stream
`totalNeighbors`	Total Neighbors	Union of neighborhoods.	stream
`preferentialAttachment`	Preferential Attachment	Product of node degrees.	stream
`adamicAdar`	Adamic-Adar	Common Neighbors weighted by inverse log-degree.	stream
`resourceAllocation`	Resource Allocation	Common Neighbors weighted by inverse degree.	stream
`sameCommunity`	Same Community	1 if both nodes share a community assignment, else 0.	stream

Embeddings

Produce a dense or sparse vector per node, useful as input to ML models or downstream similarity / kNN.

Name	Display	What it computes	Modes
`fastRP`	FastRP	Random-projection embeddings (very fast, surprisingly strong).	stream / stats / write
`node2vec`	Node2Vec	Biased random walks + skip-gram (Grover-Leskovec).	stream / stats / write
`hashGnn`	HashGNN	Hash-based message passing producing binary embeddings.	stream / stats / write
`graphSage`	GraphSAGE	Inductive message-passing embedding (mean aggregator).	stream / stats / write

Worked examples

1. Top-10 most central airports

Using the OpenFlights sample (airports + routes):

CALL traverse.pageRank.stream({maxIterations: 30})
YIELD nodeId, score
MATCH (a:Airport) WHERE id(a) = nodeId
RETURN a.iata_code, a.name, score
ORDER BY score DESC LIMIT 10

2. Shortest flight path from JFK to LAX

MATCH (src:Airport {iata_code:'JFK'}), (dst:Airport {iata_code:'LAX'})
CALL traverse.dijkstra.stream({
    sourceNodeId: id(src),
    targetNodeId: id(dst)
})
YIELD nodes, costs
RETURN nodes, costs

3. Detect communities in a social graph and write them back

Using the Pokec sample:

CALL traverse.louvain.write({
    writeProperty: 'community',
    maxIterations: 10
})
YIELD nodeCount, communityCount, largestCommunitySize, computeMillis
RETURN nodeCount, communityCount, largestCommunitySize, computeMillis

Afterwards the result is just a node property:

MATCH (u:User) RETURN u.community, count(*) AS members
ORDER BY members DESC LIMIT 10

4. Generate FastRP embeddings as a node property

CALL traverse.fastRP.write({
    writeProperty: 'embedding',
    embeddingDimension: 128
})
YIELD nodeCount, dimension, nodePropertiesWritten
RETURN nodeCount, dimension, nodePropertiesWritten

Afterwards the vector is just another node property:

MATCH (n) RETURN id(n), n.embedding LIMIT 5

5. Find structurally-similar users via kNN

The structural knn ranks each node's neighbors by Jaccard overlap of their two-hop neighborhoods - useful for "people who follow similar accounts" style recommendations:

CALL traverse.knn.stream({topK: 5, similarityCutoff: 0.1})
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC LIMIT 20

6. Embedding-space recommendations via FastRP + knnProperty

Step 1 writes a 128-dim FastRP embedding to each node, step 2 finds the top-5 nearest neighbors in that vector space. This matches the canonical Neo4j GDS feature-similarity pipeline:

// 1. Generate 128-dim embeddings
CALL traverse.fastRP.write({
    writeProperty: 'embedding',
    embeddingDimension: 128
})
YIELD nodeCount, dimension, nodePropertiesWritten
RETURN nodeCount, dimension, nodePropertiesWritten;

// 2. Top-5 similar nodes in embedding space
CALL traverse.knnProperty.stream({
    nodeProperties: ['embedding'],
    topK: 5,
    similarityCutoff: 0.7
})
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC LIMIT 20

For tabular features (no embedding step) pass the raw numeric properties instead: nodeProperties: ['age', 'tenure_months', 'avg_session_minutes'].

Stability

Each algorithm carries a stability tag visible in Studio's GDS drawer:

stable - API and output format are settled; safe to depend on.
beta - works correctly but the parameter set or output shape may evolve.
experimental - subject to change without notice.

Studio

The GDS drawer in Studio exposes the same catalog with a form-driven config builder. It writes the same CALL traverse.<algo>.<mode>(...) Cypher you'd write by hand, so anything you do in the drawer is portable to scripts and applications.

gds.* Compatibility (0.8.0)

Alongside the native traverse.* catalog, Traverse implements the gds.* procedure surface, so scripts written for the GDS plugin run with minimal changes:

Graph catalog - gds.graph.project creates named in-memory projections (materialized as dense CSR snapshots), with gds.graph.list / gds.graph.drop to manage them
Algorithm execution on projections - the common algorithms (gds.pageRank, gds.louvain, gds.wcc, gds.nodeSimilarity, shortest paths, embeddings, and more) in stream / stats / write modes
Mutate mode - .mutate writes algorithm results as property overlays on the projection, so pipelines like fastRP.mutate → knn on the embedding compose without touching the underlying graph
Link-prediction functions - gds.alpha.linkprediction.* (Adamic-Adar, common neighbors, preferential attachment, resource allocation)

CALL gds.graph.project('social', 'Person', 'KNOWS');
CALL gds.pageRank.mutate('social', {mutateProperty: 'pr'});
CALL gds.pageRank.stream('social') YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score ORDER BY score DESC LIMIT 10;