Pathfinding

History

Path finding has a long history, and is considered to be one of the classical graph problems; it has been researched as far back as the 19th century. It gained prominence in the early 1950s in the context of ‘alternate routing’, i.e. finding a second shortest route if the shortest route is blocked.

Dijkstra came up with his algorithm in 1956 while trying to come up with something to show off the new ARMAC computers. He needed to find a problem and solution that people not familiar with computing would be able to understand, and designed what is now known as Dijkstra’s algorithm. He later implemented it for a slightly simplified transportation map of 64 cities in the Netherlands.

../../_images/Dijkstra_armac.jpg

Overview of the Paths

Overview of the pathfinding algorithms

Type

What it does

Example use

Shortest Path

Calculates the shortest path between a pair of nodes

Finding driving directions between two locations

All Pairs Shortest Path

Calculates the shortest path between all pairs of nodes in the graph

Evaluating alternate routes around a traffic jam

Single Source Shortest Path

Calculates the shorest path between a single root node and all other nodes

Least cost routing of phone calls

Minimum Spanning Tree

Calculates the path in a connected tree structure with the smallest cost for visiting all nodes

Optimizing connected routing, such as laying cable or garbage collection

Random Walk

Returns a list of nodes along a path of specified size by randomly choosing relationships to traverse.

Augmenting training for machine learning or data for graph algorithms.

Example Data: The Transport Graph

csv-file for the nodes (download)

csv_file for the connections between the cities (download)

Loading Nodes (Cities)

LOAD CSV WITH HEADERS FROM "file:///data_transport-nodes.csv"  AS row
MERGE (place:Place {id:row.id})
SET place.latitude = toFloat(row.latitude),
    place.longitude = toFloat(row.longitude),
    place.population = toInteger(row.population)

Creating the connections

LOAD CSV WITH HEADERS FROM "file:///data_transport-relationships.csv" as row
MATCH (origin:Place {id:row.src})
MATCH (destination:Place {id:row.dst})
MERGE (origin)-[:EROAD {distance: toInteger(row.cost)}]->(destination)
../../_images/map1.png

Graph Catalog

CALL gds.graph.create('myGraph','Place','EROAD',{relationshipProperties: 'distance'})

Shortest Path

The Shortest Path algorithm calculates the shortest (weighted) path between a pair of nodes.

Use Shortest Path to find optimal routes between a pair of nodes, based on either the number of hops or any weighted relationship value. For example, it can provide real-time answers about degrees of separation, the shortest distance between points, or the least expensive route. You can also use this algorithm to simply explore the connections between particular nodes.

../../_images/shortest1.png

Example use cases include:

  • Finding directions between locations (Google Maps)

  • Finding the degrees of separation between people in social networks.

  • Finding the number of degrees of separation between an actor and Kevin Bacon based on the movies they’ve appeared in (the Bacon Number). An example of this can be seen on the Oracle of Bacon website.

  • Calculation of the Erdos number of a mathenatician.

Note

You’ll notice in graph analytics the use of the terms weight, cost, distance, and hop when describing relationships and paths.

  • Weight is the numeric value of a particular property of a relationship.

  • Cost is used similarly, but we’ll see it more often when considering the total weight of a path.

  • Distance is often used within an algorithm as the name of the relationship property that indicates the cost of traversing between a pair of nodes.

  • Hop is commonly used to express the number of relationships between two nodes.

You may see some of these terms combined, as in “It’s a five-hop distance to London” or “That’s the lowest cost for the distance.”

MATCH (source:Place {id: 'Amsterdam'}), (target:Place {id: 'London'})
CALL gds.shortestPath.dijkstra.stream('myGraph', {
   sourceNode: source,
   targetNode: target,
   relationshipWeightProperty: 'distance'})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN index,
gds.util.asNode(sourceNode).name AS sourceNodeName,
gds.util.asNode(targetNode).name AS targetNodeName,
totalCost,
[nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
costs,
nodes(path) as path
ORDER BY index
../../_images/path.png

All Pairs Shortest Paths

Shortest path from every node to every other node

../../_images/shortest2.png

Single Source Shortest Path

../../_images/shortest3.png

Minimum Spanning Tree

The Minimum (Weight) Spanning Tree algorithm starts from a given node and finds all its reachable nodes and the set of relationships that connect the nodes together with the minimum possible weight. It traverses to the next unvisited node with the lowest weight from any visited node, avoiding cycles.

The Minimum Spanning Tree algorithm operates as demonstrated in the figure below:

../../_images/span1.png

Use Minimum Spanning Tree when you need the best route to visit all nodes.

Example use cases include:

Last change: Oct 30, 2023