Estimated Diameter
The diameter of a graph is the worst-case length of a shortest path between any pair of vertices in a graph. It is the farthest distance to travel, to get from one vertex to another, if you always take the shortest path. Finding the diameter requires calculating (the lengths of) all shortest paths, which can be quite slow.
This algorithm uses a simple heuristic to estimate the diameter. Rather than calculating the distance from each vertex to every other vertex, it selects \$K\$ vertices randomly, where \$K\$ is a user-provided parameter. It calculates the distances from each of these \$K\$ vertices to all other vertices. So, instead of calculating \$V*(V-1)\$ distances, this algorithm only calculates \$K*(V-1)\$ distances. The higher the value of \$K\$, the greater the likelihood of finding the true diameter.
Notes
This algorithm query employs a subquery called max_BFS_depth. Both queries must be installed to run the algorithm.
The current version of this algorithm only computes unweighted distances.
Specifications
tg_estimate_diameter (SET<STRING> v_type, SET<STRING> e_type, INT seed_set_length,
BOOL print_accum = TRUE, STRING file_path = "", BOOL display = FALSE)
Parameters
Parameter | Description | Default |
---|---|---|
|
Names of vertex types to use |
(empty set of strings) |
|
Names of edge types to use |
(empty set of strings) |
|
The number \$k\$ of random seed vertices to use |
10 |
|
If True, output JSON to standard output |
False |
|
If not empty, write output to this file |
(empty string) |
Example
We can estimate the diameter of the graph included in the Shortest Path Algorithms TGCloud Starter Kit.
This graph contains data for 7,927 Airport
vertices and 19,257 flight_route
edges.
With seed_set_length
set to 10
, the estimated diameter returned is 9
.
With a larger seed_set_length
of 100
, the new, more accurate, estimated diameter returned is 12
,
because the greater number of randomly selected vertices happened to capture a larger worst-case scenario of graph diameter.