Weighted PageRank

Supported Graph Characteristics

Weighted edges

Directed edges

Undirected edges

Homogeneous vertex types

The only difference between weighted PageRank and standard PageRank is that edges have weights, and the influence that a vertex receives from an in-neighbor is multiplied by the weight of the in-edge.

Specifications

tg_pagerank_wt (SET<STRING> v_type, SET<STRING> e_type, STRING wt_attr,
  FLOAT max_change=0.001, INT max_iter=25, FLOAT damping=0.85, INT top_k=100,
   BOOL print_accum = TRUE, STRING result_attr =  "", STRING file_path = "",
   BOOL display_edges = FALSE)

Parameters

Parameter Description Default

STRING v_type

Names of vertex types to use

(empty string)

STRING e_type

Names of edge types to use

(empty string)

STRING wt_attr

Name of edge weight attribute

(empty string)

FLOAT max_change

PageRank will stop iterating when the largest difference between any vertex’s current score and its previous score ≤ max_change. That is, the scores have become very stable and are changing by less than max_change from one iteration to the next.

0.001

INT max_iter

Maximum number of iterations.

25

FLOAT damping

Fraction of score that is due to the score of neighbors. The balance (1 - damping) is a minimum baseline score that every vertex receives.

0.85

INT top_k

Sort the scores highest first and output only this many scores

100

BOOL print_accum

If True, output JSON to standard output

True

STRING result_attr

If not empty, store PageRank values in FLOAT format to this vertex attribute

(empty string)

STRING file_path+

If not empty, write output to this file.

(empty string)

BOOL display_edges

If true, include the graph’s edges in the JSON output, so that the full graph can be displayed.

False

Output

Computes a weighted PageRank value (FLOAT type) for each vertex.

The result size is equal to \$V\$, the number of vertices in the graph, because a value is computed for every vertex.

Time complexity

This algorithm has a time complexity of \$O(E*k)\$ where \$E\$ is the number of edges and \$k\$ is the number of iterations.

The number of iterations is data-dependent, but the user can set a maximum. Parallel processing reduces the time needed for computation.