Computing personalized PageRank quickly by exploiting graph structures

Maehara, Takanori; Akiba, Takuya; Iwata, Y.; Kawarabayashi, Ken-ichi

doi:10.14778/2732977.2732978

Cited by 57 publications

(33 citation statements)

References 44 publications

(49 reference statements)


“…In addition, considerable efforts [13,27,32,34,37,45] have been made to investigate algorithms for single-source PPR queries. The methods proposed are mostly built upon the power method [34], which is a matrix-based iterative algorithm that can answer single-source PPR queries with any given threshold $\epsilon_a$ on the absolute errors of PPR estimations.…”

Section: Other Related Work (mentioning)

confidence: 99%
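
The power method mentioned in the statement above can be sketched in a few lines. This is a minimal illustration, not the implementation of any cited paper: the function name `ppr_power_iteration`, the adjacency-dict input, the teleport probability `alpha=0.15`, the L1 stopping rule `eps`, and the handling of dangling nodes are all assumptions made for the example.

```python
def ppr_power_iteration(out_neighbors, source, alpha=0.15, eps=1e-8, max_iter=1000):
    """Single-source PPR by power iteration (illustrative sketch).

    out_neighbors: dict mapping every node to its list of out-neighbours.
    alpha: teleport (restart) probability; an assumed value.
    eps: stop once the L1 change between two iterations is below this.
    """
    nodes = list(out_neighbors)
    pi = {v: 0.0 for v in nodes}
    pi[source] = 1.0
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] += alpha  # restart mass returns to the source
        for u in nodes:
            nbrs = out_neighbors[u]
            if nbrs:
                share = (1.0 - alpha) * pi[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:
                # Dangling node: send its mass back to the source
                # (one common convention, assumed here).
                nxt[source] += (1.0 - alpha) * pi[u]
        delta = sum(abs(nxt[v] - pi[v]) for v in nodes)
        pi = nxt
        if delta < eps:
            break
    return pi


graph = {0: [1, 2], 1: [2], 2: [0]}
scores = ppr_power_iteration(graph, source=0)
```

Each iteration touches every edge once, which is why the cost grows with the whole graph rather than with the neighbourhood of the source.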


TopPPR. Wei, Xiao, et al. 2018. Proceedings of the 2018 International Conference on Management of Data.


“…In particular, matrix-based methods [13,15,27,32,37,45] formulate PPR as the solution to a linear system, and they apply matrix optimization approaches to reduce query costs. Local update methods [7, 8, 17-20, 26, 41], on the other hand, utilize graph traversals instead of matrix operations for PPR computation.…”

Section: Introduction (mentioning)

confidence: 99%
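
To make the contrast with matrix-based methods concrete, here is a sketch of one well-known local update scheme, usually called forward push. Whether the works cited as [7, 8, 17-20, 26, 41] use exactly this variant is not stated in the excerpt; the function name `forward_push`, the threshold `r_max`, and `alpha=0.15` are assumptions for illustration.

```python
from collections import deque


def forward_push(out_neighbors, source, alpha=0.15, r_max=1e-4):
    """Approximate single-source PPR by local pushes (illustrative sketch).

    Returns (reserve, residue). reserve[v] underestimates the PPR of v;
    residue[v] is probability mass that has not been pushed yet.
    Nodes without out-neighbours are never pushed in this sketch.
    """
    reserve = {}
    residue = {source: 1.0}
    queue = deque([source])
    queued = {source}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        nbrs = out_neighbors.get(u, [])
        r_u = residue.get(u, 0.0)
        if not nbrs or r_u < r_max * len(nbrs):
            continue
        # Keep an alpha fraction at u, spread the rest over its out-neighbours.
        residue[u] = 0.0
        reserve[u] = reserve.get(u, 0.0) + alpha * r_u
        share = (1.0 - alpha) * r_u / len(nbrs)
        for v in nbrs:
            residue[v] = residue.get(v, 0.0) + share
            deg_v = len(out_neighbors.get(v, []))
            if deg_v and residue[v] >= r_max * deg_v and v not in queued:
                queue.append(v)
                queued.add(v)
    return reserve, residue


graph = {0: [1, 2], 1: [2], 2: [0]}
reserve, residue = forward_push(graph, source=0, r_max=1e-6)
```

Only nodes whose residue exceeds the threshold are ever visited, so the work is driven by graph traversal near the source rather than by operations on the full matrix.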

TopPPR. Wei, Xiao, et al. 2018. Proceedings of the 2018 International Conference on Management of Data.

“…Naturally, there is a trade-off between the number of partitions created for the input graph G and the accuracy: the higher the number of partitions, the faster the runtime execution (and smaller the memory requirement), but the higher the drop in accuracy. Recently, [31] proposed a fast Personalized PageRank algorithm: firstly the graph is decomposed into two parts: a core part which behaves like an expander graph and thus making the convergence of an iterative method very fast, and an almost a tree part. Authors suggest to rely on LU decomposition [18], which is a matrix factorization of the form A = LU, where L is a lower triangular matrix with unit diagonals and U is an upper triangular matrix.…”

Section: Obtaining Pagerank and Personalized Pagerank Scores (mentioning)

confidence: 99%
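
The factorization A = LU described in the statement above can be illustrated on the PPR linear system itself. The sketch below is a toy, dense, pure-Python version: it factors the whole matrix of a three-node graph and does not reproduce the core/tree split of the cited algorithm. The helper names (`lu_decompose`, `lu_solve`, `ppr_system`), the value `alpha = 0.15`, and the assumption that nodes are labelled 0..n-1 with no dangling nodes are all hypothetical. Pivoting is omitted, which is safe here because I − (1 − alpha)·Pᵀ is strictly column diagonally dominant whenever alpha > 0.

```python
def lu_decompose(a):
    """Doolittle LU without pivoting: A = L U, with L unit lower triangular."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


def lu_solve(lower, upper, b):
    """Solve L U x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(lower[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / upper[i][i]
    return x


def ppr_system(out_neighbors, alpha):
    """Build A = I - (1 - alpha) * P^T for nodes labelled 0..n-1."""
    n = len(out_neighbors)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, nbrs in out_neighbors.items():
        for v in nbrs:
            a[v][u] -= (1.0 - alpha) / len(nbrs)
    return a


graph = {0: [1, 2], 1: [2], 2: [0]}
alpha = 0.15
L, U = lu_decompose(ppr_system(graph, alpha))
# Factor once, then reuse L and U for any source: the right-hand side is alpha * e_source.
pi_from_0 = lu_solve(L, U, [alpha, 0.0, 0.0])
pi_from_1 = lu_solve(L, U, [0.0, alpha, 0.0])
```

The appeal of a factorization is visible in the last two lines: once L and U are available, each additional query costs only two triangular solves.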

“…Thanks to the possibility of converting (as discussed in Section 4.3.1) a given seed-set maximal PPR computation task from an expensive optimization problem into a set of linear equations of the form $\pi_i = (1-\beta)\,T_G\,\pi_i + \beta s_i$, the proposed RPR measures can also easily leverage other random-walks with restart approximation techniques, such as [31]: one only needs to replace the computation of $\pi_i$ with the selected approximate PPR technique to obtain an approximation of $\pi_i$. Most importantly, though, the experimental results show that teleportation-discounting not only increases robustness of PPR scores against noise in the seed set S but also significantly improves robustness against noise introduced due to approximate computation.…”

Section: Approximate RPR With Fast Random Walk (mentioning)

confidence: 99%

Reducing seed noise in personalized PageRank. Huang, Candan, et al. 2016. Soc. Netw. Anal. Min.

Network based recommendation systems leverage the topology of the underlying graph and the current user context to rank objects in the database. Random-walk based techniques, such as PageRank, encode the structure of the graph in the form of a transition matrix of a stochastic process from which the significances of the nodes in the graph are inferred. Personalized PageRank (PPR) techniques complement this with a seed node set which serves as the personalization context. In this paper, we note (and experimentally show) that PPR algorithms that do not differentiate among the seed nodes may not properly rank nodes in situations where the seed set is incomplete and/or noisy. To tackle this problem, we propose alternative robust personalized PageRank (RPR) strategies, which are insensitive to noise in the set of seed nodes and in which the rankings are not overly biased towards the seed nodes. In particular, we show that novel teleportation discounting and seed-set maximal PPR techniques help eliminate harmful bias of individual seed nodes and provide effective seed differentiation to lead to more accurate rankings. We also show that the proposed techniques lead to efficient implementations, where existing approximation algorithms and/or parallel implementations for computing the PPR scores can be easily leveraged. Moreover, the proposed formulations are reuse-promoting in the sense that, it is possible to divide the work relative to individual seed nodes and cache the intermediary results obtained during the computation, and especially in systems with large query throughputs, it may be possible to cluster queries based on the partial overlaps between the seed sets and reduce the overall robust PPR computation costs. Experiment results show that the proposed techniques are efficient and highly effective in improving recommendations and eliminating unwanted bias due to imperfections in the seed set.


“…Path-length based definitions, such as those used by Palmer et al (2006), Boldi et al (2011), Cohen et al (2003), Wei (2010), Xiao et al (2009), Zhou et al (2009), are useful when the relatedness can be captured solely based on the properties of the nodes and edges on the shortest path (based on some definition of path-length). Random-walk based definitions, such as hitting distance (Chen et al, 2008; Mei et al, 2008) and personalized PageRank (PPR) score (Balmin et al, 2004; Chakrabarti, 2007; Jeh and Widom, 2002; Tong et al, 2006a; Tong et al, 2007; Liu et al, 2013; Lofgren et al, 2014; Maehara et al, 2014), of node relatedness, on the other hand, also take into account the density of the edges: intuitively, as in path-length based definitions, a node can be said to be more related to another node if there are short paths between them; however, unlike in path-based definitions, random walk-based definitions of relatedness also c...…”

Section: Introduction (mentioning)

confidence: 99%

Locality-sensitive and Re-use Promoting Personalized PageRank computations. 2015.

Abstract: Node distance/proximity measures are used for quantifying how nearby or otherwise related two or more nodes on a graph are. In particular, personalized PageRank (PPR) based measures of node proximity have been shown to be highly effective in many prediction and recommendation applications. Despite its effectiveness, however, the use of personalized PageRank for large graphs is difficult due to its high computation cost. In this paper, we propose a Locality-sensitive, Re-use promoting, approximate Personalized PageRank (LR-PPR) algorithm for efficiently computing the PPR values relying on the localities of the given seed nodes on the graph: (a) The LR-PPR algorithm is locality sensitive in the sense that it reduces the computational cost of the PPR computation process by focusing on the local neighborhoods of the seed nodes. (b) LR-PPR is re-use promoting in that instead of performing a monolithic computation for the given seed node set using the entire graph, LR-PPR divides the work into localities of the seeds and caches the intermediary results obtained during the computation. These cached results are then reused for future queries sharing seed nodes. Experiment results for different data sets and under different scenarios show that LR-PPR algorithm is highly-efficient and accurate.
