Given a social network G and a positive integer k, the influence maximization problem asks for k nodes (in G) whose adoptions of a certain idea or product can trigger the largest expected number of follow-up adoptions by the remaining nodes. This problem has been extensively studied in the literature, and the state-of-the-art technique runs in O((k + ℓ)(n + m) log n / ε²) expected time and returns a (1 − 1/e − ε)-approximate solution with probability at least 1 − 1/n^ℓ. This paper presents an influence maximization algorithm that provides the same worst-case guarantees as the state of the art, but offers significantly improved empirical efficiency. The core of our algorithm is a set of estimation techniques based on martingales, a classic statistical tool. Those techniques not only provide accurate results with small computation overheads, but also enable our algorithm to support a larger class of information diffusion models than existing methods do. We experimentally evaluate our algorithm against the state of the art under several popular diffusion models, using real social networks with up to 1.4 billion edges. Our experimental results show that the proposed algorithm consistently outperforms the state of the art in terms of computation efficiency, and is often orders of magnitude faster.
Given two locations s and t in a road network, a distance query returns the minimum network distance from s to t, while a shortest path query computes the actual route that achieves the minimum distance. These two types of queries find important applications in practice, and a plethora of solutions have been proposed in the past few decades. The existing solutions, however, are optimized for either practical or asymptotic performance, but not both. In particular, the techniques with enhanced practical efficiency are mostly heuristic-based, and they offer unattractive worst-case guarantees in terms of space and time. On the other hand, the methods that are worst-case efficient often entail prohibitive preprocessing or space overheads, which render them inapplicable for the large road networks (with millions of nodes) commonly used in modern map applications. This paper presents Arterial Hierarchy (AH), an index structure that narrows the gap between theory and practice in answering shortest path and distance queries on road networks. On the theoretical side, we show that, under a realistic assumption, AH answers any distance query in Õ(log α) time, where α = dmax/dmin, and dmax (resp. dmin) is the largest (resp. smallest) L∞ distance between any two nodes in the road network. In addition, any shortest path query can be answered in Õ(k + log α) time, where k is the number of nodes on the shortest path. On the practical side, we experimentally evaluate AH on a large set of real road networks with up to twenty million nodes, and we demonstrate that (i) AH outperforms the state of the art in terms of query time, and (ii) its space and pre-computation overheads are moderate.
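To make the quantity α in the bounds above concrete, here is a minimal sketch that computes α = dmax/dmin for a handful of node coordinates. The function names and the toy coordinates are assumptions made for illustration; the brute-force pairwise loop is not part of AH and would not scale to networks with millions of nodes.

```python
from itertools import combinations
import math

def linf(p, q):
    """L-infinity (Chebyshev) distance between two 2D points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def aspect_ratio(nodes):
    """Return alpha = dmax / dmin over all pairs of node positions.

    Brute force over all pairs (quadratic time); illustration only.
    """
    dists = [linf(p, q) for p, q in combinations(nodes, 2)]
    dmax, dmin = max(dists), min(dists)
    if dmin == 0:
        raise ValueError("two nodes share the same coordinates")
    return dmax / dmin

# Hypothetical node positions (arbitrary planar units).
nodes = [(0, 0), (1, 0), (0, 3), (8, 8)]
alpha = aspect_ratio(nodes)
log_alpha = math.log2(alpha)
```

For these four points the closest pair is one unit apart and the farthest pair is eight units apart, so α = 8 and log₂ α = 3. The point of the bound is that query time grows with log α rather than with the number of nodes.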
Modern mobile technology has enabled the collection of large-scale vehicle trajectories using GPS devices. As GPS measurements may come with error, vehicle trajectories are often noisy. A common practice to alleviate this issue is to apply map-matching, i.e., to align vehicle trajectories with the road segments in a digitized road network. This paper presents an efficient solution for the map-matching problem that won the SIGSPATIAL CUP 2012. Given a road network, our solution first constructs a grid index on the road segments. For each point p on a vehicle trajectory, we employ the index to identify a candidate set of road segments that are close to p, and then we refine the candidate set to select a segment that matches p with the highest probability. The selection of the best match is based on a metric that takes into account (i) the correlation between consecutive GPS measurements as well as (ii) the directions and shapes of the road segments. Experimental results on real vehicle trajectories and road networks demonstrate the effectiveness and efficiency of the proposed solution.
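The candidate-retrieval step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the class `GridIndex`, its methods, the cell size, and the street names are all hypothetical, coordinates are treated as planar, and candidates are ranked by distance alone rather than by the paper's metric (which also uses the correlation between consecutive measurements and the directions and shapes of segments).

```python
from collections import defaultdict
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to segment ab (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

class GridIndex:
    """Hypothetical uniform grid over road segments."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # (cx, cy) -> segment ids
        self.segments = {}              # segment id -> (a, b)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, seg_id, a, b):
        """Register a segment in every cell overlapped by its bounding box."""
        self.segments[seg_id] = (a, b)
        cx0, cy0 = self._cell(min(a[0], b[0]), min(a[1], b[1]))
        cx1, cy1 = self._cell(max(a[0], b[0]), max(a[1], b[1]))
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].add(seg_id)

    def candidates(self, p, radius):
        """Return (distance, seg_id) pairs within radius of p, nearest first."""
        cx0, cy0 = self._cell(p[0] - radius, p[1] - radius)
        cx1, cy1 = self._cell(p[0] + radius, p[1] + radius)
        seen = set()
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                seen |= self.cells.get((cx, cy), set())
        scored = ((point_segment_distance(p, *self.segments[s]), s) for s in seen)
        return sorted(pair for pair in scored if pair[0] <= radius)

# Toy road network and one GPS point (all values are made up).
index = GridIndex(cell_size=10.0)
index.insert("main_st", (0.0, 0.0), (50.0, 0.0))
index.insert("side_st", (20.0, -30.0), (20.0, 30.0))
index.insert("far_rd", (200.0, 200.0), (250.0, 200.0))
nearby = index.candidates((18.0, 4.0), radius=5.0)
```

Only the grid cells around the GPS point are inspected, so `far_rd` is never scored. A full map-matcher would then re-rank `nearby` using trajectory context instead of taking the nearest segment.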
Given a social network G and a constant k, the influence maximization problem asks for k nodes in G that (directly and indirectly) influence the largest number of nodes under a pre-defined diffusion model. This problem finds important applications in viral marketing, and has been extensively studied in the literature. Existing algorithms for influence maximization, however, either trade approximation guarantees for practical efficiency, or vice versa. In particular, among the algorithms that achieve constant factor approximations under the prominent independent cascade (IC) model or linear threshold (LT) model, none can handle a million-node graph without incurring prohibitive overheads. This paper presents TIM, an algorithm that aims to bridge theory and practice in influence maximization. On the theory side, we show that TIM runs in O((k + ℓ)(n + m) log n / ε²) expected time and returns a (1 − 1/e − ε)-approximate solution with probability at least 1 − n^(−ℓ). The time complexity of TIM is near-optimal under the IC model, as it is only a log n factor larger than the Ω(m + n) lower bound established in previous work (for fixed k, ℓ, and ε). Moreover, TIM supports the triggering model, which is a general diffusion model that includes both IC and LT as special cases. On the practice side, TIM incorporates novel heuristics that significantly improve its empirical efficiency without compromising its asymptotic performance. We experimentally evaluate TIM with the largest datasets ever tested in the literature, and show that it outperforms the state-of-the-art solutions (with approximation guarantees) by up to four orders of magnitude in terms of running time. In particular, when k = 50, ε = 0.2, and ℓ = 1, TIM requires less than one hour on a commodity machine to process a network with 41.6 million nodes and 1.4 billion edges. This demonstrates that influence maximization algorithms can be made practical while still offering strong theoretical guarantees.
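As a rough illustration of the objective under the IC model, the sketch below estimates the expected spread of a seed set by plain Monte Carlo simulation. It is not TIM and carries none of its guarantees or efficiency; the function names, the adjacency-list format, and the toy graph are assumptions chosen for the example.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the set of activated nodes.

    graph maps a node to a list of (neighbor, activation_probability) pairs.
    Each newly activated node gets one chance to activate each inactive
    out-neighbor.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def expected_spread(graph, seeds, runs=1000, seed=0):
    """Average number of activated nodes over independent simulations."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs

# Toy graph with deterministic edges so the outcome is easy to check:
# a -> b -> c always fire, c -> d never fires.
chain = {"a": [("b", 1.0)], "b": [("c", 1.0)], "c": [("d", 0.0)]}
spread_from_a = expected_spread(chain, ["a"], runs=10)
spread_from_c = expected_spread(chain, ["c"], runs=10)
```

Seeding `a` always activates `a`, `b`, and `c`, while seeding `c` activates only `c`, so `a` is the better single seed here. Influence maximization asks for the size-k seed set that maximizes this expected spread; repeating such simulations for every candidate set is what makes naive approaches prohibitively slow on large graphs.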