Influence maximization is the problem of finding the set of nodes of a network that maximizes the size of the outbreak of a spreading process occurring on the network. Solutions to this problem are important for strategic decisions in marketing and political campaigns. The typical setting consists in the identification of small sets of initial spreaders in very large networks. This setting makes the optimization problem computationally infeasible for standard greedy optimization algorithms that account simultaneously for information about network topology and spreading dynamics, leaving space only to heuristic methods based on the drastic approximation of relying on the geometry of the network alone. The literature on the subject is plenty of purely topological methods for the identification of influential spreaders in networks. However, it is unclear how far these methods are from being optimal. Here, we perform a systematic test of the performance of a multitude of heuristic methods for the identification of influential spreaders. We quantify the performance of the various methods on a corpus of 100 real-world networks; the corpus consists of networks small enough for the application of greedy optimization so that results from this algorithm are used as the baseline needed for the analysis of the performance of the other methods on the same corpus of networks. We find that relatively simple network metrics, such as adaptive degree or closeness centralities, are able to achieve performances very close to the baseline value, thus providing good support for the use of these metrics in large-scale problem settings. Also, we show that a further 2–5% improvement towards the baseline performance is achievable by hybrid algorithms that combine two or more topological metrics together. This final result is validated on a small collection of large graphs where greedy optimization is not applicable.
In this study, the problem of seed selection is investigated. This problem is mainly treated as an optimization problem, which is proved to be NP-hard. There are several heuristic approaches in the literature which mostly use algorithmic heuristics. These approaches mainly focus on the trade-off between computational complexity and accuracy. Although the accuracy of algorithmic heuristics are high, they also have high computational complexity. Furthermore, in the literature, it is generally assumed that complete information on the structure and features of a network is available, which is not the case in most of the times. For the study, a simulation model is constructed, which is capable of creating networks, performing seed selection heuristics, and simulating diffusion models. Novel metric-based seed selection heuristics that rely only on partial information are proposed and tested using the simulation model. These heuristics use local information available from nodes in the synthetically created networks. The performances of heuristics are comparatively analyzed on three different network types. The results clearly show that the performance of a heuristic depends on the structure of a network. A heuristic to be used should be selected after investigating the properties of the network at hand. More importantly, the approach of partial information provided promising results. In certain cases, selection heuristics that rely only on partial network information perform very close to similar heuristics that require complete network data.
We consider the problem of identifying the most influential nodes for a spreading process on a network when prior knowledge about structure and dynamics of the system is incomplete or erroneous. Specifically, we perform a numerical analysis where the set of top spreaders is determined on the basis of prior information that is artificially altered by a certain level of noise. We then measure the optimality of the chosen set by measuring its spreading impact in the true system. Whereas we find that the identification of top spreaders is optimal when prior knowledge is complete and free of mistakes, we also find that the quality of the top spreaders identified using noisy information doesn't necessarily decrease as the noise level increases. For instance, we show that it is generally possible to compensate for erroneous information about dynamical parameters by adding synthetic errors in the structure of the network. Further, we show that, in some dynamical regimes, even completely losing prior knowledge on network structure may be better than relying on certain but incomplete information.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.