The influence maximization problem consists of identifying the seed set whose activation triggers a cascade that reaches the largest number of users in the network. Unlike prior works, which used multi-armed bandit (MAB) models to estimate the unknown propagation probabilities of a diffusion model, we apply a greedy algorithm and an epsilon-greedy algorithm from the MAB framework directly to seed selection. We make no assumption about the diffusion model. Instead, we learn to identify the most influential users from a designed reward function, ''hybrid edge strength-similarity'', which is built on global centrality measures, while seeking a tradeoff between exploitation and exploration. The proposed reward function initializes the MAB algorithms with global characteristics that quantify the strength of each arm (edge). It combines edge betweenness centrality and Jaccard similarity, with a weight controlling the contribution of each measure. Three algorithms are then proposed for extracting relevant influencers: SRI-CGSS FEXPL-GREEDY, which almost always exploits the best arm; SRI-CGSS FEXPR-GREEDY, which almost always explores; and SRI-MAB-GREEDY, which alternates between exploring and exploiting the best arms. We conduct extensive experiments on a large-scale graph to evaluate influence spread, efficiency in terms of running time and space complexity, and the impact of the reward parameters on cumulative regret.
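The reward and the exploration/exploitation choice described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mixing weight `alpha`, the exploration rate `epsilon`, and all function names are assumptions introduced here, the edge betweenness values are assumed to be precomputed elsewhere, and only the initialization of the arm estimates and a single selection step are shown.

```python
import random


def jaccard(adj, u, v):
    """Jaccard similarity of the neighbor sets of u and v."""
    nu, nv = adj[u], adj[v]
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def hybrid_reward(adj, edge_betweenness, alpha):
    """Blend betweenness and Jaccard per edge (arm).

    alpha is a hypothetical weight in [0, 1]: 1.0 uses betweenness only,
    0.0 uses Jaccard similarity only.
    """
    return {
        (u, v): alpha * eb + (1.0 - alpha) * jaccard(adj, u, v)
        for (u, v), eb in edge_betweenness.items()
    }


def epsilon_greedy_pick(estimates, epsilon, rng):
    """Explore a random arm with probability epsilon, else exploit the best one."""
    arms = list(estimates)
    if rng.random() < epsilon:
        return rng.choice(arms)
    return max(arms, key=estimates.get)
```

Under these assumptions, `epsilon` close to 0 corresponds to an almost purely exploiting variant, `epsilon` close to 1 to an almost purely exploring one, and intermediate values alternate between the two.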
Influence maximization in social networks is becoming increasingly important because of its benefits and applications in diverse areas. In this paper, we propose DERND D-hops, which adapts the radius-neighborhood degree to directed graphs and improves on our previous algorithm, RND d-hops. We also propose UERND D-hops for undirected graphs, which uses the radius-neighborhood degree metric to select the top-K influential users and improves the selection process of RND d-hops. Both algorithms use a selection threshold that depends on the structural properties of each graph, which significantly improves seed set selection, and a multi-hop distance so that the selected influential users have distinct ranges of influence. We then determine the multi-hop distance at which each consecutive seed should be chosen. We measure the influence spread of the seed sets selected by our algorithms and by existing approaches under two diffusion models. We also analyze the time complexity of the proposed algorithms and give their worst-case complexity. Experimental results on large-scale data show that the proposed algorithms outperform existing algorithms in terms of influence spread and, thanks to the selection threshold, run in less time than our previous algorithm RND d-hops.
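The selection scheme outlined in this abstract (a neighborhood-based score, a selection threshold, and a minimum hop distance between seeds) can be sketched as follows. The abstract does not define the radius-neighborhood degree, so the score below, the total degree of a node and of every node within `radius` hops, is an assumption, as are the function and parameter names. For a directed graph, `adj` would hold successor sets.

```python
from collections import deque


def within_hops(adj, source, max_hops):
    """Return {node: distance} for nodes reachable in at most max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def neighborhood_score(adj, node, radius):
    """Assumed score: summed degree of node and all nodes within radius hops."""
    return sum(len(adj[n]) for n in within_hops(adj, node, radius))


def select_seeds(adj, k, radius, min_hops, threshold):
    """Pick up to k seeds, best score first, at least min_hops apart."""
    scores = {n: neighborhood_score(adj, n, radius) for n in adj}
    seeds, blocked = [], set()
    # Ties are broken by node name so the result is deterministic.
    for node in sorted(adj, key=lambda n: (-scores[n], str(n))):
        if len(seeds) == k or scores[node] < threshold:
            break  # enough seeds, or all remaining scores are below threshold
        if node in blocked:
            continue
        seeds.append(node)
        # Block everything closer than min_hops to the new seed.
        blocked.update(within_hops(adj, node, max(min_hops - 1, 0)))
    return seeds
```

In this sketch the threshold stops the scan early, since candidates are visited in descending score order, which is one plausible reading of how a threshold could reduce running time.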
Link prediction is an important problem in network data mining that aims to predict potential relationships between nodes in a network. Link prediction based on supervised classification is normally trained on a dataset consisting of a set of positive samples and a set of negative samples. However, well-labeled training datasets with both positive and negative annotations are often inadequate in real-world scenarios, and the datasets contain a large number of unlabeled samples that may hinder the performance of the model. To address this problem, we propose a positive-unlabeled (PU) learning framework with network representation for link prediction that uses only positive and unlabeled samples. We first learn representation vectors of nodes using a network representation method. Next, we concatenate the representation vectors of each node pair and feed them into different classifiers to predict whether the link exists. To alleviate data imbalance and improve prediction precision, we adopt three PU learning strategies: traditional classifier estimation, bagging, and reliable negative sampling. We conduct experiments on three datasets to compare the PU learning methods and discuss their influence on the prediction results. The experimental results demonstrate that PU learning improves predictive performance and that the size of the improvement varies with the network structure.
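The pair-feature construction and the bagging strategy mentioned in this abstract can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: the nearest-centroid rule stands in for whatever classifiers the paper uses, each bag is a random subsample of the unlabeled set treated as provisional negatives, and all names and parameters are hypothetical.

```python
import random


def pair_features(embeddings, u, v):
    """Concatenate the two node vectors into one feature vector for the pair."""
    return embeddings[u] + embeddings[v]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, rounds, sample_size, rng):
    """Fraction of out-of-bag rounds in which each unlabeled sample looks positive.

    Returns None for a sample that was never out of bag.
    """
    votes = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    pos_center = centroid(positives)
    for _ in range(rounds):
        # Treat a random subsample of the unlabeled set as negatives.
        bag = set(rng.sample(range(len(unlabeled)), sample_size))
        neg_center = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in bag:
                continue  # score only samples the stand-in model did not train on
            counts[i] += 1
            if sq_dist(x, pos_center) < sq_dist(x, neg_center):
                votes[i] += 1.0
    return [v / c if c else None for v, c in zip(votes, counts)]
```

Unlabeled pairs with a high averaged score are candidates for predicted links, while those with consistently low scores could serve as reliable negatives in a later training step.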