In this work we present a new approach to co-authorship link prediction that leverages the information contained in general bibliographical multiplex networks. A multiplex network is a graph defined over a set of nodes linked by different types of relations. For instance, the multiplex network studied here is defined as follows: nodes represent authors, and each link is of one of three types: co-authorship links, co-venue attending links, and co-citing links. We apply a supervised machine learning approach to link prediction: a link formation model is learned from a set of topological attributes describing both positive and negative examples. While such an approach has been applied successfully in the context of simple networks, there are several ways to extend it to multiplex networks. One option is to compute the topological attributes in each layer of the multiplex. Another is to directly compute new multiplex-based attributes that quantify the multiplex nature of dyads (potential links). These approaches are studied and compared through experiments on real datasets extracted from the DBLP bibliographical database.
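To make the first option concrete, the following is a minimal sketch of per-layer topological attributes for a single dyad. It is not the authors' implementation: the choice of attributes (common neighbours and the Jaccard coefficient), the layer names, the toy graph, and the cross-layer average `avg_cn` used as a stand-in for a multiplex-based attribute are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical representation: each layer maps a node to its set of neighbours.
Layer = Dict[str, Set[str]]


def common_neighbors(layer: Layer, u: str, v: str) -> int:
    """Number of nodes adjacent to both u and v in one layer."""
    return len(layer.get(u, set()) & layer.get(v, set()))


def jaccard(layer: Layer, u: str, v: str) -> float:
    """Shared neighbours divided by all neighbours of u or v (0.0 if none)."""
    nu, nv = layer.get(u, set()), layer.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def dyad_features(multiplex: Dict[str, Layer], u: str, v: str) -> Dict[str, float]:
    """Compute the same attributes in every layer, plus one cross-layer value."""
    feats: Dict[str, float] = {}
    for name, layer in multiplex.items():
        feats[f"{name}_cn"] = float(common_neighbors(layer, u, v))
        feats[f"{name}_jaccard"] = jaccard(layer, u, v)
    # Assumed multiplex-level attribute: mean common-neighbour count over layers.
    cn_values = [feats[f"{name}_cn"] for name in multiplex]
    feats["avg_cn"] = sum(cn_values) / len(cn_values) if cn_values else 0.0
    return feats


# Toy three-layer network over authors a, b, c, d (illustrative data only).
multiplex = {
    "coauthorship": {"a": {"c"}, "b": {"c"}, "c": {"a", "b"}},
    "covenue": {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}},
    "cociting": {"a": {"d"}, "b": {"d"}, "d": {"a", "b"}},
}
features = dyad_features(multiplex, "a", "b")
```

The resulting dictionary would serve as the feature vector of the dyad (a, b), with a class label indicating whether the co-authorship link appears later; any standard classifier could then be trained on such vectors.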
This chapter presents the problem of link prediction in complex networks. It provides a general description and a formal definition of the problem, along with its applications. It surveys the state of the art in link prediction approaches, concentrating on topological ones, and presents the main challenges of the link prediction task in real networks. We then describe our new link prediction approach, based on supervised rank aggregation, and our attempts to address two of these challenges in order to improve prediction results. The first is to extend the set of attributes describing an example (a pair of nodes) with attributes computed in a multiplex network that includes the target network. Multiplex networks have a layered structure, each layer having a different kind of link between the same set of nodes. The second is to use community information when sampling examples, to deal with the problem of class imbalance. All experiments were conducted on real networks extracted from the well-known DBLP bibliographic database.
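The idea of rank aggregation can be illustrated with a weighted Borda count, in which each topological attribute ranks the candidate dyads and the rankings are merged using one weight per attribute. This is only a sketch of one possible aggregation rule, not necessarily the method used in the chapter. In a supervised setting the weights would be learned from training data; here the weights, the rankings, and the attribute names in the comments are hypothetical.

```python
from typing import Dict, Hashable, List


def weighted_borda(rankings: List[List[Hashable]], weights: List[float]) -> List[Hashable]:
    """Merge several rankings (best item first) into a single ranking.

    An item at position p (0-based) in a ranking of n items earns
    weight * (n - p) points; items are returned by descending total score.
    """
    scores: Dict[Hashable, float] = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + weight * (n - position)
    # Ties are broken by the item's repr so the output is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], repr(item)))


# Three candidate dyads ranked by three hypothetical attributes.
rankings = [
    [("a", "b"), ("a", "c"), ("b", "d")],  # e.g. by common neighbours
    [("a", "c"), ("a", "b"), ("b", "d")],  # e.g. by Jaccard coefficient
    [("a", "b"), ("b", "d"), ("a", "c")],  # e.g. by a third attribute
]
weights = [0.5, 0.3, 0.2]  # assumed values; would be learned from training data
aggregated = weighted_borda(rankings, weights)
```

The top-k dyads of the aggregated ranking would then be predicted as new links.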