Tina Eliassi-Rad scite author profile

Numerous real-world applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such data. In this report, we attempt to provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.

show abstract

RolX: Role Extraction and Mining in Large Networks

Henderson¹,

Gallagher²,

Eliassi-Rad³

et al. 2011

204

287

View full text Add to dashboard Cite

Abstract-Given a graph, how can we automatically discover roles for nodes? Roles could be, eg., 'bridges', or 'peripherynodes', etc. Roles are compact summaries of a node's behavior that generalize across networks. They enable numerous novel and useful network mining tasks, such as sense-making, searching for similar nodes, and node classification. We propose RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting roles from general network data. We demonstrate the effectiveness of RolX on several network mining tasks, from exploratory data analysis to network transfer learning. Moreover, we compare network role discovery with network community discovery. We highlight fundamental differences between the two (e.g., roles generalize across disconnected networks, communities do not).

show abstract

Fast best-effort pattern matching in large attributed graphs

et al. 2007

View full text Add to dashboard Cite

We focus on large graphs where nodes have attributes, such as a social network where the nodes are labelled with each person's job title. In such a setting, we want to find subgraphs that match a user query pattern. For example, a 'star' query would be, "find a CEO who has strong interactions with a Manager, a Lawyer, and an Accountant, or another structure as close to that as possible". Similarly, a 'loop' query could help spot a money laundering ring.Traditional SQL-based methods, as well as more recent graph indexing methods, will return no answer when an exact match does not exist. Our method can find exact-, as well as near-matches, and it will present them to the user in our proposed 'goodness' order. For example, our method tolerates indirect paths between, say, the 'CEO' and the 'Accountant' of the above sample query, when direct paths do not exist. Its second feature is scalability. In general, if the query has nq nodes and the data graph has n nodes, the problem needs polynomial time complexity O(n nq ), which is prohibitive. Our G-Ray ("Graph X-Ray") method finds high-quality subgraphs in time linear on the size of the data graph.Experimental results on the DLBP author-publication graph (with 356K nodes and 1.9M edges) illustrate both the effectiveness and scalability of our approach. The results agree with our intuition, and the speed is excellent. It takes 4 seconds on average for a 4-node query on the DBLP graph.

show abstract

APATE: A novel approach for automated credit card transaction fraud detection using network-based extensions

Vlasselaer

Bravo

Caelen³

et al. 2015

Decision Support Systems

263

142

View full text Add to dashboard Cite

In the last decade, the ease of online payment has opened up many new opportunities for e-commerce, lowering the geographical boundaries for retail. While e-commerce is still gaining popularity, it is also the playground of fraudsters who try to misuse the transparency of online purchases and the transfer of credit card records. This paper proposes APATE, a novel approach to detect fraudulent credit card transactions conducted in online stores. Our approach combines (1) intrinsic features derived from the characteristics of incoming transactions and the customer spending history using the fundamentals of RFM (Recency -Frequency -Monetary); and (2) network-based features by exploiting the network of credit card holders and merchants and deriving a time-dependent suspiciousness score for each network object. Our results show that both intrinsic and network-based features are two strongly intertwined sides of the same picture. The combination of these two types of features leads to the best performing models which reach AUC-scores higher than 0.98.

show abstract

Gelling, and melting, large graphs by edge manipulation

et al. 2012

View full text Add to dashboard Cite

Controlling the dissemination of an entity (e.g., meme, virus, etc) on a large graph is an interesting problem in many disciplines. Examples include epidemiology, computer security, marketing, etc. So far, previous studies have mostly focused on removing or inoculating nodes to achieve the desired outcome.We shift the problem to the level of edges and ask: which edges should we add or delete in order to speed-up or contain a dissemination? First, we propose effective and scalable algorithms to solve these dissemination problems. Second, we conduct a theoretical study of the two problems and our methods, including the hardness of the problem, the accuracy and complexity of our methods, and the equivalence between the different strategies and problems. Third and lastly, we conduct experiments on real topologies of varying sizes to demonstrate the effectiveness and scalability of our approaches.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Tina Eliassi-Rad

Collective Classification in Network Data

RolX: Role Extraction and Mining in Large Networks

Fast best-effort pattern matching in large attributed graphs

APATE: A novel approach for automated credit card transaction fraud detection using network-based extensions

Gelling, and melting, large graphs by edge manipulation

Contact Info

Product

Resources

About