Tensors or {\em multi-way arrays} are functions of three or more indices $(i,j,k,\cdots)$ -- similar to matrices (two-way arrays), which are functions of two indices $(r,c)$ for (row,column). Tensors have a rich history, stretching over almost a century, and touching upon numerous disciplines; but they have only recently become ubiquitous in signal and data analytics at the confluence of signal processing, statistics, data mining and machine learning. This overview article aims to provide a good starting point for researchers and practitioners interested in learning about and working with tensors. As such, it focuses on fundamentals and motivation (using various application examples), aiming to strike an appropriate balance of breadth {\em and depth} that will enable someone having taken first graduate courses in matrix algebra and probability to get started doing research and/or developing tensor algorithms and software. Some background in applied optimization is useful but not strictly required. The material covered includes tensor rank and rank decomposition; basic tensor factorization models and their relationships and properties (including fairly good coverage of identifiability); broad coverage of algorithms ranging from alternating optimization to stochastic gradient; statistical performance analysis; and applications ranging from source separation to collaborative filtering, mixture and topic modeling, classification, and multilinear subspace learning.Comment: revised version, overview articl
Abstract-Given a graph, how can we automatically discover roles for nodes? Roles could be, eg., 'bridges', or 'peripherynodes', etc. Roles are compact summaries of a node's behavior that generalize across networks. They enable numerous novel and useful network mining tasks, such as sense-making, searching for similar nodes, and node classification. We propose RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting roles from general network data. We demonstrate the effectiveness of RolX on several network mining tasks, from exploratory data analysis to network transfer learning. Moreover, we compare network role discovery with network community discovery. We highlight fundamental differences between the two (e.g., roles generalize across disconnected networks, communities do not).
Despite the apparent randomness of the Internet, we discover some surprisingly simple power-laws of the Internet topology. These power-laws hold for three snapshots of the Internet, between November 1997 and December 1998, despite a 45% growth of its size during that period. We show that our power-laws fit the real data very well resulting in correlation coefficients of 96% or higher.Our observations provide a novel perspective of the structure of the Internet.The power-laws describe concisely skewed distributions of graph properties such as the node outdegree.In addition, these power-laws can be used to estimate important parameters such as the average neighborhood size, and facilitate the design and the performance analysis of protocols. Furthermore, we can use them to generate and select realistic topologies for simulation purposes.
The problem of deploying sensors in a large water distribution network is considered, in order to detect the malicious introduction of contaminants. It is shown that a large class of realistic objective functions-such as reduction of detection time and the population protected from consuming contaminated water-exhibits an important diminishing returns effect called submodularity. The submodularity of these objectives is exploited in order to design efficient placement algorithms with provable performance guarantees. The algorithms presented in this paper do not rely on mixed integer programming, and scale well to networks of arbitrary size. The problem instances considered in the approach presented in this paper are orders of magnitude ͑a factor of 72͒ larger than the largest problems solved in the literature. It is shown how the method presented here can be extended to multicriteria optimization, selecting placements robust to sensor failures and optimizing minimax criteria. Extensive empirical evidence on the effectiveness of the method presented in this paper on two benchmark distribution networks, and an actual drinking water distribution system of greater than 21,000 nodes, is presented.
Weighted signed networks (WSNs) are networks in which edges are labeled with positive and negative weights. WSNs can capture like/dislike, trust/distrust, and other social relationships between people. In this paper, we consider the problem of predicting the weights of edges in such networks. We propose two novel measures of node behavior: the goodness of a node intuitively captures how much this node is liked/trusted by other nodes, while the fairness of a node captures how fair the node is in rating other nodes\u27 likeability or trust level. We provide axioms that these two notions need to satisfy and show that past work does not meet these requirements for WSNs. We provide a mutually recursive definition of these two concepts and prove that they converge to a unique solution in linear time. We use the two measures to predict the edge weight in WSNs. Furthermore, we show that when compared against several individual algorithms from both the signed and unsigned social network literature, our fairness and goodness metrics almost always have the best predictive power. We then use these as features in different multiple regression models and show that we can predict edge weights on 2 Bitcoin WSNs, an Epinions WSN, 2 WSNs derived from Wikipedia, and a WSN derived from Twitter with more accurate results than past work. Moreover, fairness and goodness metrics form the most significant feature for prediction in most (but not all) cases
Anomaly detection is an important problem with multiple applications, and thus has been studied for decades in various research domains. In the past decade there has been a growing interest in anomaly detection in data represented as networks, or graphs, largely because of their robust expressiveness and their natural ability to represent complex relationships. Originally, techniques focused on anomaly detection in static graphs, which do not change and are capable of representing only a single snapshot of data. As real-world networks are constantly changing, there has been a shift in focus to dynamic graphs, which evolve over time.In this survey, we aim to provide a comprehensive overview of anomaly detection in dynamic networks, concentrating on the state-of-the-art methods. We first describe four types of anomalies that arise in dynamic networks, providing an intuitive explanation, applications, and a concrete example for each. Having established an idea for what constitutes an anomaly, a general two-stage approach to anomaly detection in dynamic networks that is common among the methods is presented. We then construct a two-tiered taxonomy, first partitioning the methods based on the intuition behind their approach, and subsequently subdividing them based on the types of anomalies they detect. Within each of the tier one categories-community, compression, decomposition, distance, and probabilistic model based-we highlight the major similarities and differences, showing the wealth of techniques derived from similar conceptual approaches.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.