Galileo Namata scite author profile

Numerous real-world applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such data. In this report, we attempt to provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.

Leave-One-Out Cross-Validation

Webb

Sammut

Perlich

et al. 2011

103

Supervised and unsupervised methods in employing discourse relations for improving opinion polarity classification

Somasundaran

Namata

Wiebe

et al. 2009

This work investigates design choices in modeling a discourse scheme for improving opinion polarity classification. For this, two diverse global inference paradigms are used: a supervised collective classification framework and an unsupervised optimization framework. Both approaches perform substantially better than baseline approaches, establishing the efficacy of the methods and the underlying discourse scheme. We also present quantitative and qualitative analyses showing how the improvements are achieved.

Collective graph identification

Namata

Kok

Getoor

2011

Data describing networks (communication networks, transaction networks, disease transmission networks, collaboration networks, etc.) is becoming increasingly ubiquitous. While this observational data is useful, it often only hints at the actual underlying social or technological structures which give rise to the interactions. For example, an email communication network provides useful insight but is not the same as the "real" social network among individuals. In this paper, we introduce the problem of graph identification, i.e., the discovery of the true graph structure underlying an observed network. We cast the problem as a probabilistic inference task, in which we must infer the nodes, edges, and node labels of a hidden graph, based on evidence provided by the observed network. This in turn corresponds to the problems of performing entity resolution, link prediction, and node labeling to infer the hidden graph. While each of these problems has been studied separately, they have never been considered together as a coherent task. We present a simple yet novel approach to address all three problems simultaneously. Our approach, called C³, consists of Coupled Collective Classifiers that are iteratively applied to propagate information among solutions to the problems. We empirically demonstrate that C³ is superior, in terms of both predictive accuracy and runtime, to state-of-the-art probabilistic approaches on three real-world problems.

Name Reference Resolution in Organizational Email Archives

Diehl

Getoor

Namata

2006

A dual-view approach to interactive network visualization

Namata

Staats

Getoor

et al. 2007

Visualizing network data, from tree structures to arbitrarily connected graphs, is a difficult problem in information visualization. A large part of the problem is that in network data, users not only have to visualize the attributes specific to each data item, but also the links specifying how those items are connected to each other. Past approaches to resolving these difficulties focus on zooming, clustering, filtering and applying various methods of laying out nodes and edges. Such approaches, however, focus only on optimizing a network visualization in a single view, limiting the amount of information that can be shown and explored in parallel. Moreover, past approaches do not allow users to cross reference different subsets or aspects of large, complex networks. In this paper, we propose an approach to these limitations using multiple coordinated views of a given network. To illustrate our approach, we implement a tool called DualNet and evaluate the tool with a case study using an email communication network. We show how using multiple coordinated views improves navigation and provides insight into large networks with multiple node and link properties and types.

Confusion Matrix

Shultz

Fahlman

Craw

et al. 2011

Declarative analysis of noisy information networks

Moustafa

Namata

Deshpande

et al. 2011

Collective Classification in Network Data
