Sreeram Balakrishnan scite author profile

Sreeram Balakrishnan

5 Publications

50 Citation Statements Received

26 Citation Statements Given


Affiliations

IBM Research - India, Silicon Valley University

Publications


Entity annotation based on inverse index operations

Ramakrishnan, Balakrishnan, Joshi (2006)


Entity annotation involves attaching a label such as 'name' or 'organization' to a sequence of tokens in a document. All the current rule-based and machine learning-based approaches for this task operate at the document level. We present a new and generic approach to entity annotation which uses the inverse index typically created for rapid keyword-based searching of a document collection. We define a set of operations on the inverse index that allows us to create annotations defined by cascading regular expressions. The entity annotations for an entire document corpus can be created purely from the index, with no need to access the original documents. Experiments on two publicly available data sets show very significant performance improvements over the document-based annotators.


Using ILP to Construct Features for Information Extraction from Semi-structured Text

Ramakrishnan, Joshi, Balakrishnan, et al.


Machine-generated documents containing semi-structured text are rapidly forming the bulk of data being stored in an organisation. Given a feature-based representation of such data, methods like SVMs are able to construct good models for information extraction (IE). But how are the feature definitions to be obtained in the first place? (We are referring here to the representation problem: selecting good features from the ones defined comes later.) So far, features have been defined manually or by using special-purpose programs, and neither approach scales well to handle the heterogeneity of the data or new domain-specific information. We suggest that Inductive Logic Programming (ILP) could assist in this. Specifically, we demonstrate the use of ILP to define features for seven IE tasks using two disparate sources of information. Our findings are as follows: (1) the ILP system is able to efficiently identify large numbers of good features. Typically, the time taken to identify the features is comparable to the time taken to construct the predictive model; and (2) SVM models constructed with these ILP features are better than the best reported to date that rely heavily on hand-crafted features. For the ILP practitioner, we also present evidence supporting the claim that, for IE tasks, using an ILP system to assist in constructing an extensional representation of text data (in the form of features and their values) is better than using it to construct intensional models for the tasks (in the form of rules for information extraction).


A Conversation-Mining System for Gathering Insights to Improve Agent Productivity

Takeuchi, Subramaniam, Nasukawa, et al. (2007)


We describe a method to analyze transcripts of conversations between customers and agents in a contact center. The aim is to obtain actionable insights from the conversations to improve agent performance. Our approach has three steps. First, we segment the call into logical parts. Next, we extract relevant phrases within different segments. Finally, we do two-dimensional association analysis to identify actionable trends. We use real data from a contact center to identify specific actions by agents that result in positive outcomes. We also show that implementing these actionable insights results in improved agent productivity.


Asynchronous HMM with applications to speech recognition

Garg, Balakrishnan, Vaithyanathan


We develop a novel formalism for modeling speech signals which are irregularly or incompletely sampled. This situation can arise in real-world applications where the speech signal is being transmitted over an error-prone channel where parts of the signal can be dropped. Typical speech systems based on Hidden Markov Models cannot handle such data, since HMMs rely on the assumption that observations are complete and made at regular intervals. In this paper we introduce the asynchronous HMM, a variant of the inhomogeneous HMM commonly used in bioinformatics, and show how it can be used to model irregularly or incompletely sampled data. A nested EM algorithm, which can be used to learn the parameters of this asynchronous HMM, is presented in brief. Evaluation on real-world speech data that has been modified to simulate channel errors shows that this model and its variants significantly outperform the standard HMM and methods based on data interpolation.


The use of confidence measures in unsupervised adaptation of speech recognizers

Anastasakos, Balakrishnan (1998)

