Oishik Chatterjee scite author profile

Oishik Chatterjee

5 Publications

29 Citation Statements Received

26 Citation Statements Given

Affiliations

Indian Institute of Technology Bombay, Jadavpur University

Publications

Robust Data Programming with Precision-guided Labeling Functions

Chatterjee

Ramakrishnan

Sarawagi

2020

AAAI

Scarcity of labeled data is a bottleneck for supervised learning models. A paradigm that has evolved for dealing with this problem is data programming. An existing data programming paradigm allows human supervision to be provided as a set of discrete labeling functions (LF) that output possibly noisy labels to input instances and a generative model for consolidating the weak labels. We enhance and generalize this paradigm by supporting functions that output a continuous score (instead of a hard label) that noisily correlates with labels. We show across five applications that continuous LFs are more natural to program and lead to improved recall. We also show that accuracy of existing generative models is unstable with respect to initialization, training epochs, and learning rates. We give control to the data programmer to guide the training process by providing intuitive quality guides with each LF. We propose an elegant method of incorporating these guides into the generative model. Our overall method, called CAGE, makes the data programming paradigm more reliable than other tricks based on initialization, sign-penalties, or soft-accuracy constraints.

Stability of Consensus Node Orderings Under Imperfect Network Data

Basu

Maulik

Chatterjee

2016

IEEE Trans. Comput. Soc. Syst.

Semi-Supervised Data Programming with Subset Selection

Maheshwari

Chatterjee

Killamsetty

et al.

2021

The paradigm of data programming, which uses weak supervision in the form of rules/labelling functions, and semi-supervised learning, which augments small amounts of labelled data with a large unlabelled dataset, have shown great promise in several text classification scenarios. In this work, we argue that by not using any labelled data, data programming based approaches can yield sub-optimal performance, particularly when the labelling functions are noisy. The first contribution of this work is the introduction of a framework, SPEAR, which is a semi-supervised data programming paradigm that learns a joint model that effectively uses the rules/labelling functions along with semi-supervised loss functions on the feature space. Next, we also study SPEAR-SS, which additionally does subset selection on top of the joint semi-supervised data programming objective and selects a set of examples that can be used as the labelled set by SPEAR. The goal of SPEAR-SS is to ensure that the labelled data can complement the labelling functions, thereby benefiting from both data programming and appropriately selected data for human labelling. We demonstrate that by effectively combining semi-supervision, data programming, and subset selection paradigms, we significantly outperform the current state-of-the-art on seven publicly available datasets.

Semi-Supervised Data Programming with Subset Selection

Maheshwari

Chatterjee

Killamsetty

et al.

2020

Preprint

A Weakly Supervised Model for Solving Math word Problems

Chatterjee

Pandey

Aashish

et al.

2021

Preprint
