Dhruva Sahrawat scite author profile

Dhruva Sahrawat

5 Publications

85 Citation Statements Received

90 Citation Statements Given

Affiliations

National University of Singapore, Indraprastha Institute of Information Technology Delhi

Publications

Keyphrase Extraction as Sequence Labeling Using Contextualized Embeddings

Sahrawat, Mahata, Zhang, et al. 2020

In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings. We evaluate the proposed architecture using both contextualized and fixed word embedding models on three different benchmark datasets (Inspec, SemEval 2010, SemEval 2017), and compare with existing popular unsupervised and supervised techniques. Our results quantify the benefits of: (a) using contextualized embeddings (e.g., BERT) over fixed word embeddings (e.g., GloVe); (b) using a BiLSTM-CRF architecture with contextualized word embeddings over fine-tuning the contextualized word embedding model directly; and (c) using genre-specific contextualized embeddings (SciBERT). Through error analysis, we also provide some insights into why particular models work better than others. Lastly, we present a case study where we analyze different self-attention layers of the two best models (BERT and SciBERT) to better understand the predictions made by each for the task of keyphrase extraction.
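To make the sequence labeling formulation concrete, here is a minimal sketch of how keyphrases can be encoded as per-token tags and decoded back. It assumes a plain B/I/O tag scheme and whitespace-style tokens; the abstract does not state the exact tag set, and the helper names `tokens_to_bio` and `bio_to_phrases` are hypothetical. The tagger itself (the BiLSTM-CRF over contextualized embeddings) is not shown.

```python
from typing import List


def tokens_to_bio(tokens: List[str], keyphrases: List[List[str]]) -> List[str]:
    """Tag each token as B (begins a keyphrase), I (inside one), or O (outside).

    Assumption: a simple B/I/O scheme with case-insensitive exact matching.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for phrase in keyphrases:
        target = [p.lower() for p in phrase]
        n = len(target)
        if n == 0:
            continue  # nothing to match for an empty phrase
        for start in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[start:start + n])
            if lowered[start:start + n] == target and window_free:
                tags[start] = "B"
                for k in range(start + 1, start + n):
                    tags[k] = "I"
    return tags


def bio_to_phrases(tokens: List[str], tags: List[str]) -> List[str]:
    """Decode a B/I/O tag sequence back into keyphrase strings."""
    phrases: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # An O tag (or a stray I with no open phrase) closes any open phrase.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


tokens = ["We", "use", "contextualized", "embeddings", "for", "keyphrase", "extraction"]
gold = [["contextualized", "embeddings"], ["keyphrase", "extraction"]]
tags = tokens_to_bio(tokens, gold)
phrases = bio_to_phrases(tokens, tags)
```

In this framing, a trained model would predict `tags` from `tokens`, and `bio_to_phrases` would turn those predictions into the extracted keyphrases.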

Harnessing GANs for Zero-Shot Learning of New Classes in Visual Speech Recognition

Kumar, Sahrawat, Maheshwari, et al. 2020

AAAI

Visual Speech Recognition (VSR) is the process of recognizing or interpreting speech by watching the lip movements of the speaker. Recent machine learning based approaches model VSR as a classification problem; however, the scarcity of training data leads to error-prone systems with very low accuracies in predicting unseen classes. To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases. We also show that our models are language agnostic and therefore capable of seamlessly generating, using English training data, videos for a new language (Hindi). To the best of our knowledge, this is the first work to show empirical evidence of the use of GANs for generating training samples of unseen classes in the domain of VSR, hence facilitating zero-shot learning. We make the added videos for new classes publicly available along with our code.

Hush-Hush Speak: Speech Reconstruction Using Silent Videos

Uttam, Kumar, Sahrawat, et al. 2019

Speech reconstruction is the task of recreating speech from silent videos. In the literature, it is also referred to as lipreading. In this paper, we design an encoder-decoder architecture which takes silent videos as input and outputs an audio spectrogram of the reconstructed speech. The model, despite being speaker-independent, achieves results on speech reconstruction comparable to those of the current state-of-the-art speaker-dependent model. We also perform user studies to infer speech intelligibility. Additionally, we test the usability of the trained model using bilingual speech.

Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer

Goswami, Mehta, Sahrawat, et al. 2020

Preprint

Developing nations lack an adequate number of hospitals with modern equipment and skilled doctors. Hence, a significant proportion of these nations' populations, particularly in rural areas, cannot access specialized and timely healthcare. In recent years, deep learning (DL) models, a class of artificial intelligence (AI) methods, have shown impressive results in the medical domain. These AI methods can provide immense support to developing nations as affordable healthcare solutions. This work focuses on one such application: blood cancer diagnosis. However, DL models face challenges in cancer research because large datasets for adequate training are unavailable and because heterogeneity in the data is difficult to capture at different levels, ranging from acquisition characteristics and session to subject level (within subjects and across subjects). These challenges render DL models prone to overfitting, and hence the models generalize poorly to prospective subjects' data. In this work, we address these problems in the application of B-cell Acute Lymphoblastic Leukemia (B-ALL) diagnosis using deep learning. We propose a heterogeneity loss that captures subject-level heterogeneity, thereby forcing the neural network to learn subject-independent features. We also propose an unorthodox ensemble strategy that improves classification over models trained on 7 folds, giving a weighted F1 score of 95.26% on unseen (test) subjects' data, which is, so far, the best result on the C-NMC 2019 dataset for B-ALL classification.
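For readers unfamiliar with the metric reported above, the following sketch computes a weighted F1 score under the common definition: the per-class F1 averaged with weights proportional to each class's share of the true labels. The abstract does not spell out the definition, so this reading is an assumption; the function name `weighted_f1` and the toy labels are hypothetical and unrelated to the C-NMC 2019 data.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted average of per-class F1 scores.

    Assumption: weights are each class's fraction of the true labels.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = len(y_true)
    if total == 0:
        return 0.0
    support = Counter(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = (2 * tp / denom) if denom else 0.0
        score += (count / total) * f1
    return score


# Toy example with hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0]
y_pred = [1, 1, 0, 0]
result = weighted_f1(y_true, y_pred)
```

Here class 1 has F1 = 4/5 with weight 3/4 and class 0 has F1 = 2/3 with weight 1/4, so the weighted score is 23/30.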

A Multi-task Learning Framework for Road Attribute Updating via Joint Analysis of Map Data and GPS Traces

Yin, Varadarajan, Wang, et al. 2020
