Kumar Shridhar scite author profile

Kumar Shridhar

12 Publications

58 Citation Statements Received

263 Citation Statements Given

Affiliations

ETH Zurich, University of Kaiserslautern

Publications

Subword Semantic Hashing for Intent Classification on Small Datasets

Shridhar, Dash, Sahu et al. 2019

In this paper, we introduce the use of Semantic Hashing as an embedding for the task of Intent Classification and achieve state-of-the-art performance on three frequently used benchmarks. Intent Classification on a small dataset is a challenging task for data-hungry, state-of-the-art, Deep Learning based systems. Semantic Hashing is an attempt to overcome this challenge and learn robust text classification. Current word-embedding-based methods [11], [13], [14] depend on vocabularies. One of the major drawbacks of such methods is out-of-vocabulary terms, especially when the training dataset is small and the vocabulary in use is wide. This is the case in Intent Classification for chatbots, where typically small datasets are extracted from internet communication. Two problems arise with the use of internet communication. First, such datasets lack many of the vocabulary terms needed to use word embeddings effectively. Second, users frequently make spelling errors. Typically, models for intent classification are not trained with spelling errors, and it is difficult to anticipate the ways in which users will make mistakes. Models depending on a word vocabulary will always face such issues. An ideal classifier should handle spelling errors inherently. With Semantic Hashing, we overcome these challenges and achieve state-of-the-art results on three datasets: Chatbot, Ask Ubuntu, and Web Applications [3]. Our benchmarks are available online.
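The abstract above does not spell out how the hashing is done. As a rough illustration only, the sketch below assumes one common way to build subword features: each token is wrapped in `#` boundary markers and split into character trigrams. The function names `subword_trigrams` and `overlap` are hypothetical and not taken from the paper. The sketch shows why a misspelled word still shares most of its features with the correct spelling, whereas a whole-word vocabulary lookup would treat the two as unrelated.

```python
import re


def subword_trigrams(text: str) -> list[str]:
    """Lowercase the text, wrap each token in '#', and emit character trigrams.

    Assumption: trigram size and '#' boundary markers are illustrative choices.
    """
    tokens = re.findall(r"\w+", text.lower())
    grams: list[str] = []
    for tok in tokens:
        padded = f"#{tok}#"
        grams.extend(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def overlap(a: str, b: str) -> float:
    """Jaccard overlap between the trigram sets of two strings."""
    sa, sb = set(subword_trigrams(a)), set(subword_trigrams(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    print(subword_trigrams("hello"))
    # A misspelling keeps most trigrams; an unrelated word keeps none.
    print(overlap("restaurant", "restarant"))
    print(overlap("restaurant", "xyz"))
```

In a classifier, these trigram lists would typically be fed to a sparse vectorizer in place of word tokens; that step is omitted here.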

End to End Binarized Neural Networks for Text Classification

Shridhar, Jain, Agarwal et al. 2020

Deep neural networks have demonstrated superior performance in almost every Natural Language Processing task; however, their increasing complexity raises concerns. A particular concern is that these networks place high demands on computing hardware and training budgets. The state-of-the-art transformer models are a vivid example. Simplifying the computations performed by a network is one way of addressing this increasing complexity. In this paper, we propose an end-to-end binarized neural network for the task of intent and text classification. To fully utilize the potential of end-to-end binarization, both the input representations (vector embeddings of token statistics) and the classifier are binarized. We demonstrate the efficiency of such a network on the intent classification of short texts over three datasets and on text classification with a larger dataset. On the considered datasets, the proposed network achieves results comparable to the state of the art while using ∼20-40% less memory and training time than the benchmarks.

HyperEmbed: Tradeoffs Between Resources and Performance in NLP Tasks with Hyperdimensional Computing Enabled Embedding of n-gram Statistics

Alonso, Shridhar, Kleyko et al. 2021

Scaling Within Document Coreference to Long Texts

Thirukovalluru, Monath, Shridhar et al. 2021

State-of-the-art end-to-end coreference resolution models use expensive span representations and antecedent prediction mechanisms. These approaches are costly in both memory and compute time, and are particularly ill-suited for long documents. In this paper, we propose an approximation to end-to-end models that scales gracefully to documents of any length. Replacing span representations with token representations, we reduce the time and memory complexity via token windows and nearest-neighbor sparsification methods for more efficient antecedent prediction. We show that our approach reduces training and inference time compared to state-of-the-art methods with only a minimal loss in accuracy.

Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages

Jain, Deshpande, Shridhar et al. 2020

Preprint

Language models based on the Transformer architecture [1] have achieved state-of-the-art performance on a wide range of natural language processing (NLP) tasks such as text classification, question-answering, and token classification. However, this performance is usually tested and reported on high-resource languages, like English, French, Spanish, and German. Indian languages, on the other hand, are underrepresented in such benchmarks. Despite some Indian languages being included in training multilingual Transformer models, they have not been the primary focus of such work. In order to evaluate the performance on Indian languages specifically, we analyze these language models through extensive experiments on multiple downstream tasks in Hindi, Bengali, and Telugu. Here, we compare the efficacy of fine-tuning model parameters of pre-trained models against that of training a language model from scratch. Moreover, we argue empirically against a strict dependency between dataset size and model performance and instead encourage task-specific model and method selection. We achieve state-of-the-art performance on Hindi and Bengali for the text classification task. Finally, we present effective strategies for handling the modeling of Indian languages and we release our model checkpoints for the community: https://huggingface.co/neuralspace-reverie.

Distilling Multi-Step Reasoning Capabilities of Large Language Models into Smaller Models via Semantic Decompositions

Shridhar, Stolfo, Sachan 2022

Preprint

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Miladinović, Shridhar, Jain et al. 2022

Preprint

Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

Shridhar, Jakub, El‐Assady et al. 2022

Socratic questioning is an educational method that allows students to discover answers to complex problems by asking them a series of thoughtful questions. Generating didactically sound questions is challenging and requires an understanding of the reasoning process involved in the problem. We hypothesize that such a questioning strategy can not only enhance human performance but also assist math word problem (MWP) solvers. In this work, we explore the ability of large language models (LMs) to generate sequential questions for guiding math word problem-solving. We propose various guided question generation schemes based on input conditioning and reinforcement learning. On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver. We conduct a preliminary user study to examine the potential value of such question generation models in the education domain. Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance. We discuss the future of using such questioning strategies in education. https://github.com/eth-nlped/scaffolding-generation
