Prakhar Gupta scite author profile

Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e., semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.


Learning Word Vectors for 157 Languages

Grave, Bojanowski, Gupta et al. 2018

Preprint


Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. A key ingredient in the successful application of these representations is to train them on very large corpora and use these pre-trained models in downstream tasks. In this paper, we describe how we trained such high-quality word representations for 157 languages. We used two sources of data to train these models: the free online encyclopedia Wikipedia and data from the Common Crawl project. We also introduce three new word analogy datasets to evaluate these word vectors, for French, Hindi and Polish. Finally, we evaluate our pre-trained word vectors on 10 languages for which evaluation datasets exist, showing very strong performance compared to previous models.


A Helical Cauchy-Born Rule for Special Cosserat Rod Modeling of Nano and Continuum Rods

Kumar, Gupta 2015

J Elast


Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References

Gupta, Mehri, Zhao et al. 2019


The aim of this paper is to mitigate the shortcomings of automatic evaluation of open-domain dialog systems through multi-reference evaluation. Existing metrics have been shown to correlate poorly with human judgement, particularly in open-domain dialog. One alternative is to collect human annotations for evaluation, which can be expensive and time-consuming. To demonstrate the effectiveness of multi-reference evaluation, we augment the test set of DailyDialog with multiple references. A series of experiments shows that the use of multiple references results in improved correlation between several automatic metrics and human judgement for both the quality and the diversity of system output.


Deep Convolutional Neural Network with Transfer Learning for Detecting Pneumonia on Chest X-Rays

Chhikara, Singh, Gupta et al. 2019


Modeling flexoelectricity in soft dielectrics at finite deformation

Codony, Gupta, Marco et al. 2021

Journal of the Mechanics and Physics of Solids


Better Word Embeddings by Disentangling Contextual n-Gram Information

Gupta, Pagliardini, Jäggi 2019


Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings results in improved unigram embeddings. We claim that training word embeddings along with higher n-gram embeddings helps remove contextual information from the unigrams, resulting in better stand-alone word embeddings. We empirically show the validity of our hypothesis by outperforming other competing word representation models by a significant margin on a wide variety of tasks. We make our models publicly available.


Obtaining Better Static Word Embeddings Using Contextual Embedding Models

Gupta, Jäggi 2021


The advent of contextual word embeddings (representations of words which incorporate semantic and syntactic information from their context) has led to tremendous improvements on a wide variety of NLP tasks. However, recent contextual models have prohibitively high computational cost in many use cases and are often hard to interpret. In this work, we demonstrate that our proposed distillation method, which is a simple extension of CBOW-based training, significantly improves the computational efficiency of NLP applications, while outperforming the quality of existing static embeddings trained from scratch as well as those distilled from previously proposed methods. As a side effect, our approach also allows a fair comparison of both contextual and static embeddings via standard lexical evaluation tasks.


