Jesse Dodge scite author profile

Vector space word representations are learned from distributional information of words in large corpora. Although such statistics are semantically informative, they disregard the valuable information that is contained in semantic lexicons such as WordNet, FrameNet, and the Paraphrase Database. This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes no assumptions about how the input vectors were constructed. Evaluated on a battery of standard lexical semantic evaluation tasks in several languages, we obtain substantial improvements starting with a variety of word vector models. Our refinement method outperforms prior techniques for incorporating semantic lexicons into word vector training algorithms.

show abstract

Key-Value Memory Networks for Directly Reading Documents

Miller

et al. 2016

View full text Add to dashboard Cite

Directly reading documents and being able to answer questions from them is an unsolved challenge. To avoid its inherent difficulty, question answering (QA) has been directed towards using Knowledge Bases (KBs) instead, which has proven effective. Unfortunately KBs often suffer from being too restrictive, as the schema cannot support certain types of answers, and too sparse, e.g. Wikipedia contains much more information than Freebase. In this work we introduce a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation. To compare using KBs, information extraction or Wikipedia documents directly in a single framework we construct an analysis tool, WIKIMOVIES, a QA dataset that contains raw text alongside a preprocessed KB, in the domain of movies. Our method reduces the gap between all three settings. It also achieves state-of-the-art results on the existing WIKIQA benchmark.

show abstract

Green AI

et al. 2020

View full text Add to dashboard Cite

show abstract

Show Your Work: Improved Reporting of Experimental Results

Dodge¹,

Gururangan²,

Card³

et al. 2019

154

146

View full text Add to dashboard Cite

Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results. In this paper, we demonstrate that test-set performance scores alone are insufficient for drawing accurate conclusions about which model performs best. We argue for reporting additional details, especially performance on validation data obtained during model development. We present a novel technique for doing so: expected validation performance of the best-found model as a function of computation budget (i.e., the number of hyperparameter search trials or the overall training time). Using our approach, we find multiple recent model comparisons where authors would have reached a different conclusion if they had used more (or less) computation. Our approach also allows us to estimate the amount of computation required to obtain a given accuracy; applying it to several recently published results yields massive variation across papers, from hours to weeks. We conclude with a set of best practices for reporting experimental results which allow for robust future comparisons, and provide code to allow researchers to use our technique. 1

show abstract

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping

Dodge¹,

Ilharco²,

Schwartz³

et al. 2020

Preprint

141

View full text Add to dashboard Cite

Understanding and predicting importance in images

et al. 2012

View full text Add to dashboard Cite

What do people care about in an image? To drive computational visual recognition toward more human-centric outputs, we need a better understanding of how people perceive and judge the importance of content in images. In this paper, we explore how a number of factors relate to human perception of importance. Proposed factors fall into 3 broad types: 1) factors related to composition, e.g. size, location, 2) factors related to semantics, e.g. category of object or scene, and 3) contextual factors related to the likelihood of attribute-object, or object-scene pairs. We explore these factors using what people describe as a proxy for importance. Finally, we build models to predict what will be described about an image given either known image content, or image content estimated automatically by recognition systems.

show abstract

Key-Value Memory Networks for Directly Reading Documents

Miller¹,

Fisch²,

Dodge³

et al. 2016

Preprint

105

View full text Add to dashboard Cite

Green AI

Schwartz¹,

Dodge²,

Smith³

et al. 2019

Preprint

106

View full text Add to dashboard Cite

12 3 4 5

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

334 Leonard St

Brooklyn, NY 11211

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Jesse Dodge

Retrofitting Word Vectors to Semantic Lexicons

Key-Value Memory Networks for Directly Reading Documents

Green AI

Show Your Work: Improved Reporting of Experimental Results

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping

Understanding and predicting importance in images

Key-Value Memory Networks for Directly Reading Documents

Green AI

Contact Info

Product

Resources

About