Hoyle, Alexander scite author profile

Studying the ways in which language is gendered has long been an area of interest in sociolinguistics. Studies have explored, for example, the speech of male and female characters in film and the language used to describe male and female politicians. In this paper, we aim not to merely study this phenomenon qualitatively, but instead to quantify the degree to which the language used to describe men and women is different and, moreover, different in a positive or negative way. To that end, we introduce a generative latent-variable model that jointly represents adjective (or verb) choice, with its sentiment, given the natural gender of a head (or dependent) noun. We find that there are significant differences between descriptions of male and female nouns and that these differences align with common gender stereotypes: Positive adjectives used to describe women are more often related to their bodies than adjectives used to describe men.

Improving Neural Topic Models using Knowledge Distillation

Alexander¹,

Goel²,

Resnik³

2020

Topic models are often used to identify human-interpretable topics to help make sense of large document collections. We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers. Our modular method can be straightforwardly applied with any neural topic model to improve topic quality, which we demonstrate using two models having disparate architectures, obtaining state-of-the-art topic coherence. We show that our adaptable framework not only improves performance in the aggregate over all estimated topics, as is commonly reported, but also in head-to-head comparisons of aligned topics.

Evaluation Examples are not Equally Informative: How should that change NLP Leaderboards?

Rodríguez¹,

Barrow²,

Alexander³

et al. 2021

Leaderboards are widely used in NLP and push the field forward. While leaderboards are a straightforward ranking of NLP models, this simplicity can mask nuances in evaluation items (examples) and subjects (NLP models). Rather than replace leaderboards, we advocate a re-imagining so that they better highlight if and where progress is made. Building on educational testing, we create a Bayesian leaderboard model where latent subject skill and latent item difficulty predict correct responses. Using this model, we analyze the ranking reliability of leaderboards. Afterwards, we show the model can guide what to annotate, identify annotation errors, detect overfitting, and identify informative examples. We conclude with recommendations for future benchmark tasks.

Improving Neural Topic Models using Knowledge Distillation

Alexander¹,

Goel²,

Resnik³

2020

Preprint

Promoting Graph Awareness in Linearized Graph-to-Text Generation

Alexander¹,

Marasović²,

Smith³

2021

Generating text from structured inputs, such as meaning representations or RDF triples, has often involved the use of specialized graph-encoding neural networks. However, recent applications of pretrained transformers to linearizations of graph inputs have yielded state-of-the-art generation results on graph-to-text tasks. Here, we explore the ability of these linearized models to encode local graph structures, in particular their invariance to the graph linearization strategy and their ability to reconstruct corrupted inputs. Our findings motivate solutions to enrich the quality of models' implicit graph encodings via scaffolding. Namely, we use graph-denoising objectives implemented in a multi-task text-to-text framework. We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.

Unsupervised Discovery of Gendered Language through Latent-Variable Modeling