Humans (e.g., crowdworkers) have a remarkable ability to solve different tasks simply by reading the textual instructions that define them and looking at a few examples. Despite the success of conventional supervised learning on individual datasets, such models often struggle with generalization across tasks (e.g., a question-answering system cannot solve classification tasks). A long-standing challenge in AI is to build a model that learns a new task by understanding the human-readable instructions that define it. To study this, we introduce NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances (input-output pairs). The instructions are obtained from the crowdsourcing instructions used to create existing NLP datasets and are mapped to a unified schema. Using this meta-dataset, we measure cross-task generalization by training models on seen tasks and measuring generalization to the remaining unseen ones. We adopt generative pre-trained language models to encode task-specific instructions along with the input and to generate the task output. Our results indicate that models benefit from instructions when evaluated in terms of generalization to unseen tasks (19% better for models utilizing instructions). These models, however, are far behind an estimated performance upper bound, indicating significant room for progress in this direction.
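To make the setup concrete, below is a minimal sketch of instruction-conditioned inference for cross-task evaluation. It assumes a HuggingFace seq2seq checkpoint (t5-small here as a stand-in); the prompt layout, field names, and example task are illustrative, not the paper's exact schema.

```python
# Minimal sketch of instruction-conditioned cross-task evaluation.
# Assumes a HuggingFace seq2seq checkpoint (t5-small as a stand-in);
# the prompt layout below is illustrative, not the paper's exact schema.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def encode_task(instruction: str, instance_input: str) -> str:
    # Concatenate the task instruction with the instance input so the
    # model conditions on both when generating the output.
    return f"Definition: {instruction}\nInput: {instance_input}\nOutput:"

# Evaluation on an *unseen* task: in the benchmark, the model is
# fine-tuned only on other ("seen") tasks, so success here reflects
# generalization through the instruction text alone.
prompt = encode_task(
    "Answer the question using the given passage.",
    "Passage: The Nile flows north. Question: Which way does the Nile flow?",
)
ids = tokenizer(prompt, return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(ids, max_new_tokens=16)[0],
                       skip_special_tokens=True))
```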
How can model designers turn task instructions into effective prompts for language models? Backed by extensive empirical analysis on GPT3, we observe important features of successful instructional prompts and propose several reframing techniques that model designers can use to create such prompts. For example, a complex task can be decomposed into multiple simpler tasks (Fig. 1a). We experiment on 12 NLP tasks across 6 diverse categories (question generation, classification, etc.). Our results show that reframing improves few-shot learning performance by 14% while reducing sample complexity over existing few-shot baselines. The performance gains are particularly important for large language models, such as GPT3, where tuning models or prompts on large datasets is not feasible. Furthermore, we observe that such gains are not limited to GPT3; the reframed tasks remain superior to raw instructions across different model architectures, underscoring the cross-model generality of these guidelines. We hope these empirically driven techniques will pave the way for more effective ways to prompt LMs in the future.
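The decomposition idea can be sketched in a few lines of code. In the sketch below, `query_lm` is a hypothetical helper standing in for whatever LM endpoint is in use, and the two prompting styles are illustrative rather than the paper's exact prompts.

```python
# Sketch of task decomposition (one reframing technique): instead of one
# monolithic prompt for a hard task, the task is split into simpler
# sub-prompts answered in sequence. `query_lm` is a hypothetical helper
# standing in for whatever LM endpoint is in use.
def query_lm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LM of choice")

def generate_question_raw(passage: str) -> str:
    # Raw instruction: a single complex prompt.
    return query_lm(
        "Read the passage and write a question whose answer is a number "
        f"mentioned in it.\nPassage: {passage}\nQuestion:"
    )

def generate_question_reframed(passage: str) -> str:
    # Reframed: step 1 extracts a numeric answer, step 2 writes a
    # question targeting that answer -- two simpler tasks.
    answer = query_lm(
        f"List a number mentioned in this passage.\nPassage: {passage}\nNumber:"
    )
    return query_lm(
        f"Write a question about the passage whose answer is '{answer}'.\n"
        f"Passage: {passage}\nQuestion:"
    )
```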
How can we measure the generalization of models to a variety of unseen tasks when they are provided with language instructions? To facilitate progress toward this goal, we introduce NATURAL-INSTRUCTIONS v2, a benchmark of 1,600+ diverse language tasks and their expert-written instructions. It covers 70+ distinct task types, such as tagging, in-filling, and rewriting. These tasks were collected with contributions from NLP practitioners in the community and through an iterative peer-review process to ensure their quality. With this large and diverse collection of tasks, we are able to rigorously benchmark cross-task generalization of models: training on a subset of tasks and evaluating on the remaining unseen ones. For instance, we quantify generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances per task, and model size. Based on these insights, we introduce Tk-INSTRUCT, an encoder-decoder Transformer trained to follow a variety of in-context instructions (plain-language task definitions or k-shot examples), which outperforms existing larger models on our benchmark. We hope this benchmark facilitates future progress toward more general-purpose language understanding models.
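Below is a minimal sketch of querying a Tk-INSTRUCT-style model with an in-context instruction plus one positive example. The checkpoint name is assumed from the public release (verify it before use), and the Definition / Positive Example prompt layout is abbreviated here rather than copied verbatim from the benchmark.

```python
# Sketch of inference with a Tk-INSTRUCT-style model: the input is the
# task definition plus one demonstration, followed by the new instance.
# The checkpoint name is an assumption about the public release; verify
# it before running.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "allenai/tk-instruct-3b-def-pos"  # assumed released checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

prompt = (
    "Definition: In this task, you are given a sentence and must label "
    "its sentiment as 'positive' or 'negative'.\n"
    "Positive Example: Input: I loved this movie. Output: positive\n"
    "Now complete the following example.\n"
    "Input: The plot was dull and predictable.\nOutput:"
)
ids = tokenizer(prompt, return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(ids, max_new_tokens=8)[0],
                       skip_special_tokens=True))
```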