Findings of the Association for Computational Linguistics: ACL 2023
DOI: 10.18653/v1/2023.findings-acl.441

Distilling Reasoning Capabilities into Smaller Language Models

Abstract: Step-by-step reasoning approaches like chain of thought (CoT) have proved to be very effective in inducing reasoning capabilities in large language models. However, the success of the CoT approach is fundamentally tied to model size, and billion-parameter-scale models are often needed to get CoT to work. In this paper, we propose a knowledge distillation approach that leverages the step-by-step CoT reasoning capabilities of larger models and distills these abilities into smaller models. In this work, we pro…

Cited by 16 publications (9 citation statements)
References 24 publications
“…Specifically, each round of learning consists of three key steps: (1) The student LM undergoes an "exam" on the training set to collect mistakes, i.e., the wrongly generated rationales. Existing works (Fu et al., 2023b; Ho et al., 2023; Shridhar et al., 2023; Magister et al., 2023) merely provide the sample question for the LLM to collect annotated rationales, neglecting the importance of the student's feedback. However, the student's feedback is crucial in knowledge distillation (Fu et al., 2021; Pham et al., 2021; Ren et al., 2023).…”
Section: Methods
confidence: 99%
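The "exam" step quoted above can be sketched in code. The following is a minimal, hypothetical illustration only: the `student_generate` and `answers_match` callables stand in for the student LM and answer-checking logic, which the cited work does not specify here.

```python
# Sketch of step (1) of the multi-round distillation loop described above:
# the student LM takes an "exam" on the training set, and its wrongly
# generated rationales are collected as feedback for the teacher LLM.
# student_generate and answers_match are hypothetical placeholders.

def collect_mistakes(train_set, student_generate, answers_match):
    """Run the student on each question; keep the wrongly generated rationales."""
    mistakes = []
    for example in train_set:
        rationale, predicted = student_generate(example["question"])
        if not answers_match(predicted, example["answer"]):
            mistakes.append({
                "question": example["question"],
                "wrong_rationale": rationale,
                "predicted": predicted,
                "gold": example["answer"],
            })
    return mistakes

# Toy usage with stubbed model functions:
train_set = [
    {"question": "2 + 2", "answer": "4"},
    {"question": "3 * 3", "answer": "9"},
]
stub_student = lambda q: ("some rationale", "4")  # student always answers "4"
match = lambda a, b: a == b
mistakes = collect_mistakes(train_set, stub_student, match)
print(len(mistakes))  # the student misses "3 * 3"
```

In the multi-round paradigm the citing work describes, these collected mistakes would then be sent to the teacher LLM for targeted rationale annotation in the subsequent steps.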
“…However, these works aim to improve the instruction-following ability of smaller LMs, while the reasoning ability that we focus on is often overlooked. Some recent studies (Ho et al., 2023; Fu et al., 2023b; Shridhar et al., 2023) propose to employ LLMs to annotate rationales for training smaller student LMs towards reasoning, without considering the student's feedback to the teacher. In contrast, we exploit the potential of the black-box LLM as the teacher instead of the data annotator by proposing a multi-round learning paradigm.…”
Section: Related Work
confidence: 99%
“…Answering this question is essential whenever we require formal guarantees of the correctness of the outputs generated by an LM. For example, one might ask a language model to solve a mathematical problem based on a textual description (Shridhar et al., 2023) or ask it to find an optimal solution to an everyday optimization problem (Lin et al., 2021, Fig. 1).…”
Section: Introduction
confidence: 99%