Findings of the Association for Computational Linguistics: ACL 2023
DOI: 10.18653/v1/2023.findings-acl.441

Distilling Reasoning Capabilities into Smaller Language Models

Abstract: Step-by-step reasoning approaches like chain of thought (CoT) have proved to be very effective in inducing reasoning capabilities in large language models. However, the success of the CoT approach is fundamentally tied to model size, and billion-parameter-scale models are often needed to get CoT to work. In this paper, we propose a knowledge distillation approach that leverages the step-by-step CoT reasoning capabilities of larger models and distills these abilities into smaller models. In this work, we pro…

Cited by 16 publications (9 citation statements)
References 24 publications
“…Specifically, each round of learning consists of three key steps: (1) The student LM undergoes an "exam" on the training set to collect mistakes, i.e., the wrongly generated rationales. Existing works (Fu et al., 2023b; Ho et al., 2023; Shridhar et al., 2023; Magister et al., 2023) merely provide the sample question for the LLM to collect annotated rationales, neglecting the importance of the student's feedback. However, the student's feedback is crucial in knowledge distillation (Fu et al., 2021; Pham et al., 2021; Ren et al., 2023).…”
Section: Methods
confidence: 99%
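The "exam" step quoted above can be sketched in code. The following is a minimal, hypothetical illustration only: the `student_generate` and `answers_match` callables stand in for the student LM and answer-checking logic, which the cited work does not specify here.

```python
# Sketch of step (1) of the multi-round distillation loop described above:
# the student LM takes an "exam" on the training set, and its wrongly
# generated rationales are collected as feedback for the teacher LLM.
# student_generate and answers_match are hypothetical placeholders.

def collect_mistakes(train_set, student_generate, answers_match):
    """Run the student on each question; keep the wrongly generated rationales."""
    mistakes = []
    for example in train_set:
        rationale, predicted = student_generate(example["question"])
        if not answers_match(predicted, example["answer"]):
            mistakes.append({
                "question": example["question"],
                "wrong_rationale": rationale,
                "predicted": predicted,
                "gold": example["answer"],
            })
    return mistakes

# Toy usage with stubbed model functions:
train_set = [
    {"question": "2 + 2", "answer": "4"},
    {"question": "3 * 3", "answer": "9"},
]
stub_student = lambda q: ("some rationale", "4")  # student always answers "4"
match = lambda a, b: a == b
mistakes = collect_mistakes(train_set, stub_student, match)
print(len(mistakes))  # the student misses "3 * 3"
```

In the multi-round paradigm the citing work describes, these collected mistakes would then be sent to the teacher LLM for targeted rationale annotation in the subsequent steps.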
“…However, these works aim to improve the instruction-following ability of smaller LMs, while the reasoning ability that we focus on is often overlooked. Some recent studies (Ho et al., 2023; Fu et al., 2023b; Shridhar et al., 2023) propose to employ LLMs to annotate rationales for training smaller student LMs towards reasoning, without considering the student's feedback to the teacher. In contrast, we exploit the potential of the black-box LLM as the teacher instead of the data annotator by proposing a multi-round learning paradigm.…”
Section: Related Work
confidence: 99%
“…Answering this question is essential whenever we require formal guarantees of the correctness of the outputs generated by an LM. For example, one might ask a language model to solve a mathematical problem based on a textual description (Shridhar et al., 2023) or ask it to find an optimal solution to an everyday optimization problem (Lin et al., 2021, Fig. 1).…”
Section: Introduction
confidence: 99%