Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
DOI: 10.18653/v1/2022.emnlp-main.277
Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

Abstract: Socratic questioning is an educational method that allows students to discover answers to complex problems by asking them a series of thoughtful questions. Generating didactically sound questions is challenging and requires an understanding of the reasoning process involved in the problem. We hypothesize that such a questioning strategy can not only enhance human performance, but also assist math word problem (MWP) solvers. In this work, we explore the ability of large language models (LMs) in generating s…

Cited by 8 publications (6 citation statements). References 41 publications.
“…Decomposing Multi-Step Reasoning Tasks Solving multi-step reasoning tasks like MWPs has been a popular area of research for the last couple of years (Hosseini et al., 2014; Roy et al., 2015; Amini et al., 2019; Zhang et al., 2020; Shridhar et al., 2022; Opedal et al., 2023). However, the majority of the modern approaches for these problems are shifting towards using large language models, often relying on approaches involving prompting or in-context learning (Cobbe et al., 2021; Kojima et al., 2022; Wei et al., 2022b; Chowdhery et al., 2022; Lewkowycz et al., 2022; Srivastava et al., 2022).…”
Section: Related Work
confidence: 99%
“…We found that the fine-tuned GPT-2 predicted an incorrect number of subquestions for the majority of problems (see Table 4, first row). Thus, following previous work on subquestion generation (Shridhar et al., 2022), we introduced a guidance mechanism that conditions the generation of subquestions for a problem P on the equations describing the intermediate solutions of P. This strategy improved the quality of the generated questions for all three metrics considered (Table 4, second row).…”
Section: Ablation Studies
confidence: 99%
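
As a concrete illustration of the equation-conditioned guidance described in the statement above, the minimal sketch below prepends a problem's intermediate-solution equations to the prompt of a GPT-2 model loaded through Hugging Face transformers. The prompt format, the checkpoint name, and the example problem are assumptions made for illustration; they are not the cited papers' exact setup, and in practice the model would be fine-tuned on (problem, equations, subquestions) triples.

```python
# Minimal sketch (not the authors' exact method): condition subquestion
# generation on the intermediate-solution equations of a math word problem
# by prepending them to the prompt of a GPT-2 language model.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Hypothetical checkpoint; a real setup would use a GPT-2 fine-tuned on
# (problem, equations, subquestions) triples rather than the base model.
MODEL_NAME = "gpt2"

tokenizer = GPT2TokenizerFast.from_pretrained(MODEL_NAME)
model = GPT2LMHeadModel.from_pretrained(MODEL_NAME)

problem = (
    "Weng earns $12 an hour for babysitting. Yesterday she babysat for "
    "50 minutes. How much did she earn?"
)
# Equation-based guidance: the intermediate solution steps of the problem.
equations = ["12 / 60 = 0.2", "0.2 * 50 = 10"]

# One possible input format: problem text, then the guiding equations,
# then a marker after which the model is expected to emit subquestions.
prompt = f"Problem: {problem}\nEquations: {' ; '.join(equations)}\nSubquestions:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens (the subquestions).
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```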
“…With the emergence of the SQuAD dataset (Rajpurkar et al., 2016), context-dependent QG gained momentum (Du et al., 2017; Yuan et al., 2017; Subramanian et al., 2018; Puri et al., 2020). This extended to complex tasks like generating unanswerable questions (Choi et al., 2018; Zhu et al., 2019; Reddy et al., 2019) and multi-hop reasoning (Pan et al., 2020, 2021; Shridhar et al., 2022). Our work, focusing on generating code tracing questions in the CS education domain, addresses unique challenges around code, natural language, and pedagogical comprehension that are inadequately covered by previous methods due to a lack of specialized datasets.…”
Section: Related Work
confidence: 99%
“…More recently, Patel et al. (2022) propose an alternative approach to enhance the performance of LLMs by decomposing challenging questions into simpler sub-questions across various tasks. Notably, the efficacy of question decomposition has been demonstrated across a range of tasks and domains, including solving mathematical problems (Shridhar et al., 2022), medical question answering (Roberts et al., 2014), and factual correction (Huang et al., 2023).…”
Section: Related Work
confidence: 99%
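
To make the question-decomposition idea referenced in the statement above concrete, here is a minimal, self-contained sketch. The prompt wording and the `llm` callable are assumptions for illustration: `llm` is a hypothetical stand-in mapping a prompt string to a model completion, not a specific library's API. The model is first asked to break a question into numbered sub-questions, which are then answered in sequence with earlier answers fed back into the context.

```python
# Minimal sketch of question decomposition with an LLM. `llm` is a
# hypothetical callable (prompt string -> completion string) standing in
# for any concrete model API.
from typing import Callable, List


def decompose(question: str, llm: Callable[[str], str]) -> List[str]:
    """Ask the model for a numbered list of simpler sub-questions."""
    prompt = (
        "Break the question below into a numbered list of simpler "
        "sub-questions that, when answered in order, solve it.\n\n"
        f"Question: {question}\nSub-questions:\n1."
    )
    completion = "1." + llm(prompt)  # re-attach the seeded "1."
    subquestions = []
    for line in completion.splitlines():
        line = line.strip()
        # Keep lines that look like "3. <sub-question text>".
        if line and line[0].isdigit() and "." in line:
            subquestions.append(line.split(".", 1)[1].strip())
    return subquestions


def answer_by_decomposition(question: str, llm: Callable[[str], str]) -> str:
    """Answer each sub-question in turn, feeding earlier answers back in."""
    context = f"Question: {question}\n"
    for sub in decompose(question, llm):
        context += f"Sub-question: {sub}\nAnswer:"
        context += " " + llm(context).strip() + "\n"
    context += "Final answer:"
    return llm(context).strip()
```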