Proceedings of the 2022 ACM Conference on International Computing Education Research - Volume 1 2022
DOI: 10.1145/3501385.3543957
Automatic Generation of Programming Exercises and Code Explanations Using Large Language Models

Cited by 180 publications (57 citation statements)
References 62 publications
“…Large language models can not only support the assessment of students' solutions but also assist in the automatic generation of exercises. Using few-shot learning, [40] showed that the OpenAI Codex model is able to provide a variety of programming tasks together with the correct solution, automated tests to verify the students' solutions, and additional code explanations. With regard to testing factual knowledge in general, [41] proposed a framework to automatically generate question-answer pairs.…”
Section: Review Of Research Applying Large Language (citation type: mentioning)
confidence: 99%
“…Figure 3 shows an example where the overall feedback is of poor quality and successfully rejected, though parts of the generated explanation are correct; this could potentially be useful for tutors in a human-in-the-loop approach. When comparing PyFiXV P≥70 with any other technique in Figure 6a, the results are significantly different w.r.t. χ² tests [41] (p ≤ 0.0001); here, we use contingency tables with two rows (techniques) and four columns (240 data points mapped to four possible precision/coverage outcomes).…”
Section: Results (citation type: mentioning)
confidence: 91%
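The χ² analysis quoted above compares techniques via a 2×4 contingency table (two techniques as rows, four precision/coverage outcomes as columns). A minimal pure-Python sketch of such a test is shown below; the counts are hypothetical placeholders for illustration only and do not reproduce the cited study's data, and the helper name `chi2_statistic` is ours, not from the paper.

```python
def chi2_statistic(table):
    """Pearson chi-squared statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column factors.
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: each technique evaluated on 240 data points,
# mapped to four precision/coverage outcomes (as in the quoted setup).
table = [
    [150, 40, 30, 20],  # technique A
    [100, 60, 50, 30],  # technique B
]
dof = (len(table) - 1) * (len(table[0]) - 1)  # (2-1) * (4-1) = 3
print(f"chi2 = {chi2_statistic(table):.2f} with {dof} degrees of freedom")
```

Comparing the resulting statistic against the χ² distribution with 3 degrees of freedom (or using `scipy.stats.chi2_contingency`, which also returns the p-value) would then yield the kind of significance result the quote reports.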
“…students in a large introductory programming course [3]. Subsequently, recent works have shown promising results in using Codex in various programming education scenarios, including generating new programming assignments [4], providing code explanations [5], and enhancing programming error messages [6].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Similarly, BERT-generated doctor-patient dialogues were also found to be indistinguishable from actual doctor-patient dialogues, which can be used to create virtual standard patients for medical students' diagnosis practice training [57]. Additionally, for introductory programming courses, the state-of-the-art LLM Codex could generate sensible and novel exercises for students along with an appropriate sample solution (around three out of four times) and accurate code explanations (67% accuracy) [45].…”
Section: Practical Challenges - RQ2 (citation type: mentioning)
confidence: 96%