Medical Education 2012: 46: 757–765

Context: Many tests of medical knowledge, from the undergraduate level through certification and licensure, contain multiple-choice items. Although these items efficiently measure examinees' knowledge and skills across diverse content areas, they are time-consuming and expensive to create. Changes in student assessment brought about by new forms of computer-based testing have created demand for large numbers of multiple-choice items, a demand that current approaches to item development cannot meet.

Methods: We present a methodology for developing multiple-choice items based on automatic item generation (AIG) concepts and procedures. We describe a three-stage approach to AIG and illustrate it by generating multiple-choice items for a medical licensure test in the content area of surgery.

Results: Our method generates multiple-choice items in three stages. Firstly, content specialists create a cognitive model. Secondly, item models are developed using the content from the cognitive model. Thirdly, computer software generates items from the item models. Using this methodology, we generated 1248 multiple-choice items from one item model.

Conclusions: Automatic item generation is a process in which models and computer technology are used to generate test items. With our method, content specialists identify and structure the content for the test items, and computer technology systematically combines that content to generate new test items automatically.
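The third stage described above, in which software systematically combines content from an item model, can be illustrated with a minimal sketch. The stem template, variable elements, and clinical values below are illustrative assumptions, not content from the study; the point is only that a single item model with a few variable elements yields many items through exhaustive combination.

```python
# Minimal sketch of AIG stage 3: generate item stems by systematically
# combining the variable elements of one item model.
from itertools import product

# A hypothetical item model: a stem template plus the elements that vary.
stem_template = (
    "A {age}-year-old {sex} presents with {finding}. "
    "What is the most appropriate next step?"
)
elements = {
    "age": ["25", "45", "65"],
    "sex": ["man", "woman"],
    "finding": ["acute abdominal pain", "a palpable mass"],
}

def generate_items(template, elements):
    """Yield one stem for every combination of element values."""
    keys = list(elements)
    for values in product(*(elements[k] for k in keys)):
        yield template.format(**dict(zip(keys, values)))

items = list(generate_items(stem_template, elements))
print(len(items))  # 3 * 2 * 2 = 12 combinations
```

With only three elements this toy model yields 12 stems; a realistic item model with more elements and more values per element can reach the hundreds or thousands of items reported in the study.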
We discuss the new challenges and directions facing the use of big data and artificial intelligence (AI) in education research, policy-making, and industry. In recent years, applications of big data and AI in education have made significant headway, highlighting a novel trend in leading-edge educational research. The convenience and embeddedness of data collection within educational technologies, paired with computational techniques, have made the analysis of big data a reality. We are moving beyond proof-of-concept demonstrations and applications of techniques, and are beginning to see substantial adoption in many areas of education. The key research trends in the domains of big data and AI are associated with assessment, individualized learning, and precision education. Model-driven data analytics approaches will grow quickly to guide the development, interpretation, and validation of these algorithms. However, conclusions from educational analytics should, of course, be applied with caution. At the education policy level, governments should be devoted to supporting lifelong learning, offering teacher education programs, and protecting personal data. With regard to the education industry, reciprocal and mutually beneficial relationships should be developed in order to enhance academia-industry collaboration. Furthermore, it is important to ensure that technologies are guided by relevant theoretical frameworks and are empirically tested. Lastly, in this paper we advocate an in-depth dialog between supporters of "cold" technology and "warm" humanity, so that teachers and students can better understand how technology, and specifically the big data explosion and AI revolution, can bring new opportunities (and challenges) that can be best leveraged for pedagogical practices and learning.
Purpose: To investigate the contributions of psychological needs (autonomy, competence, and relatedness) and coping strategies (self-compassion, leisure-time exercise, and achievement goals) to engagement and exhaustion in Canadian medical students.

Methods: This was an observational study. Two hundred undergraduate medical students participated in the study: 60.4% were female, 95.4% were 20–29 years old, and 23.0% were in year 1, 30.0% in year 2, 21.0% in year 3, and 26.0% in year 4. Students completed an online survey with measures of engagement and exhaustion from the Oldenburg Burnout Inventory–student version; autonomy, competence, and relatedness from the Basic Psychological Needs Scale; self-compassion from the Self-Compassion Scale–short form; leisure-time exercise from the Godin Leisure-Time Exercise Questionnaire; and mastery approach, mastery avoidance, performance approach, and performance avoidance goals from the Achievement Goals Instrument. Descriptive and inferential analyses were performed.

Results: The need for competence was the strongest predictor of both student engagement (β = 0.35, P = 0.000) and exhaustion (β = −0.33, P = 0.000). Students who endorsed mastery approach goals (β = 0.21, P = 0.005) and who were more self-compassionate (β = 0.13, P = 0.050) reported greater engagement with their medical studies. Students who were less self-compassionate (β = −0.32, P = 0.000), who exercised less (β = −0.12, P = 0.044), and who endorsed mastery avoidance goals (β = 0.22, P = 0.003) reported greater exhaustion from their studies. Students' gender (β = 0.18, P = 0.005) and year in medical school (β = −0.18, P = 0.004) were related to engagement, but not to exhaustion.

Conclusion: Supporting students' need for competence and raising students' awareness of self-compassion, leisure-time exercise, and mastery approach goals may help protect students from burnout-related exhaustion and enhance their engagement with their medical school studies.
Automated essay scoring systems yield scores that consistently agree with those of human raters at a level as high as, if not higher than, the level of agreement among human raters themselves. These systems offer medical educators many benefits for scoring constructed-response tasks, such as improving the consistency of scoring, reducing the time required for scoring and reporting, minimising the costs of scoring, and providing students with immediate feedback on constructed-response tasks.
OBJECTIVES: Computerised assessment raises formidable challenges because it requires large numbers of test items. Automatic item generation (AIG) can help address this test development problem because it yields large numbers of new items both quickly and efficiently. To date, however, the quality of the items produced using a generative approach has not been evaluated. The purpose of this study was to determine whether automatic processes yield items that meet standards of quality appropriate for medical testing. Quality was evaluated firstly by subjecting items created using both AIG and traditional processes to rating by a four-member expert medical panel using indicators of multiple-choice item quality, and secondly by asking the panellists to identify, in a blind review, which items were developed using AIG.

METHODS: Fifteen items from the domain of therapeutics were created in each of three experimental test development conditions. The first 15 items were created by content specialists using traditional test development methods (Group 1 Traditional). The second 15 items were created by the same content specialists using AIG methods (Group 1 AIG). The third 15 items were created by a new group of content specialists using traditional methods (Group 2 Traditional). These 45 items were then evaluated for quality by a four-member panel of medical experts and were subsequently categorised as either Traditional or AIG items.

RESULTS: Three outcomes were reported: (i) the items produced using traditional and AIG processes were comparable on seven of eight indicators of multiple-choice item quality; (ii) AIG items could be differentiated from Traditional items by the quality of their distractors; and (iii) the overall predictive accuracy of the four expert medical panellists was 42%.

CONCLUSIONS: Items generated by AIG methods are, for the most part, equivalent to traditionally developed items from the perspective of expert medical reviewers. While the AIG method produced comparatively fewer plausible distractors than the traditional method, medical experts cannot consistently distinguish AIG items from traditionally developed items in a blind review.
Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content‐specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer technology. The purpose of this module is to describe and illustrate a template‐based method for generating test items. We outline a three‐step approach where test development specialists first create an item model. An item model is like a mould or rendering that highlights the features in an assessment task that must be manipulated to produce new items. Next, the content used for item generation is identified and structured. Finally, features in the item model are systematically manipulated with computer‐based algorithms to generate new items. Using this template‐based approach, hundreds or even thousands of new items can be generated with a single item model.
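The final step above, in which features of the item model are manipulated by computer-based algorithms, often also involves filtering out combinations that make no clinical sense. The sketch below illustrates that idea under stated assumptions: the template, the drug and symptom lists, and the compatibility constraint are all hypothetical examples, not content from the module.

```python
# Sketch of template-based item generation with a compatibility
# constraint: only clinically sensible element pairings are kept.
from itertools import product

template = ("A patient on {drug} develops {symptom}. "
            "Which laboratory test should be ordered?")
drugs = ["warfarin", "metformin", "lithium"]
symptoms = ["unexplained bruising", "tremor"]

# Hypothetical constraint table: the pairings a content specialist
# has marked as plausible for item generation.
compatible = {("warfarin", "unexplained bruising"), ("lithium", "tremor")}

items = [
    template.format(drug=d, symptom=s)
    for d, s in product(drugs, symptoms)
    if (d, s) in compatible
]
for item in items:
    print(item)
```

Without the constraint, the Cartesian product would yield six stems; the filter keeps only the two pairings the (hypothetical) content specialist approved, which is how generative systems avoid producing implausible items at scale.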
With the recent interest in competency-based education, educators are being challenged to develop more assessment opportunities. As such, there is increased demand for exam content development, which can be a very labor-intensive process. An innovative solution to this challenge has been the use of automatic item generation (AIG) to develop multiple-choice questions (MCQs). In AIG, computer technology is used to generate test items from cognitive models (i.e. representations of the knowledge and skills that are required to solve a problem). The main advantage yielded by AIG is its efficiency in generating items. Although technology for AIG relies on a linear programming approach, the same principles can also be used to improve traditional committee-based processes used in the development of MCQs. Using this approach, content experts deconstruct their clinical reasoning process to develop a cognitive model which, in turn, is used to create MCQs. This approach is appealing because it: (1) is efficient; (2) has been shown to produce items with psychometric properties comparable to those generated using a traditional approach; and (3) can be used to assess higher order skills (i.e. application of knowledge). The purpose of this article is to provide a novel framework for the development of high-quality MCQs using cognitive models.