2017
DOI: 10.3389/feduc.2017.00024

Evaluating the Quality of Higher Education Instructor-Constructed Multiple-Choice Tests: Impact on Student Grades

Abstract: Multiple-choice questions (MCQs) are commonly used in higher education assessment tasks because they can be easily and accurately scored, while giving good coverage of instructional content in a short time. However, studies that have evaluated the quality of MCQs used in higher education assessments have found many flawed items, resulting in misleading insights about student performance and contaminating important decisions. Thus, MCQs need to be evaluated statistically to ensure high-quality items are used as…

Cited by 38 publications (27 citation statements)
References 45 publications (62 reference statements)
“…Classical Test Theory (CTT) proposes that multiple choice questions on any given examination should have a range of difficulty, with item difficulty determined by the proportion of test candidates answering the item correctly (called the P-value)[1] [17]. An exam question with a P-value between P > 0.2 and P < 0.8 is considered to be an acceptable test item, with values P < 0.2 being too difficult and > 0.8 being too easy [1]. Oermann and Gaberson [18] suggest the desired P-value for most tests should be in the 0.3 to 0.7 range.…”
Section: Discussion (mentioning)
confidence: 99%
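
The item-difficulty rule quoted above can be sketched in a few lines. This is a minimal illustration, not the method of any cited paper: the function names `item_difficulty` and `classify` and the sample `responses` matrix are hypothetical, and treating values of exactly 0.2 or 0.8 as acceptable is an assumption, since the quoted statement leaves the boundaries open.

```python
from typing import List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Return the P-value of each item.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    The P-value is the proportion of students who answered the item correctly.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[i] for row in responses) / n_students
        for i in range(n_items)
    ]


def classify(p: float, low: float = 0.2, high: float = 0.8) -> str:
    """Label an item using the 0.2 / 0.8 cut-offs from the quoted statement.

    Assumption: values exactly equal to a cut-off count as acceptable.
    """
    if p < low:
        return "too difficult"
    if p > high:
        return "too easy"
    return "acceptable"


# Hypothetical scored responses: 5 students (rows) by 3 items (columns).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
]

p_values = item_difficulty(responses)
labels = [classify(p) for p in p_values]
```

Passing `low=0.3, high=0.7` to `classify` applies the narrower range attributed to Oermann and Gaberson in the same statement.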
“…The use of multiple choice questions on examinations is a common method of assessment in the health disciplines [1][2] [3][4] [5]. This format of questioning is frequently used as it can effectively and efficiently assess large numbers of students, be administered in a relatively short time period [1] [2], cover a broad range of subject matter, and be easily and objectively scored [2] [6]. In nursing education, use of multiple choice questions on examinations is often used in combination with other methods of evaluating student performance [7] [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, assessment requires tools or instruments that are valid and able to measure students' abilities objectively [1], [2]. Among the various assessment tools that can be used, multiple-choice questions are the most widely used instrument for assessing students [3], [4]. This is because multiple-choice questions can cover a broad range of material within a relatively short assessment time.…”
Section: Introduction (unclassified)
“…Furthermore, Brown & Abdulnabi (2017) showed in their study that the use of IRT item analysis has a potentially beneficial impact on overall grades and on the number of students who pass. It also shows that feedback is very useful to teachers as a reference for constructing questions that are appropriate to the characteristics [4].…”
Section: Introduction (unclassified)
“…Various studies in different course offerings of numerous educational institutions have been conducted for evaluating the quality of multiple-choice questions (MCQs) in examinations (Ramah et al 2020). Statistical approaches that commonly utilize test item parameters such as item difficulty, item discrimination, and distractor efficiency are regarded as sample dependent or based on the ability of the test takers or students (Brown and Abdulnabi 2017). Despite the presence of various literature on writing quality MCQs (Carriveau 2016;Bowkett and Walker 2018), there has been a paucity of studies that focused on the quality of MCQs based on how they were composed by instructors.…”
Section: Introduction (mentioning)
confidence: 99%