Recent simulation studies indicate that examinees can sometimes use judgments of relative item difficulty to obtain positively biased proficiency estimates on computerized adaptive tests (CATs) that permit item review and answer change. Our purpose in the study reported here was to evaluate examinees' success in using these strategies while taking CATs in a live testing setting. We taught examinees two item difficulty judgment strategies designed to increase proficiency estimates. Examinees taught each strategy and examinees taught neither strategy were assigned at random to complete vocabulary CATs under conditions in which review was allowed either after completing all items or only within successive blocks of items. We found that proficiency estimate changes following review were significantly higher in the regular review conditions than in the strategy conditions. The failure to obtain systematically higher scores in the strategy conditions was due in large part to errors examinees made in judging the relative difficulty of CAT items.
Recent studies have shown that restricting review and answer change opportunities on computerized adaptive tests (CATs) to items within successive blocks reduces time spent in review, satisfies most examinees' desires for review, and controls against distortion in proficiency estimates resulting from intentional incorrect answering of items prior to review. However, restricting review opportunities on CATs may not prevent examinees from artificially raising proficiency estimates by using judgments of item difficulty to signal when to change previous answers. We evaluated six strategies for using item difficulty judgments to change answers on CATs and compared the results to those from examinees reviewing and changing answers in the usual manner. The strategy conditions varied in terms of when examinees were prompted to consider changing answers and in the information provided about the consistency of the item selection algorithm. We found that examinees fared best on average when they reviewed and changed answers in the usual manner. The best gaming strategy was one in which the examinees knew something about the consistency of the item selection algorithm and were prompted to change responses only when they were unsure about answer correctness and sure about their item difficulty judgments. However, even this strategy did not produce a mean gain in proficiency estimates.
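The gaming idea both abstracts describe rests on a simple inference: under adaptive item selection, a follow-up item that seems easier suggests the previous answer was scored wrong, so an examinee might change that answer on review. The sketch below is a hypothetical illustration of that logic, not the authors' simulation; the Rasch response model, the grid-search proficiency estimate, the judgment noise level `judgment_sd`, and the rule that changing an answer amounts to a fresh attempt at the item are all illustrative assumptions.

```python
# Hypothetical sketch of difficulty-judgment gaming on a CAT with item review.
import numpy as np

rng = np.random.default_rng(0)

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def mle_theta(responses, bs, grid=np.linspace(-4, 4, 161)):
    """Crude grid-search proficiency estimate from responses and item difficulties."""
    p = rasch_p(grid[:, None], np.asarray(bs)[None, :])
    ll = np.where(np.asarray(responses)[None, :], np.log(p), np.log(1.0 - p)).sum(axis=1)
    return float(grid[np.argmax(ll)])

def simulate(theta=0.0, n_items=20, judgment_sd=1.0, use_strategy=True):
    bank = rng.normal(0.0, 1.0, 500)        # item difficulties in the pool
    used, bs, resp, est = set(), [], [], 0.0
    for _ in range(n_items):
        # pick the unused item whose difficulty is closest to the current estimate
        free = np.array([i for i in range(len(bank)) if i not in used])
        idx = int(free[np.argmin(np.abs(bank[free] - est))])
        used.add(idx)
        bs.append(bank[idx])
        resp.append(bool(rng.random() < rasch_p(theta, bank[idx])))
        est = mle_theta(resp, bs)
    if use_strategy:
        # review pass: if the following item is *judged* easier (judgments are noisy),
        # infer the answer was wrong and change it (modeled here as a fresh attempt)
        for i in range(n_items - 1):
            judged_this = bs[i] + rng.normal(0.0, judgment_sd)
            judged_next = bs[i + 1] + rng.normal(0.0, judgment_sd)
            if judged_next < judged_this:
                resp[i] = bool(rng.random() < rasch_p(theta, bs[i]))
    return mle_theta(resp, bs)

# Average difference in proficiency estimates attributable to the strategy; with noisy
# difficulty judgments the mean gain tends toward zero or below.
diffs = [simulate(use_strategy=True) - simulate(use_strategy=False) for _ in range(100)]
print("mean difference in proficiency estimates:", round(float(np.mean(diffs)), 3))
```

Even this generous re-attempt assumption tends to yield little or no average gain once judgment error is introduced, which is consistent with the pattern of findings summarized above; with real multiple-choice items, changed answers would pay off even less often.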
Although interchangeability of results across computer and paper modes of administration is commonly assumed, recent meta-analyses and individual studies continue to reveal mean differences in scores for measures of socially desirable responding (SDR). Results from these studies have also failed to include new methods of scoring and crucial aspects of scaling, reliability, validity, and administration emphasized in professional standards for assessment that are essential in establishing equivalence. We addressed these shortcomings in a comprehensive, repeated measures investigation for 6 ways of scoring the Balanced Inventory of Desirable Responding (BIDR), one of the most frequently administered companion measures of SDR in research and practice. Results for many previously unexamined, standards-driven aspects of scaling, reliability, and validity strongly supported the interchangeability of scores across modes of administration. Computer questionnaires also took considerably less time to complete and were overwhelmingly favored by respondents in relation to physical characteristics of the measures, appraisals of the assessment experience, and perceived quality of information obtained. Collectively, these results highlight the importance of following professional standards when constructing and administering computerized assessments and the evolution of computer technology in providing viable, effective, and accepted platforms for administering and scoring the BIDR in numerous ways.