Detection methods for item preknowledge are often evaluated in simulation studies where models are used to generate the data. To ensure the reliability of such methods, it is crucial that these models are able to accurately represent situations that are encountered in practice. The purpose of this article is to provide a critical analysis of common models that have been used to simulate preknowledge. Both response accuracy (RA) and response time (RT) models are considered. The justifications and supporting evidence for each model are evaluated using three real data sets, and the impact of the generating model on detection power is examined in two simulation studies.
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the $l_z$ and $l_z^*$ person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through detailed simulations, we show that the new statistics are more powerful than existing statistics in detecting several types of aberrant behavior, and that they are able to control the Type I error rate in instances where the model does not exactly fit the data. A real data example is also provided to demonstrate the utility of the new statistics in an operational setting.
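For readers unfamiliar with the baseline statistic being extended, the standardized log-likelihood person-fit statistic $l_z$ (Drasgow, Levine, & Williams, 1985) can be computed from dichotomous scores and model-implied response probabilities. The sketch below shows the classical $l_z$ only; it does not implement the distractor-based extension proposed in the article.

```python
import numpy as np

def lz_statistic(responses, p):
    """Standardized log-likelihood person-fit statistic l_z for
    dichotomous items.

    responses: 0/1 item scores for one examinee.
    p: model-implied probabilities of a correct response at the
       examinee's estimated ability (e.g., from a 2PL model).
    """
    responses = np.asarray(responses, dtype=float)
    p = np.asarray(p, dtype=float)
    q = 1.0 - p
    # Observed log-likelihood of the response pattern
    l0 = np.sum(responses * np.log(p) + (1 - responses) * np.log(q))
    # Expectation and variance of l0 under the fitted model
    e = np.sum(p * np.log(p) + q * np.log(q))
    v = np.sum(p * q * np.log(p / q) ** 2)
    return (l0 - e) / np.sqrt(v)
```

Large negative values of $l_z$ flag response patterns that are less likely than the model predicts; the article's extension applies the same standardization idea to distractor choices as well.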
To evaluate preknowledge detection methods, researchers often conduct simulation studies in which they use models to generate the data. In this article, we propose two new models to represent item preknowledge. Contrary to existing models, we allow the impact of preknowledge to vary across persons and items in order to better represent situations that are encountered in practice. We use three real data sets to evaluate the fit of the new models with respect to two types of preknowledge: items only, and items and the correct answer key. Results show that the two new models provide the best fit compared to several other existing preknowledge models. Furthermore, model parameter estimates were found to vary substantially depending on the type of preknowledge being considered, indicating that answer key disclosure has a profound impact on testing behavior.
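The key modeling idea above, letting the impact of preknowledge vary across persons and items, can be illustrated with a generic data-generating sketch. The distributions, parameter names, and the 2PL backbone here are assumptions chosen for illustration, not the article's actual proposed models.

```python
import numpy as np

def simulate_preknowledge(n_persons=500, n_items=40, n_compromised=10,
                          prop_ewp=0.2, seed=7):
    """Generate 2PL response data in which preknowledge boosts the
    latent propensity by an amount that varies by person and item
    (a generic illustration, not the article's models).
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(0, 1, n_persons)            # ability
    a = rng.lognormal(0, 0.3, n_items)             # discrimination
    b = rng.normal(0, 1, n_items)                  # difficulty
    compromised = np.zeros(n_items, dtype=bool)
    compromised[:n_compromised] = True             # compromised items (CI)
    ewp = rng.random(n_persons) < prop_ewp         # examinees w/ preknowledge
    # Person-by-item preknowledge effect on the latent scale,
    # nonzero only for EWP responding to compromised items
    delta = rng.gamma(2.0, 0.5, (n_persons, n_items))
    delta *= np.outer(ewp, compromised)
    logits = a * (theta[:, None] + delta - b)
    responses = rng.random((n_persons, n_items)) < 1 / (1 + np.exp(-logits))
    return responses.astype(int), ewp, compromised
```

Under this setup, examinees with preknowledge outperform others on compromised items but not on uncompromised ones, which is the signal that detection methods must pick up.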
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item scores and distractors to simultaneously detect CI and EWP. The false positive rate and true positive rate are evaluated for both items and examinees using detailed simulations. A real data example is also provided using data from an information technology certification exam.
Test speededness refers to a situation in which examinee performance is inadvertently affected by the time limit of the test. Because speededness has the potential to severely bias both person and item parameter estimates, it is crucial that speeded examinees are detected. In this article, we develop a change-point analysis (CPA) procedure for detecting test speededness. Our procedure distinguishes itself from existing CPA procedures by using information from both item scores and distractors. Using detailed simulations, we show that under most conditions, the new CPA procedure improves the detection of speeded examinees and produces more accurate change-point estimates. The item distractors therefore appear to carry a considerable amount of information, and, notably, they are available in all multiple-choice data. A real data example is also provided.
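The basic logic of a change-point analysis for speededness is to scan an examinee's ordered item-score sequence for the split point where performance shifts most sharply. The sketch below is a simple mean-shift scan over dichotomous scores only; the article's procedure additionally uses distractor information, which is not reproduced here.

```python
import numpy as np

def change_point(scores):
    """Scan an ordered 0/1 item-score sequence for the split k that
    maximizes the standardized difference between pre- and post-split
    mean performance (a generic CUSUM-style illustration).

    Returns (best_k, best_stat); a large statistic suggests a change
    point, e.g., the onset of speeded responding near the end.
    """
    x = np.asarray(scores, dtype=float)
    n = len(x)
    best_k, best_stat = None, -np.inf
    for k in range(1, n):                       # split into x[:k], x[k:]
        m1, m2 = x[:k].mean(), x[k:].mean()
        pooled = x.mean()
        se = np.sqrt(pooled * (1 - pooled) * (1 / k + 1 / (n - k)))
        if se == 0:                             # all-0 or all-1 sequence
            continue
        stat = abs(m1 - m2) / se
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat
```

In practice the scan statistic would be compared against a null distribution (e.g., obtained by simulation) to decide whether an examinee is flagged as speeded.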