Expected a posteriori (EAP) estimation of ability, based on numerical evaluation of the mean and variance of the posterior distribution, is shown to have unusually good properties for computerized adaptive testing. The calculations are not complex, proceed noniteratively by simple summation of log likelihoods as items are added, and require only values of the response function obtainable from precalculated tables at a limited number of quadrature points. Simulation studies are reported showing the near equivalence of the posterior standard deviation and the standard error of measurement. When the adaptive testing terminates at a fixed posterior standard deviation criterion of .90 or better, the regression of the EAP estimator on true ability is virtually linear with slope equal to the reliability, and the measurement error homogeneous, in the range ±2.5 standard deviations. With the increasing availability of inexpensive microcomputers, adaptive testing of cognitive abilities is fast becoming a practical reality. Many, perhaps most, applications of mental testing will soon benefit from the flexibility and efficiency of computerized adaptive testing. The requisite statistical theory, including realistic item response models (Samejima, 1981) and rigorous methods of item parameter estimation (Bock & Aitkin, 1981; Reiser, 1982; Thissen, 1982), is now available. Production
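The noniterative update described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-parameter logistic response function, the standard normal prior, the 41 equally spaced quadrature points on [-4, 4], and all names (`EAPEstimator`, `make_quadrature`, `logistic_2pl`) are assumptions made for illustration. Response-function values are computed on the fly here rather than read from precalculated tables.

```python
import math


def logistic_2pl(theta, a, b):
    """Two-parameter logistic response function (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def make_quadrature(n_points=41, lo=-4.0, hi=4.0):
    """Equally spaced nodes with normalized standard normal log weights."""
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + i * step for i in range(n_points)]
    dens = [math.exp(-0.5 * x * x) for x in nodes]
    total = sum(dens)
    log_prior = [math.log(d / total) for d in dens]
    return nodes, log_prior


class EAPEstimator:
    """Keeps a running log posterior at each quadrature node."""

    def __init__(self, nodes, log_prior):
        self.nodes = nodes
        self.log_post = list(log_prior)

    def update(self, a, b, correct):
        # Add this item's log likelihood at every node; no iteration needed.
        for k, x in enumerate(self.nodes):
            p = logistic_2pl(x, a, b)
            self.log_post[k] += math.log(p if correct else 1.0 - p)

    def estimate(self):
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(self.log_post)
        w = [math.exp(v - m) for v in self.log_post]
        total = sum(w)
        mean = sum(x * wk for x, wk in zip(self.nodes, w)) / total
        var = sum((x - mean) ** 2 * wk for x, wk in zip(self.nodes, w)) / total
        return mean, math.sqrt(var)  # EAP estimate, posterior SD


nodes, log_prior = make_quadrature()
est = EAPEstimator(nodes, log_prior)
prior_mean, prior_sd = est.estimate()
est.update(a=1.0, b=0.0, correct=True)
post_mean, post_sd = est.estimate()
```

In an adaptive test, `update` would be called once per administered item, and testing could stop when the posterior SD returned by `estimate` falls below a chosen criterion.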
The multiple‐matrix item sampling designs that provide information about population characteristics most efficiently administer too few responses to students to estimate their proficiencies individually. Marginal estimation procedures, which estimate population characteristics directly from item responses, must be employed to realize the benefits of such a sampling design. Numerical approximations of the appropriate marginal estimation procedures for a broad variety of analyses can be obtained by constructing, from the results of a comprehensive extensive marginal solution, files of plausible values of student proficiencies. This article develops the concepts behind plausible values in a simplified setting, sketches their use in the National Assessment of Educational Progress (NAEP), and illustrates the approach with data from the Scholastic Aptitude Test (SAT).
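The idea of plausible values summarized above can be sketched in a deliberately simplified setting. This is not the NAEP procedure: the sketch assumes known two-parameter logistic item parameters, a common standard normal prior with no conditioning on background variables, and a discrete grid of proficiency values. The item parameters, the response data, and all function names are hypothetical.

```python
import math
import random


def p_correct(theta, a, b):
    """Two-parameter logistic response function (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def posterior_weights(responses, items, nodes):
    """Unnormalized posterior over grid nodes for one student."""
    log_post = [-0.5 * x * x for x in nodes]  # standard normal prior (assumption)
    for (a, b), u in zip(items, responses):
        for k, x in enumerate(nodes):
            p = p_correct(x, a, b)
            log_post[k] += math.log(p if u else 1.0 - p)
    m = max(log_post)
    return [math.exp(v - m) for v in log_post]


def draw_plausible_values(responses, items, nodes, n_draws, rng):
    """Random draws from the student's posterior, not a point estimate."""
    weights = posterior_weights(responses, items, nodes)
    return rng.choices(nodes, weights=weights, k=n_draws)


def population_mean(all_draws):
    """Compute the mean within each set of draws, then average over sets."""
    n_draws = len(all_draws[0])
    per_set = [sum(s[m] for s in all_draws) / len(all_draws)
               for m in range(n_draws)]
    return sum(per_set) / n_draws


nodes = [-4.0 + 0.1 * i for i in range(81)]
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5)]         # hypothetical (a, b) pairs
data = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0]]   # hypothetical responses
rng = random.Random(0)
draws = [draw_plausible_values(r, items, nodes, 5, rng) for r in data]
estimate = population_mean(draws)
```

With only three responses per student, each posterior is wide, so any single draw is a poor individual score. The draws are meant to be analyzed as a set to estimate population characteristics.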
Evidence‐centered assessment design (ECD) provides language, concepts, and knowledge representations for designing and delivering educational assessments, all organized around the evidentiary argument an assessment is meant to embody. This article describes ECD in terms of layers for analyzing domains, laying out arguments, creating schemas for operational elements such as tasks and measurement models, implementing the assessment, and carrying out the operational processes. We argue that this framework helps designers take advantage of developments from measurement, technology, cognitive psychology, and learning in the domains. Examples of ECD tools and applications are drawn from the Principled Assessment Design for Inquiry (PADI) project. Attention is given to implications for large‐scale tests such as state accountability measures, with a special eye for computer‐based simulation tasks.
This article describes a Bayesian framework for estimation in item response models, with two‐stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one‐, two‐, and three‐parameter logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data.