2005
DOI: 10.1080/13803610500110521
Item Review and the Rearrangement Procedure: Its process and its results

Cited by 5 publications (5 citation statements)
References 15 publications
“…(1) We only embedded salt items in fixed item locations, although we expect the salt items’ positions to be more dynamic in practice so that examinees are less likely to identify salt items; (2) we investigated a test length of 30 items although longer tests exist in practice, and we expect the performance of CATS would be better in longer tests because it would be harder to inflate scores; (3) we randomly selected salt items from the mini form, although in practice there can be considerations of the order of presenting test items (e.g., the sequence of content strands, the order of item difficulties, etc.); (4) the test length was fixed in this study, although many adaptive tests are administered with stopping rules that may result in variable test lengths—the inclusion of salt items will likely increase the test length while maintaining the same level of accuracy as CAT; and (5) although only CATS provides review opportunities with no restriction, a comparison of CATS with restricted CAT (e.g., Han, ; Papanastasiou, ; Stocking, ) may shed some light on providing review opportunities to examinees taking CAT. Because multistage testing (MST) allows review within each module (i.e., a group of items in one stage), a comparison between CATS and MST may be interesting too.…”
Section: Discussion
confidence: 99%
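Point (4) in the statement above concerns stopping rules that produce variable test lengths. A minimal sketch of a standard-error-based stopping rule is given below; the function name, threshold, and maximum length are hypothetical, chosen only to illustrate the mechanism.

```python
import math

def should_stop(item_infos, se_threshold=0.3, max_items=45):
    """Hypothetical variable-length CAT stopping rule.

    Stop once the standard error of the ability estimate drops below a
    threshold, or when a maximum test length is reached.

    item_infos: Fisher information contributed by each administered item,
    evaluated at the current ability estimate. SE = 1 / sqrt(total info).
    """
    if len(item_infos) >= max_items:
        return True  # hard cap on test length
    total_info = sum(item_infos)
    if total_info <= 0:
        return False  # no information yet; keep administering items
    se = 1.0 / math.sqrt(total_info)
    return se <= se_threshold
```

With 20 items each contributing 0.5 units of information, SE ≈ 0.316 and the test continues; at 25 such items, SE ≈ 0.283 and the rule stops the test.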
“…Because examinees feel that they have little control over testing environments where no review is allowed, they tend to have elevated test anxiety levels, which can increase the error in the examinee's ability estimation on adaptive tests (Stocking, ; Wise, ). Research has shown that, if answer changes were allowed on a test, the final estimate of an examinee's ability could be more accurate because of a reduced anxiety level and the opportunity to fix mistakes (Olea, Revuelta, Ximénez, & Abad, ; Papanastasiou, ; Wise, ). Benjamin, Cavell, and Shallenberger () reviewed 33 studies on the effects of answer changing on test performance and found that (1) 80% of examinees changed at least one answer and (2) 68% of examinees experienced a score increase due to answer changing.…”
confidence: 99%
“…Item revision in CATs is not possible [7,71,72] because item selection in CATs is based on the responses already given. Hence, changing responses retrospectively may impact measurement precision, resulting in larger standard errors [69,73–78]. Therefore, allowing item revision within CATs has been controversially discussed in the literature, even though some contributions have addressed this measurement problem (see, e.g., [66,77,79–82]).…”
Section: Test Anxiety
confidence: 99%
“…For instance, if the maximum Fisher information method were employed, the item information for the now-nonoptimal item would decrease when evaluated at the new interim ability estimate. Loss of information would lead to a decrease in estimation precision, as the standard error of the ability estimate is inversely related to the square root of test information (Papanastasiou, 2005). Furthermore, estimation precision is related to the frequency of item review.…”
Section: Consequences of Allowing Item Review in CAT Program
confidence: 99%
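The relationship invoked in the statement above — information loss for a no-longer-optimal item, and the inverse link between test information and the standard error — can be made concrete with a two-parameter logistic (2PL) item response model. This is a sketch; the item parameters below are invented for illustration and are not from the cited study.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response at ability theta
    (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_info_2pl(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def standard_error(theta, items):
    """SE(theta) = 1 / sqrt(test information): more information,
    smaller standard error, higher estimation precision."""
    total = sum(item_info_2pl(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)
```

An item with a = 1.2 and b = 0.0 contributes 0.36 units of information at θ = 0.0 but only about 0.26 at θ = 1.0: an answer change that shifts the interim estimate away from the point at which an item was selected reduces the information that item contributes, and the standard error grows accordingly.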