2000
DOI: 10.1111/j.1745-3984.2000.tb01078.x

A Comparison of Methods of Estimating Conditional Standard Errors of Measurement for Testlet‐Based Test Scores Using Simulation Techniques

Abstract: The primary purpose of this study was to investigate the appropriateness and implications of incorporating a testlet definition into the procedures for estimating the conditional standard error of measurement (SEM) for tests composed of testlets. Another purpose was to investigate the bias in estimates of the conditional SEM when item‐based methods are used instead of testlet‐based methods. Several item‐based and testlet‐based estimation methods were proposed and compared. In general, item‐based estimation metho…

Cited by 14 publications (14 citation statements)
References 29 publications
“…If the average within-testlet correlations are higher than the between-testlet correlations, reliability estimates derived from dichotomous scoring of the items will be positively biased. Lee and Frisbie (1999) also computed average within- and between-testlet correlations in their generalizability theory approach to assessing the reliability of tests composed of testlets. When testlet scoring was used on the sets of items in their research, the difference between the computed passage reliability and the generalizability coefficient was small, supporting the position that testlet scoring was the appropriate level of scoring to use, as compared to dichotomous item scoring.…”
Section: LID Assessment Methods
mentioning
confidence: 99%
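
The comparison of average within-testlet and between-testlet correlations described in the statement above can be illustrated with a minimal sketch. It is not the procedure of Lee and Frisbie (1999) or of the cited study. The function names (`pearson`, `mean_within_between`), the testlet labels, and the toy response matrix are all hypothetical, and the data are deliberately idealized so that items within a testlet agree perfectly while the two testlets are unrelated.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_within_between(responses, testlet_of):
    """Average inter-item correlation within and between testlets.

    responses  -- one row per examinee, one 0/1 score per item
    testlet_of -- testlet label for each item (same order as the columns)
    """
    columns = list(zip(*responses))  # transpose: one tuple per item
    within, between = [], []
    for i, j in combinations(range(len(columns)), 2):
        r = pearson(columns[i], columns[j])
        if testlet_of[i] == testlet_of[j]:
            within.append(r)
        else:
            between.append(r)
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical toy data: items 0-1 form testlet "A", items 2-3 form testlet "B".
responses = [
    (1, 1, 1, 1),
    (1, 1, 0, 0),
    (0, 0, 1, 1),
    (0, 0, 0, 0),
]
testlet_of = ["A", "A", "B", "B"]

within, between = mean_within_between(responses, testlet_of)
print(f"mean within-testlet r  = {within:.2f}")
print(f"mean between-testlet r = {between:.2f}")
```

A within-testlet average that clearly exceeds the between-testlet average is the pattern the statement associates with positively biased item-level reliability estimates. Real data would show a much smaller gap than this toy matrix.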
“…This is an especially serious problem in computerized adaptive testing (CAT), where the standard error of the estimate (SEE) is often used as the termination criterion. Because the SEE is the reciprocal of the test information, overestimating test information will result in premature termination of the test (Fennessy, 1995; Lee, 2000). Ferrara, Huynh, and Baghi (1997), Ferrara, Huynh, and Michaels (1999), and Yen (1993) investigated several potential causes of LID on performance assessments and found similar problems with respect to reliability estimation.…”
mentioning
confidence: 99%
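
The link between overestimated information and early stopping in the statement above can be made concrete with a small sketch. It assumes a two-parameter logistic (2PL) model and the usual IRT convention SEE(θ) = 1/√I(θ), that is, the reciprocal of the square root of the test information. The function names, the item parameters, the stopping threshold, and the `inflation` factor that stands in for information overstated because of local item dependence are all hypothetical and are not taken from the cited papers.

```python
from math import exp, sqrt


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def items_until_stop(theta, items, target_see, inflation=1.0):
    """Count items administered before the reported SEE reaches target_see.

    inflation > 1 mimics information that is overestimated (hypothetical
    stand-in for ignoring local item dependence). Returns None if the
    target is never reached with the available items.
    """
    info = 0.0
    for n, (a, b) in enumerate(items, start=1):
        info += inflation * item_information(theta, a, b)
        if 1.0 / sqrt(info) <= target_see:  # SEE = 1 / sqrt(information)
            return n
    return None


# Hypothetical pool: 30 identical items with a = 1.0, b = 0.0.
pool = [(1.0, 0.0)] * 30

honest = items_until_stop(0.0, pool, target_see=0.5)
inflated = items_until_stop(0.0, pool, target_see=0.5, inflation=2.0)
print("items administered with accurate information:", honest)
print("items administered with doubled information: ", inflated)
```

With these values each item contributes 0.25 units of information at θ = 0, so the accurate run stops after 16 items, while the run with doubled information stops after 8, although the true precision at that point is lower than the reported SEE suggests.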
“…Furthermore, the interdependencies in items will result in improper estimates of important test score characteristics such as reliability, test information, or conditional standard error of measurement (Ferrara, Huynh, & Michaels, 1999;Lee, 2000;Sireci, Thissen, & Wainer, 1991;Thissen, Steinberg, & Mooney, 1989;Wainer, 1995;Yen, 1993). These problems are due to local item dependence (LID).…”
mentioning
confidence: 95%
“…(Lord & Novick, 1968) … θ … Lee, 2000; Sireci, Thissen, & Wainer, 1991; … 2010, 2012 … Bradlow, Wainer, & Wang (1999) … Chen & Wang (2007) … Bayesian testlet model (BTM) (Wainer, Bradlow, & Wang, 2007) … constant interaction model (Hoskens & De Boeck, 1997) … θ j … P j (θ) … 2PLM (Braeken, 2011; Braeken, Tuerlinckx, & De Boeck, 2007; Ip, 2010; Ip, Smith, & De Boeck, 2009) …”
Section: Item Response Theory (IRT) Local Independence
mentioning
confidence: 99%