2013
DOI: 10.1177/0265532213492969

Examining testlet effects in the TestDaF listening section: A testlet response theory modeling approach

Abstract: Testlets are subsets of test items that are based on the same stimulus and are administered together. Tests that contain testlets are in widespread use in language testing, but they also share a fundamental problem: Items within a testlet are locally dependent with possibly adverse consequences for test score interpretation and use. Building on testlet response theory (Wainer, Bradlow, & Wang, 2007), the listening section of the Test of German as a Foreign Language (TestDaF) was analyzed to determine whether, …

Cited by 24 publications (37 citation statements)
References 62 publications
“…In his study conducted with the PISA results, DeMars (2006) made ability predictions using a two-factor model, testlet model, independent item model and multiple categorized item model, and estimated the correlations between the ability predictions obtained from different models to be close to 1.00. Eckes (2014), Min and He (2014) and Bradlow et al (1999) also stated that the ability parameters were similar in their studies.…”
Section: Discussion (mentioning)
confidence: 56%
“…However, as a result of the use of testlets based on a common stimulant, the assumption of local independence is disrupted. Many studies in literature have shown that the disruption of local independence causes false predictions in the item and ability parameters obtained by standard IRT models (Ackerman, 1987; Chang & Wang, 2010; Eckes, 2014; Ip, 2000; Marais & Andrich, 2008; Monseur, Baye, Lafontaine, & Quittre, 2011; Reese, 1995; Wainer, 1995). In the present study, testlets were focused upon and data sets were analyzed with different IRT based models, with different sets of data from the items in the tenth booklet of PISA 2012, which measured mathematical literacy.…”
Section: Discussion (mentioning)
confidence: 99%
“…Writing assessment involves more than raters in the process. Factors as examinees and rating scales have also been scrutinized (Eckes, 2005, 2008, 2009, 2012, 2013a, 2013b; Goodwin, 2016; Saeidi et al, 2013; Weigle, 1998; Wolfe, 2004, among others). Schaefer (2008) thinks highly of MFRM because "it has shown great promise in the area of performance assessment and rating scale validation because it can analyze sources of variation in test scores beside item difficulty and person ability" (p. 466).…”
Section: Review Of Related Literature (mentioning)
confidence: 99%
“…A second area of research emphasis has been on the investigation of item and test quality with an eye towards assuring comparability in terms of difficulty, fairness for diverse examinee populations, and construct coverage (e.g., Arras, Müller-Karabil, & Zimmerman, 2013; Eckes, 2008a, 2013; Eckes & Grotjahn, 2006b). Findings from these studies point clearly to the effectiveness of careful item and test development procedures, on the one hand, but perhaps most interestingly they have highlighted the contribution of cutting-edge analytic approaches to quality assurance for operational assessments (e.g., testlet response theory models for determining local item dependencies; see Eckes, 2014). Research in a variety of other areas has also been conducted, and new directions in research are currently being explored by g.a.s.t., all pointing to substantial efforts at maintaining and enhancing the test's quality.…”
Section: Appraisal - Strengths (mentioning)
confidence: 99%