Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This paper introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered categories alone. These methods hold two practical advantages over alternative achievement gap metrics. First, they require only categorical proficiency data, which are often available where means and standard deviations are not. Second, they result in gap estimates that are invariant to score scale transformations, providing a stronger basis for achievement gap comparisons over time and across jurisdictions. We find three candidate estimation methods that recover full-distribution gap estimates well when only censored data are available.

Researchers selecting an achievement gap metric face three issues. First, average-based gaps (effect sizes or simple differences in averages) are variable under plausible transformations of the test score scale (Ho, 2007; Reardon, 2008a; Seltzer, Frank, & Bryk, 1994; Spencer, 1983). Second, gaps based on percentages above a cut score, such as differences in "proficiency" or passing rates, vary substantially under alternative cut scores (Ho, 2008; Holland, 2002). Third, researchers often face a practical challenge: Although they may wish to use an average-based gap metric, the necessary data may be unavailable. This last situation has become common even as the reporting requirements of the No Child Left Behind Act (NCLB) have led to large amounts of easily accessible test score data. The emphasis of NCLB on measuring proficiency rates over average achievement has led states and districts to report "censored data": test score results in terms of categorical achievement levels, typically given labels like "below basic," "basic," "proficient," and "advanced."
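The cut-score sensitivity noted above can be illustrated with a small sketch. Assuming two hypothetical normal score distributions separated by a fixed half-standard-deviation gap (an illustration constructed for this point, not an analysis from the paper), the percent-above-cut gap shifts with the placement of the cut score even though the underlying gap is constant:

```python
from statistics import NormalDist

# Two hypothetical normal score distributions (illustrative values only):
# the groups differ by a fixed 0.5 standard-deviation gap.
group_a = NormalDist(mu=0.5, sigma=1.0)  # typically higher-scoring reference group
group_b = NormalDist(mu=0.0, sigma=1.0)  # typically lower-scoring focal group

def proficiency_rate_gap(cut):
    """Difference in percent-above-cut ("proficiency") rates at a given cut score."""
    rate_a = 1 - group_a.cdf(cut)
    rate_b = 1 - group_b.cdf(cut)
    return rate_a - rate_b

# The same underlying 0.5 SD gap yields different proficiency-rate gaps
# depending on where the cut score is placed.
for cut in (-0.5, 0.25, 1.5):
    print(f"cut = {cut:+.2f}: proficiency-rate gap = {proficiency_rate_gap(cut):.3f}")
```

A low cut (most students above it) and a high cut (few students above it) both compress the rate gap, while a cut near the middle of the distributions maximizes it, which is why cut-score-based gaps are hard to compare across tests and years.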
Traditional Achievement Gap Measures and Their Shortcomings

A test score gap is a statistic describing the difference between two distributions. Typically, the target of inference is the difference between central tendencies. Three "traditional" gap metrics dominate this practice of gap reporting. The first is the test score scale, where gaps are most often expressed as a difference in group averages. For a student test score, X, a typically higher scoring reference group, a, and a typically lower scoring focal group, b, the difference in averages, d_av, follows:

d_av = X̄_a − X̄_b

The second traditional metric expresses the gap in terms of standard deviation units. This metric allows for standardized interpretations when the test score scale is unfamiliar and affords aggregation and comparison across tests with differing score scales (Hedges & Olkin, 1985). Sometimes described as Cohen's d, this effect size expresses d_av in terms of a quadratic average of both groups' standard deviations, s_a and s_b:

d = (X̄_a − X̄_b) / sqrt((s_a² + s_b²) / 2)

Although a weighted average of variances or a single standard deviation could also be used in the denomina...
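The two metrics above can be sketched concretely in a few lines of Python. The sample scores here are made up for illustration, not drawn from the paper:

```python
import math
import statistics

# Hypothetical sample scores for the two groups (illustrative values only).
scores_a = [520, 540, 560, 580, 600]  # reference group a
scores_b = [480, 500, 510, 530, 550]  # focal group b

def average_gap(a, b):
    """d_av: the simple difference in group averages, on the score scale."""
    return statistics.mean(a) - statistics.mean(b)

def cohens_d(a, b):
    """Effect size: d_av divided by the quadratic average of the two
    groups' standard deviations, sqrt((s_a^2 + s_b^2) / 2)."""
    quadratic_sd = math.sqrt((statistics.stdev(a) ** 2 + statistics.stdev(b) ** 2) / 2)
    return average_gap(a, b) / quadratic_sd

print(f"d_av     = {average_gap(scores_a, scores_b):.1f} scale-score points")
print(f"Cohen's d = {cohens_d(scores_a, scores_b):.2f} SD units")
```

Unlike d_av, the standard-deviation-unit gap is comparable across tests with different score scales, though (as the surrounding text notes) both remain sensitive to transformations of the scale itself.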