Short comparative interrupted time series (CITS) designs are increasingly being used in education research to assess the effectiveness of school-level interventions. These designs can be implemented relatively inexpensively, often drawing on publicly available data on aggregate school performance. However, the validity of this approach hinges on a variety of assumptions and design decisions that are not clearly outlined in the literature. This article aims to serve as a practice guide for applied researchers deciding how and whether to use this approach. We begin by providing an overview of the assumptions needed to estimate causal effects using school-level data, common threats to validity faced in practice, and the effects that can and cannot be estimated using school-level data. We then examine two analytic decisions researchers face when implementing the design: correctly modeling the pretreatment functional form (i.e., the preintervention trend) and selecting comparison cases. We then illustrate the use of this design in practice, drawing on data from the implementation of the School Improvement Grant (SIG) program in Ohio. We conclude with advice for applied researchers implementing this design.
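A CITS analysis of the kind the abstract describes can be sketched as a regression with group-specific preintervention trends, where the treatment effect is the treated group's postintervention deviation from the comparison group's counterfactual. The sketch below uses simulated data with illustrative variable names and effect sizes; it is not the Ohio SIG analysis, only a minimal example of the modeling decision (specifying the pretreatment trend) discussed above.

```python
# Minimal CITS sketch on simulated school-level data (illustrative only).
# Each school contributes 8 yearly scores; the intervention occurs at year 0,
# and treated schools receive a level shift of 0.3 SD afterward.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for school in range(40):
    treated = int(school < 20)
    for t in range(-4, 4):          # years centered at the intervention
        post = int(t >= 0)
        score = 0.5 * t + 0.3 * treated * post + rng.normal(0, 0.3)
        rows.append(dict(school=school, year=t, treated=treated,
                         post=post, score=score))
df = pd.DataFrame(rows)

# Full interaction model: linear trend, level shift, and trend break,
# each allowed to differ between treated and comparison schools.
model = smf.ols("score ~ year * post * treated", data=df).fit()

# The post:treated coefficient is the treated group's level shift relative
# to the comparison group's counterfactual trajectory.
print(model.params["post:treated"])
```

In practice, standard errors would be clustered at the school level and the linearity of the pretreatment trend checked against the data, as the article's discussion of functional form implies.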
A cutscore of −0.25 SD appears to offer the best balance of false negatives (5.8%) and false positives (12.5%), with an overall classification accuracy of 81.7%.
This study operationalizes four measures of instructional differentiation: one each for Grade 2 English language arts (ELA), Grade 2 mathematics, Grade 5 ELA, and Grade 5 mathematics. We evaluate the measurement properties of each measure in a large field experiment, the Indiana Diagnostic Assessment Tools Study, which included two consecutive cluster randomized trials (CRTs) of the effects of interim assessments on student achievement. Each log was designed to measure instructional practices as they were implemented for eight randomly selected students in the participating teachers' classrooms. A total of 592 teachers from 127 schools took part in this study, and logs were administered 16 times in each experiment. Item responses to the logs were scaled using the Rasch model, and reliability estimates for the differentiation measures were evaluated at the log level (observations within teachers), the teacher level, and the school level. Estimated reliability was above .70 for each of the log- and teacher-level measures; at the school level, reliability estimates were lower for Grade 5 ELA and mathematics. Between-teacher and between-school variance on the scaled differentiation measures was substantially smaller than within-teacher variance. These results provide preliminary evidence that teacher instructional logs may provide useful measures of instructional differentiation in elementary grades at multiple levels of aggregation.
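The teacher-level reliability evaluation the abstract describes rests on a standard variance-decomposition logic: estimate how much score variance lies between teachers versus within teachers across repeated logs, then adjust for the number of logs averaged per teacher. The sketch below illustrates that logic with simulated data and a one-way ANOVA decomposition plus the Spearman-Brown adjustment; all names and quantities are illustrative and not taken from the Indiana study.

```python
# Hypothetical sketch: teacher-level reliability of a log-based measure.
# 100 simulated teachers, 16 logs each (mirroring the 16 administrations).
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_teachers, k_logs = 100, 16
teacher_effect = rng.normal(0, 0.4, n_teachers)              # between-teacher SD
scores = teacher_effect[:, None] + rng.normal(0, 0.8, (n_teachers, k_logs))

df = pd.DataFrame(scores).stack().rename("score").reset_index()
df.columns = ["teacher", "log", "score"]

# One-way ANOVA mean squares
grand = df["score"].mean()
teacher_means = df.groupby("teacher")["score"].mean()
ms_between = k_logs * teacher_means.sub(grand).pow(2).sum() / (n_teachers - 1)
deviations = df["score"] - df.groupby("teacher")["score"].transform("mean")
ms_within = deviations.pow(2).sum() / (n_teachers * (k_logs - 1))

# ICC for a single log, then reliability of the 16-log teacher mean
icc_single = (ms_between - ms_within) / (ms_between + (k_logs - 1) * ms_within)
rel_teacher = k_logs * icc_single / (1 + (k_logs - 1) * icc_single)
print(round(icc_single, 2), round(rel_teacher, 2))
```

A fully analogous decomposition with a school random effect would yield the school-level reliabilities; with few teachers per school, those estimates are typically lower, consistent with the pattern the abstract reports for Grade 5.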