Using longitudinal data on a cohort of middle school students in a large school district, we estimate separate "value-added" teacher effects for two subscales of a mathematics assessment under a variety of statistical models that vary in form and in the degree of control for student background characteristics. We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to the variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers' performance based on value-added models can be sensitive to the ways in which student achievement is measured.
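For reference, value-added specifications of the kind compared here typically take a form along the following lines (a schematic sketch in our own notation; the exact covariates and functional form are assumptions, not the authors' specification):

\[ A^{(m)}_{ijt} = \lambda_m A^{(m)}_{ij,t-1} + \boldsymbol{\beta}_m^{\top}\mathbf{x}_{it} + \theta^{(m)}_{j} + \varepsilon^{(m)}_{ijt}, \]

where \(A^{(m)}_{ijt}\) is student \(i\)'s score on subscale \(m\) under teacher \(j\) in year \(t\), \(\mathbf{x}_{it}\) collects the student background controls that competing specifications include or omit, and \(\theta^{(m)}_{j}\) is the estimated teacher effect. Comparing the \(\theta^{(m)}_{j}\) across subscales \(m\) with the specification held fixed is what allows the within-teacher, across-measure variation to be set against the across-teacher variation.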
This study applies multi-level analysis to student reports of effective teacher–student interactions in 50 upper elementary school classrooms (N = 594 fourth- and fifth-grade students). Observational studies suggest that teacher–student interactions fall into three domains: Emotional Support, Classroom Organization, and Instructional Support. Results of multi-level confirmatory factor analyses indicated that a three-factor model fits between- and within-classroom variability in students' reports reasonably well. Multi-level regressions provide some evidence of criterion validity, with student reports at the classroom level related to parallel observations. Both classroom- and student-level student report data were associated with students' reading proficiency and disciplinary referrals. Findings are discussed in terms of implications for future research on student reports of classroom interactions and their practical utility in teacher evaluation and feedback systems.
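A minimal sketch of the two-level confirmatory factor model implied here (our notation; the parameterization is an assumption rather than the authors' exact model): each item response is decomposed into a classroom component and a student-level deviation, with the three domains loading at both levels,

\[ \mathbf{y}_{ic} = \boldsymbol{\nu} + \boldsymbol{\Lambda}_{B}\boldsymbol{\eta}^{B}_{c} + \boldsymbol{\epsilon}^{B}_{c} + \boldsymbol{\Lambda}_{W}\boldsymbol{\eta}^{W}_{ic} + \boldsymbol{\epsilon}^{W}_{ic}, \]

where \(\mathbf{y}_{ic}\) is the vector of report items for student \(i\) in classroom \(c\), and \(\boldsymbol{\eta}^{B}_{c}\) and \(\boldsymbol{\eta}^{W}_{ic}\) are the between- and within-classroom factors for Emotional Support, Classroom Organization, and Instructional Support. Model fit is then evaluated separately against the between-classroom and within-classroom covariance structures.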
The Educative Teacher Performance Assessment (edTPA) is a system of standardized portfolio assessments of teaching performance, mandated for use by educator preparation programs in 18 states and approved in 21 others, as part of initial certification for preservice teachers. Because of the high stakes involved for examinees, it is critical that the scores produced, and the decisions based on them, are meaningful and meet robust standards of validity and technical quality for educational measurement. We examine the technical documentation of the edTPA and raise serious concerns about its scoring design, the reliability of the assessments, and the consequential impact on decisions about edTPA candidates. In light of these findings, we argue that the proposed and actual uses of the edTPA are currently unwarranted on technical grounds.
This study examined the temporal associations of cigarette smoking with prosmoking social influences, academic performance, and delinquency in a cohort of 6,527 adolescents surveyed at ages 13, 16, 18, and 23 years. Prosmoking peer and family influences were risk factors for future smoking throughout adolescence, with family influences perhaps also operating indirectly through the adolescent's exposure to prosmoking peers. There were reciprocal associations of youth smoking with parental approval, peer smoking, and poor grades (but not delinquency), with youth smoking emerging as a stronger antecedent than consequence of these psychosocial factors. Few gender differences in these associations were observed. Implications of these findings for efforts to prevent youth smoking are discussed.
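Reciprocal associations of this kind are typically estimated with cross-lagged panel models; a schematic version (our notation, with controls and estimation details left unspecified) is

\[ \text{Smoke}_{i,t+1} = a_{1}\,\text{Smoke}_{it} + b_{1}\,R_{it} + e_{i,t+1}, \qquad R_{i,t+1} = a_{2}\,R_{it} + b_{2}\,\text{Smoke}_{it} + u_{i,t+1}, \]

where \(R_{it}\) stands for one of the psychosocial factors (parental approval, peer smoking, or poor grades). On this reading, youth smoking emerging as a stronger antecedent than consequence corresponds to the standardized \(b_{2}\) paths exceeding the corresponding \(b_{1}\) paths.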
With growing interest in the role of teachers as the key mediators between educational policies and outcomes, the importance of developing good measures of classroom processes has become increasingly apparent. Yet collecting reliable and valid information about a construct as complex as instruction poses important conceptual and technical challenges. This article summarizes the results of two studies that investigated the properties of measures of instruction based on a teacher-generated instrument (the Scoop Notebook) that combines features of portfolios and self-report. Classroom artifacts and teacher reflections were collected from samples of middle school science classrooms and rated along 10 dimensions of science instruction derived from the National Science Education Standards; ratings based on direct classroom observations were used as a comparison. The results suggest that instruments combining artifacts and self-reports hold promise for measuring science instruction, with reliability similar to, and sizeable correlations with, measures based on classroom observation. We discuss the implications and lessons learned from this work for the conceptualization, design, and use of artifact-based instruments for measuring instructional practice in different contexts and for different purposes. Artifact-based instruments may illuminate features of instruction not apparent even through direct classroom observation; moreover, the process of structured collection of and reflection on artifacts may have value for professional development. However, their potential value and applicability on a larger scale depend on careful consideration of the match between the instrument and the model of instruction, the intended uses of the measures, and the aspects of classroom practice most amenable to reliable scoring through artifacts. We outline a research agenda for addressing unresolved questions and advancing theoretical and practical knowledge around the measurement of instructional practice. © 2011 Wiley Periodicals, Inc. J Res Sci Teach 49: 2012.

Keywords: science education; measurement of instruction; generalizability theory

There is growing consensus among researchers and policymakers about the importance of accurate, valid, and efficient measures of instructional practice in science classrooms. Instruction directly or indirectly mediates the success of many school improvement efforts, and thus accurate descriptions of what teachers do in classrooms as they attempt to implement reforms are key for understanding "what works" in education and, equally importantly, "how." Many educational policies and programs rely on claims about the value of certain practices for improving student outcomes; for example, the No Child Left Behind legislation prompted schools to adopt scientifically based practices to improve the achievement of all students. Similarly, the reform teaching movement often recommends specific approaches to instruction designed to promote higher-level learning. More generally, the National Researc...
We report the results of a pilot validation study of the Quality Assessment in Science Notebook, a portfolio-like instrument for measuring teacher assessment practices in middle school science classrooms. A statewide sample of 42 teachers collected 2 notebooks during the school year, corresponding to science topics taught in the fall and spring. Each notebook was scored on 9 dimensions of assessment practice by 3 trained raters. Our analysis investigated the reliability and validity of notebook ratings, with particular emphasis on identifying key sources of error in the ratings. The results suggest that variation in teacher practice across notebooks (i.e., over time) was more important than idiosyncratic rater inconsistencies as a source of error in the scores. The validity results point to a dominant factor underlying the ratings and some predictive power of notebook ratings on student achievement. We discuss implications of the results for measuring assessment practice through artifacts, drawing conceptual and methodological lessons about our model of assessment practice, the consistency of raters, and the estimation of variance over time with classroom-based measures of instruction.
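In generalizability-theory terms (a sketch under our own labeling of the facets, not necessarily the authors' design), a study with teachers (t) crossed with notebooks (n) and raters (r) decomposes the observed score variance as

\[ \sigma^{2}(X_{tnr}) = \sigma^{2}_{t} + \sigma^{2}_{n} + \sigma^{2}_{r} + \sigma^{2}_{tn} + \sigma^{2}_{tr} + \sigma^{2}_{nr} + \sigma^{2}_{tnr,e}, \]

so the finding that variation across notebooks mattered more than rater inconsistency corresponds, roughly, to the teacher-by-notebook component \(\sigma^{2}_{tn}\) exceeding the rater-related components \(\sigma^{2}_{tr}\) and \(\sigma^{2}_{r}\). Under that reading, the generalizability of scores improves more by sampling additional notebooks (occasions) per teacher than by adding raters.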