To compete in a global economy, students will need resources and curricula that focus on critical thinking and reasoning in science. Despite awareness of the need for complex reasoning, American students perform poorly relative to their peers on international standardized tests measuring complex thinking in science. Research on learning progressions is one effort to provide more coherent science curricular sequences and assessments focused on complex thinking about focal science topics. This paper describes an empirically driven, five-step process for developing a three-year learning progression focused on complex thinking about biodiversity. Our efforts yielded empirical results and work products including: (1) a revised definition of learning progressions; (2) empirically driven, three-year progressions for complex thinking about biodiversity; (3) an application of statistical approaches to the analysis of learning progression products; (4) Hierarchical Linear Modeling results demonstrating significant student achievement in complex thinking about biodiversity; and (5) Growth Model results demonstrating strengths and weaknesses of the first version of our curricular units. The empirical studies present information to inform both curriculum and assessment development. For curriculum development, we discuss the role of learning progressions as templates for developing organized sequences of curricular units focused on complex science. For assessment development, learning progression-guided assessments provide a greater range and amount of information, and discriminate more reliably between students of differing abilities, than a contrasting standardized assessment measure that was also focused on biodiversity content.
Despite recent shifts in research emphasizing the value of carefully designed experiments, studies of teacher professional development with rigorous designs have lagged behind their student-outcome counterparts. We outline a framework for the design of group randomized trials (GRTs) with teachers' knowledge as the outcome and consider mathematics and reading knowledge outcomes designed to assess the types of content problems that teachers encounter in practice. To estimate design parameters, we draw on a national sample of teachers for mathematics and a state Reading First sample for reading. Our results suggest that there is substantial clustering of teachers' knowledge within schools and that professional development GRTs will likely need increased sample sizes to account for this clustering.
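The clustering and sample-size logic described above can be sketched in code. This is a minimal illustration, not the authors' actual estimation procedure: it assumes a simple one-way random-effects ANOVA estimator of the intraclass correlation (ICC) for teachers' knowledge scores clustered within schools, and the standard design effect used to inflate GRT sample sizes. The function names and any data passed to them are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC from a list of per-school score lists."""
    k = len(groups)                       # number of schools
    ns = [len(g) for g in groups]
    N = sum(ns)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (N - k)
    n_bar = N / k                         # simple average cluster size
    var_between = max((ms_between - ms_within) / n_bar, 0.0)
    return var_between / (var_between + ms_within)

def design_effect(icc, m):
    """Variance inflation for clusters of size m: 1 + (m - 1) * ICC."""
    return 1 + (m - 1) * icc
```

With a nonzero ICC, the required sample size for a trial that ignored clustering is multiplied by the design effect; for example, an ICC of 0.2 with 5 teachers per school inflates the required sample by a factor of 1.8.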
This study investigated the relationship of teachers' reading knowledge with students' reading achievement using a direct teacher knowledge assessment rather than indirect proxies (e.g., certification). To address the inequitable distribution of teachers' knowledge resulting from differences in teachers' backgrounds and the disparities in how schools attract and cultivate knowledge, the study developed multilevel propensity score methods to identify comparable teachers on the basis of both teacher and school backgrounds. Results suggest that schools are complexly associated with differences in teachers' knowledge and that comparisons that ignore the relevance of schools may be misleading. By comparing teachers with similar personal and school backgrounds, results show that measured knowledge is significantly associated with students' achievement in reading comprehension but not word analysis. The findings support policies that leverage school capacities to develop the specialized knowledge needed for teaching reading.
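The core of propensity score matching can be sketched as follows. This is a deliberately simplified, single-level illustration, not the study's multilevel method: it assumes propensity scores have already been estimated, and performs greedy 1:1 nearest-neighbor matching on the logit of the score within a caliper. The function name, the caliper value, and the input format are all hypothetical.

```python
import math

def match_nearest(treated, controls, caliper=0.2):
    """Greedy 1:1 nearest-neighbor matching on the logit of the propensity
    score. treated/controls: lists of (id, propensity) pairs; caliper is in
    logit units. Returns a list of (treated_id, control_id) pairs."""
    logit = lambda p: math.log(p / (1 - p))
    pool = {cid: logit(p) for cid, p in controls}
    pairs = []
    # Match treated units from highest to lowest propensity, a common
    # heuristic since high-propensity units have the fewest close controls.
    for tid, p in sorted(treated, key=lambda t: t[1], reverse=True):
        if not pool:
            break
        lt = logit(p)
        cid, lc = min(pool.items(), key=lambda kv: abs(kv[1] - lt))
        if abs(lc - lt) <= caliper:
            pairs.append((tid, cid))
            del pool[cid]       # matching without replacement
    return pairs
```

A multilevel extension, as the study's design implies, would additionally condition the score model on school-level covariates or restrict matches to teachers in comparable schools.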
The purpose of this study was to examine third-grade teachers' support for students' vocabulary learning in high-poverty schools characterized by underachievement in reading. We examined the prevalence and nature of discourse actions teachers used to support vocabulary learning in different literacy lessons (e.g., phonics); these actions varied in the cognitive demands placed on the students. Results showed that teachers rarely engaged students in cognitively challenging work on word meanings. Various lesson features and student and teacher characteristics were associated with teachers' support for students' vocabulary learning (e.g., teachers' knowledge about reading). A major finding was that the extent of teachers' support of their students' vocabulary learning was significantly related to gains in reading comprehension across the year.
Purpose
To examine the reliability and attributable facets of variance within an entrustment-derived workplace-based assessment system.
Method
Faculty at the University of Cincinnati Medical Center internal medicine residency program (a 3-year program) assessed residents using discrete workplace-based skills called observable practice activities (OPAs) rated on an entrustment scale. Ratings from July 2012 to December 2016 were analyzed using generalizability theory (G-theory) and a decision study framework. Given a limitation of G-theory applications with entrustment ratings (namely, the assumption that mean ratings are stable over time), a series of time-specific G-theory analyses and an overall longitudinal G-theory analysis were conducted to detail the reliability of ratings and the sources of variance.
Results
During the study period, 166,686 OPA entrustment ratings were given by 395 faculty members to 253 different residents. Raters were the largest identified source of variance in both the time-specific and overall longitudinal G-theory analyses (37% and 23%, respectively). Residents were the second largest identified source of variation in the time-specific G-theory analyses (19%). Reliability was approximately 0.40 for a typical month of assessment (27 different OPAs, 2 raters, and 1–2 rotations) and 0.63 for the full sequence of ratings over 36 months. A decision study showed doubling the number of raters and assessments each month could improve the reliability over 36 months to 0.76.
Conclusions
Ratings from the full 36 months of the examined program of assessment showed fair reliability. Increasing the number of raters and assessments per month could improve reliability, highlighting the need for multiple observations by multiple faculty raters.
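The decision study logic above can be sketched in code. This is an illustrative simplification, not the authors' actual G-theory model: it assumes a relative G coefficient computed as person (resident) variance over person variance plus error variances, with each error component averaged over the number of conditions sampled (raters, assessments, etc.). The function name and the variance components in the usage example are hypothetical.

```python
def g_coefficient(var_person, var_error_components):
    """Relative G coefficient: person variance divided by person variance
    plus averaged error variance.
    var_error_components: list of (variance, n_conditions) pairs; each error
    variance is divided by the number of sampled conditions it averages over
    (e.g., raters, assessments)."""
    err = sum(v / n for v, n in var_error_components)
    return var_person / (var_person + err)
```

For example, with equal person and rater-error variance and 2 raters, the coefficient is about 0.67; doubling to 4 raters raises it to 0.80, mirroring the decision-study finding that sampling more raters and assessments improves reliability.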