Almost everything related to the assessment and evaluation of teaching in the United States is undergoing restructuring. Purposes and uses, data sources, analytic methods, assessment contexts, and policy are all being developed, refined, and reconsidered within a cauldron of research, development, and policy activity. For example, the District of Columbia made headlines when it announced the firing of 241 teachers based, in part, on poor performance results from its new evaluation system, IMPACT (Turque, 2010). The Bill and Melinda Gates Foundation has funded the Measures of Effective Teaching (MET) study, a $45 million study designed to test the ways in which a range of measures, including scores on observation protocols, student engagement data, and value-added test scores, might be combined into a single teaching evaluation metric (Bill and Melinda Gates Foundation, 2011a). The Foundation is also spending $290 million in four communities in intensive partnerships to reform how teachers are recruited, developed, rewarded, and retained (Bill and Melinda Gates Foundation, 2011b). In addition to pressure from districts and private funders, unions have also pressed for revised standards of teacher evaluation (e.g., American Federation of Teachers [AFT], 2010). Perhaps the most consequential contemporary effort is the federally funded Race to the Top Fund, which encourages states seeking funding to implement teacher evaluation systems based on multiple measures, with a significant component based on students' academic growth (U.S. Department of Education, 2010). These and other recent research and policy developments are changing the way the assessment of teaching is understood.
The goal of this chapter is to provide an overview and structure to facilitate readers' understanding of the emerging landscape and attendant assessment issues.
As a number of recent reports describe well, current evaluation processes suffer from several problems (Toch & Rothman, 2008; Weisberg, Sexton, Mulhern, & Keeling, 2009). For example, The New Teacher Project surveyed evaluation practices in several districts, large and small, and found that teachers were almost all rated highly. In systems that used binary ratings (i.e., satisfactory or unsatisfactory), almost 99% of teachers were rated satisfactory. To complicate matters, the same administrators who gave all teachers high marks also recognized that staff members varied greatly in performance and that some were actually poor teachers. In addition to being unable to differentiate among teachers, current processes generally do not give teachers useful information to improve their practice, and policymakers do not regard the evaluation process as credible (Weisberg et al., 2009).
Measures of teaching should be seen from a validity perspective, and thus it is critical to begin with the purpose and use of the assessment. As Messick (1989) argued, validity is not an inherent
The authors thank Andrew Croft, Laura Goe, Heather Hill, Daniel McCaffrey, and Joan Snowden for their careful review of...