In the past decade, a number of methodologies have been proposed for observation in the classroom. Generally, this research has focused on the use of one instrument and has rarely reported results from validation investigations. The current study, however, employed two direct observation instruments concurrently within two reading programs (Breaking the Code and an Eclectic Program) in middle school resource rooms. A momentary time sample of task engagement and an event record of discrete student responses were employed in six classrooms representing the two programs. A multimethod validation process was implemented, focusing on treatment and criterion-related validities. Many findings were program-specific, with differences lost or diluted when data were combined across programs. An argument is presented for structuring observation in a manner sensitive to classroom activity structures.

The purpose of this article is to describe the validity of two instruments used to measure student classroom behaviors within middle school (grades 6-8) learning disabilities classrooms: a 15-second momentary time sample (MTS) of task engagement and a 2-minute event record (ER) of discrete academic behaviors. The two instruments employed different time sampling techniques and different metrics, namely "engagement rate" (percentage of class time) and "response rate" (occurrences per minute). The results from these instruments were used to differentiate between two remedial language arts programs that differed widely in curriculum and teaching procedures. In one program, Breaking the Code (Lebo, Hughs, Thomas, & Gurren, 1975), teachers employed scripts, a very controlled sequence of skills and activities, and an integrated blend of reading, spelling, and writing words and letter combinations.
In the Eclectic Program, an amalgamation of materials was employed, with no explicit sequence of skills and activities, and with an emphasis on oral and silent reading of text rather than writing and spelling. Three general research questions addressed the validity of the two observational measures: (a) their sensitivity to instructional differences between the two programs, (b) their relationship with each other, and (c) their relationship with nonobservation criteria of student reading achievement and workbook scores.

As Hoge (1985) noted in a review of the validity of direct observation instruments, several methods for validating new instruments are available to researchers, although they are not often attempted. He defined three types of validity: treat