The chapter introduces three typical introspective measures (think‐aloud reports, immediate recalls, and stimulated recalls) that have been used in language assessment (LA) research, mainly to elicit test takers' and test raters' thinking processes. First, it briefly outlines how and why introspection became a legitimate research target in the humanities in general, then in applied linguistics, and later in the LA field. Because of the combined influences of behaviorism and structuralism after World War II, the LA field long viewed the use of qualitative data such as introspective reports with some suspicion. However, under the influence of information processing perspectives and the sociocultural orientation in applied linguistics, and most importantly through the redefinition of “test validity” in language testing itself, some recent LA studies have begun using introspective measures to investigate how test products (e.g., scores) are actually brought about. This evolution can also be related to the epistemological changes that some researchers have advocated in the concept of assessment itself (e.g., dynamic assessment).
The three measures listed above are operationally defined, and some possible advantages and disadvantages of each when applied to LA research are discussed. Selected studies illustrate how these methods have been adopted in past research. The dos and don'ts that prospective researchers should bear in mind when planning to use these methods in their own studies are also presented. The chapter concludes with several possible directions for future studies that might benefit not only those involved in test development and administration but also those concerned with language education in general.