“…Yearly increases, or long‐term trends, in test scores may be less clear to interpret because they can result from true improvement in performance or scale/item drift. To tease out one effect from the other, it may be worthwhile to conduct a special equating design to examine scale drift specifically (e.g., Petersen et al., 1983; Puhan, 2008); methods for detecting item drift (Bock, Muraki, & Pfeiffenberger, 1988; Donoghue & Isham, 1998; Guo, Robin, & Dorans, 2017; Zhang & Li, 2016) may be considered as well. If there are different assessments with similar target populations, then true improvement in performance is likely observed in more than one of these assessments.…”
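As a rough illustration of separating a general score shift from item-level drift, the following minimal sketch screens common items across two administrations. It is not an implementation of any of the methods cited above; it is a simple robust-z heuristic, included only as an assumption for illustration. The function name `flag_drifting_items`, the input format (per-item proportions correct, all strictly between 0 and 1), and the cutoff of 2.7 are hypothetical choices. Centering on the median change removes any shift shared by all items (which could reflect true improvement or overall scale drift), so only items that move differently from the rest are flagged.

```python
import math
import statistics


def logit(p):
    """Log-odds of a proportion correct (requires 0 < p < 1)."""
    return math.log(p / (1.0 - p))


def flag_drifting_items(p_year1, p_year2, cutoff=2.7):
    """Return indices of items whose change in difficulty is an outlier.

    p_year1, p_year2: per-item proportions correct for the same items
    in two administrations (hypothetical input format).
    cutoff: robust-z threshold; 2.7 is an illustrative assumption.
    """
    # Change in each item's log-odds of a correct response.
    diffs = [logit(b) - logit(a) for a, b in zip(p_year1, p_year2)]

    # The median change captures the shift common to all items.
    center = statistics.median(diffs)

    # 0.74 * IQR approximates a standard deviation under normality.
    q1, _, q3 = statistics.quantiles(diffs, n=4)
    scale = 0.74 * (q3 - q1)
    if scale == 0:
        return []  # no spread in the changes, so nothing stands out

    return [
        i for i, d in enumerate(diffs)
        if abs((d - center) / scale) > cutoff
    ]
```

A screen of this kind cannot by itself distinguish true improvement from scale drift, because the common shift is deliberately removed; it only points to individual items that warrant closer review.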