“…Subsequently, by fitting an IRT graded response model to a calibration sample, we allowed the relevant MFQ and harmonized PHQ-2 items to relate differently to the underlying construct of depression, including DIF by study on one item. This is a substantial improvement over other scoring methods, such as proportion, sum, or z-score (Curran et al.; Gorter et al.; Griffith et al.; Gross et al.). Next, we demonstrated that IRT longitudinal equating, which incorporates known item parameters to harmonize and score longitudinal and cross-sectional ordinal data, can be implemented simultaneously with latent growth models, such as a piecewise linear model.…”
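To illustrate why a graded response model lets items relate differently to the latent trait, where a sum score weights every item equally, the following is a minimal sketch of the standard (Samejima) graded response model category probabilities. It is not the authors' implementation: the function names, the discrimination value, and the study-specific thresholds are all hypothetical and chosen only for illustration.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Graded response model: P(X = k | theta) for k = 0..K.

    theta      -- latent trait value (e.g. depression severity)
    a          -- item discrimination (hypothetical value supplied by caller)
    thresholds -- strictly increasing item thresholds b_1 < ... < b_K
    """
    # Cumulative probabilities P(X >= k) for k = 1..K (logistic form).
    cum = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= K + 1) = 0, then take differences.
    bounds = [1.0] + cum + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


def expected_score(theta, a, thresholds):
    """Expected item score at a given theta under the model."""
    probs = grm_category_probs(theta, a, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical DIF by study: the same item keeps its discrimination but
# has study-specific thresholds, so respondents with the same theta are
# expected to answer differently depending on the study.
ITEM_A = 1.5                          # assumed discrimination
THRESHOLDS_STUDY_1 = [-0.5, 1.0]      # assumed thresholds, study 1
THRESHOLDS_STUDY_2 = [0.0, 1.5]       # assumed thresholds, study 2

if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.0):
        s1 = expected_score(theta, ITEM_A, THRESHOLDS_STUDY_1)
        s2 = expected_score(theta, ITEM_A, THRESHOLDS_STUDY_2)
        print(f"theta={theta:+.1f}  study 1: {s1:.3f}  study 2: {s2:.3f}")
```

Under these assumed parameters, the same observed response implies a different latent level in each study, which a proportion, sum, or z-score cannot represent.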