Slinde and Linn (1978) investigated the use of the Rasch model (Rasch, 1960, pp. 62-125) for the problem of vertical equating. In that study, difficult and easy subtests were identified for a sample of 1,365 students. The easy subtest was used to divide the total sample into high, middle, and low ability groups. Separate Rasch item calibrations of the combined easy and difficult items were conducted for the high and low groups. The two sets of item parameter estimates, one from the high and one from the low group, were then each used to equate the difficult and easy tests for all three groups of examinees.

The mean log ability estimates were quite similar for the easy and hard tests when the parameter estimates were used with the group from which they were obtained. That is, estimates obtained from the high-ability group yielded nearly equal mean log ability estimates for that group on the difficult and easy tests; similar results were found for the low-ability group. When estimates obtained from one group were applied to a different group, however, substantial differences between log ability estimates on the difficult and easy tests were found.

Slinde and Linn concluded from these results that the Rasch model did not provide a satisfactory means of vertically equating the easy and difficult tests. Vertical equating would, of course, be realized if the data fit the model. Thus, failure to achieve satisfactory vertical equating may be attributed to a lack of fit between the model and the data. As would be expected from the overall results, there was a lack of model-data fit. The statement that the overall chi-square values were nonsignificant (Slinde & Linn, 1978, pp. 27 and 28) was incorrect. The statement should have read, "For each of the three estimation groups, the overall chi-square, which was based on the entire data set, was significant at the .01 level."
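
The claim that vertical equating would be realized if the data fit the model can be made explicit by writing the model out. The notation below is introduced here for illustration only and is not taken from Slinde and Linn (1978): $\theta_j$ is the log ability of examinee $j$, $b_i$ is the difficulty of item $i$, $T$ is a set of items (the easy set $E$ or the difficult set $D$), and $r_{jT}$ is the raw score of examinee $j$ on $T$. Maximum likelihood estimation of log ability with the item difficulties treated as known is assumed as one common choice; it is not necessarily the procedure used in the original study.

```latex
% Rasch model: probability that examinee j answers item i correctly
P(x_{ij} = 1 \mid \theta_j, b_i)
  = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}

% Log ability estimate from item set T (T = E or T = D):
% \hat{\theta}_{jT} is the value at which the expected raw score on T
% equals the observed raw score r_{jT}
r_{jT} = \sum_{i \in T}
  \frac{\exp(\hat{\theta}_{jT} - b_i)}{1 + \exp(\hat{\theta}_{jT} - b_i)}
```

If the model holds and the difficulties of both item sets are expressed on a common scale, $\hat{\theta}_{jE}$ and $\hat{\theta}_{jD}$ are estimates of the same $\theta_j$. Their group means should then differ only by estimation error (including the bias of estimates based on short tests), whichever group supplied the item calibration. Large differences between the two means are therefore a sign of misfit.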
Significant chi-square values are consistent with our indications in other places that there was a lack of model fit. With existing standardized tests, some lack of fit is to be expected. This is precisely why investigations of the robustness of the model with actual data are needed.

It was acknowledged that the test of the Rasch model in the Slinde and Linn study was a severe one. This was because the three groups used were more widely separated in ability level than typical groups of interest would be, such as three adjacent grade levels. Also, the difference in the level of difficulty between the easy and difficult tests was more extreme than is likely to be found in actual practice with tests to be vertically equated.

Lord (personal communication) has suggested another limitation of the Slinde and Linn results. He noted that the claim that the Rasch difficulty estimates are independent of the group used requires that the assignment of examinees to groups be independent of the errors of measurement of the items being calibrated. Since group assignment was based on the easy test items in the Slinde and Linn study, there obviously is a dependence