Sines (6) used the Pearson correlation coefficient to measure reliability of the individual five-point sub-scales in the L-M Fergus Falls Behavior Rating Scale (3, 4) (hereafter referred to as the L-M Scale). His conclusion was that, although the total or average score on the L-M Scale had satisfactory reliability, as reported by Lucero and Meyer (4), the reliability of the individual five-point sub-scales was too low for them to be used individually. The purpose of this note is to demonstrate that a five-point scale will result in such a coarse grouping of the information, and/or such a limitation in the range of responses of the raters, that the Pearson correlation coefficient will give an erroneous assessment of the reliability of the sub-scales.

Table 1 summarizes data of the kind typically gathered to assess the reliability of the individual five-point sub-scales of the L-M Scale. Two raters independently rated 94 patients in a convalescent ward, and two other raters rated 109 patients in a regressed ward. Note first that, whatever the final decision as to the appropriate measure of reliability for the several individual scales, for each scale the reliability is such that the averages for the two wards are significantly different. In fact, the ward means differ by more than one point for every sub-scale except E and G, and the per cent agreement is high enough that the individual scales might be expected to be reliable enough to be of some use in classifying individual patients when the discrimination required is at the level of convalescent versus regressed.

In Table 1, the reliability of the individual scales is described in two ways: by the per cent agreement between the two raters, and by the Pearson correlation coefficient between the two raters for the whole ward.
The per cent agreement is the measure that seems most directly related to the basic concept of reliability for a ranked, classificatory variable such as the rating-scale variable. It is the measure most consistent with the Technical Recommendations and is the only measure of reliability proposed by Goodman and Kruskal (2) in their general discussion of measures of correlation for classificatory variables. Since the per cent agreement is not commonly used to describe reliability, and also necessitates a definition of 'agreement' in terms of the amount of discrepancy on the scale if a single percentage is to be used, it is tempting to use the Pearson correlation coefficient instead. However, it can be seen in Table 1 that there is no correspondence between the correlation coefficients and the per cent agreement. The most striking disagreement between the two is on the K scale: the per cent perfect agreement is 82 in the convalescent ward and 38 in the regressed ward, while the corresponding correlations are -.090 and .722 respectively.

The discrepancies between the two measures of reliability should not be surprising. The meaningfulness of the correlation coefficient is lost if the measurements are coarsely grouped.
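The K-scale paradox can be reproduced with a small numerical sketch. The data below are hypothetical (not the paper's ratings): nearly every patient sits at the same scale point, so the raters agree on 80 per cent of cases, yet the restricted range drives the Pearson coefficient slightly negative, just as with the convalescent-ward K scale above.

```python
# Sketch with invented ratings: high per cent agreement can coexist
# with a near-zero or negative Pearson r when a coarse scale
# restricts the range of responses.
from math import sqrt

def percent_agreement(a, b):
    """Per cent of patients given identical ratings by both raters."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def pearson_r(a, b):
    """Pearson product-moment correlation between two rating lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sqrt(sum((x - ma) ** 2 for x in a))
    sb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# Eight patients rated 1 by both raters; the two disagreements
# fall on opposite sides, so the tiny covariance is negative.
rater1 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2]
rater2 = [1, 1, 1, 1, 1, 1, 1, 1, 2, 1]

print(percent_agreement(rater1, rater2))      # 80.0
print(round(pearson_r(rater1, rater2), 3))    # -0.111
```

With almost no variance in either rater's scores, the denominator of r is dominated by the two discordant cases, so the coefficient reflects only where the rare disagreements happen to fall, not the overall agreement.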