Truman L. Kelley scite author profile

It is not intended here to discuss the general problem of item validation, but only that aspect of it that arises when an upper and lower group are selected to serve as standard groups in the differentiation of test items. It is argued that the more indubitably it is known that the upper group is superior to the lower group, the more definitely can it be concluded that an item is valid by finding that the upper group is more successful in passing it than the lower group. If, in two situations, one in which the upper and lower groups are differentiated with high certainty and the other with little certainty, the proportion of passes (i.e., right answers) in the upper groups are equal and equally superior to the proportion of passes in the lower groups, we should believe that the item represented in the first situation is more valid than that of the second situation.

Having available an initial group which is normally distributed with reference to a desired criterion, we set the problem of selecting upper and lower portions of this group which will be most efficient in the study of items, and their selection or rejection. The items in question are capable of two grades only, right or wrong. We further limit the issue by not here considering the interrelationship of items, a matter of first importance when the final test to be constructed is to contain more than one item. It is granted that the problem as set is too constricted to be "real," but it is, nevertheless, believed that its solution is commonly pertinent to the handling of real item selection problems.

The writer has stated¹ that twenty-seven per cent should be selected at each extreme to yield upper and lower groups which are most indubitably different with respect to the trait in question. This article does not alter that conclusion but does provide a more available and somewhat improved derivation.

Let us be given graduated scores on a test or trait from a sample of size N. For simplicity we shall consider N to be even, so that we may

¹ Reported by Milton B. Jensen: The Objective Differentiation of Three Groups
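As an illustration of the procedure the abstract describes, the following minimal sketch ranks examinees by criterion score, takes the top and bottom twenty-seven per cent, and compares the proportion of passes on a right/wrong item in each group. The function names, the tie-handling (stable sort order), the rounding of the group size, and the use of a simple difference in pass proportions as the summary are all assumptions made for illustration; they are not taken from Kelley's derivation.

```python
def split_extreme_groups(scores, fraction=0.27):
    """Return (lower, upper): index lists for the lowest and highest
    `fraction` of examinees when ranked by criterion score.

    Hypothetical helper; ties are broken by original order.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5] so the groups do not overlap")
    n = len(scores)
    size = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    return order[:size], order[-size:]


def pass_proportion(item_responses, group):
    """Proportion of examinees in `group` who passed the item (1 = right, 0 = wrong)."""
    return sum(item_responses[i] for i in group) / len(group)


def upper_minus_lower(scores, item_responses, fraction=0.27):
    """Difference between the upper-group and lower-group pass proportions."""
    lower, upper = split_extreme_groups(scores, fraction)
    return pass_proportion(item_responses, upper) - pass_proportion(item_responses, lower)


if __name__ == "__main__":
    # Made-up data: 100 examinees; the item is passed only by the top half.
    scores = list(range(100))
    item = [1 if s >= 50 else 0 for s in scores]
    print(upper_minus_lower(scores, item))
```

A positive difference means the upper group passes the item more often than the lower group, which is the direction the abstract treats as evidence of validity. The `fraction` parameter is left adjustable so that other cut points can be compared against the twenty-seven per cent figure.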

An Unbiased Correlation Ratio Measure

Kelley

1935

Proc. Natl. Acad. Sci. U.S.A.

128

The properties of the correlation ratio have been very thoroughly studied and reported upon. It has long been a necessary instrument in the study of the nature of regression. The work of Fisher¹ in 1922 made it a very precise instrument in studying the goodness of fit of second and higher degree regression lines.

It, however, lacks a certain desirable simplicity of meaning in that its value, η, obtained from a sample, differs from the population value, η̃, not only in a random manner due to the fluctuation of the particular sample, but also in a systematic manner which is a function of the number of arrays in which the data are recorded. This systematic difference between η̃ and η is, of course, well known to the expert statistician, and allowed for in his interpretations, as, for example, is automatically the case in the use of the following formula by Fisher [1], where N is the number of cases in the sample, k the number of arrays in which the dependent variable is classed, η the ordinary correlation ratio, R the ratio of the standard deviation of the differences between the points upon the regression line used and the means of the arrays of the sample to the standard deviation of the dependent variable (for a linear regression line, R is simply r, the ordinary product-moment correlation coefficient), and χ² is the ordinary χ² distributed nearly in the Pearson type III manner and with a number of degrees of freedom equal to [k − f(R)], in which f(R) is the number of linear restrictions placed upon the frequencies in determining the regression line employed, it equaling 2 in the case of straight line regression, 3 for second degree parabolic regression, etc.

Entering a table giving probabilities for values of χ² with the value given by [1] and a number of degrees of freedom equal to [k − f(R)] yields a value P which is the probability that if the true regression, or regression in the population, is of the form assumed, a divergence from it as great as that observed would arise as a matter of chance. Thus P, derived for χ², is an immediately interpretable statistic. We may note the simplicity of several of the other concepts.

N = the number of cases in the sample
k = the number of arrays
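The table lookup described in this abstract, from a χ² value and [k − f(R)] degrees of freedom to a probability P, can be sketched in code. The example below is an illustration only: the function names are hypothetical, the χ² value is taken as an input because formula [1] is not reproduced in this excerpt, and the tail probability is computed from the standard closed-form series for integer degrees of freedom rather than read from a printed table.

```python
import math


def degrees_of_freedom(k, restrictions):
    """Degrees of freedom k - f(R): number of arrays minus the number of
    linear restrictions (2 for a straight line, 3 for a second-degree parabola)."""
    dof = k - restrictions
    if dof < 1:
        raise ValueError("need more arrays than restrictions")
    return dof


def chi_square_tail(chi2, dof):
    """P = probability that a chi-square variable with integer `dof`
    degrees of freedom is at least `chi2`."""
    if dof < 1 or chi2 < 0:
        raise ValueError("dof must be >= 1 and chi2 must be >= 0")
    half = chi2 / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, dof // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) plus a finite correction series.
    term = math.sqrt(2.0 * chi2 / math.pi) * math.exp(-half)
    total = 0.0
    for j in range((dof - 1) // 2):
        total += term
        term *= chi2 / (2 * j + 3)
    return math.erfc(math.sqrt(half)) + total


if __name__ == "__main__":
    # Made-up numbers: 6 arrays, straight-line regression, chi-square of 9.49.
    dof = degrees_of_freedom(k=6, restrictions=2)
    print(dof, chi_square_tail(9.49, dof))
```

A small P indicates that a divergence from the assumed form of regression as large as the one observed would rarely arise by chance.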

Interpretation of Educational Measurements

Freeman¹,

Kelley²

1928

The American Journal of Psychology

146

The reliability coefficient

Kelley

1942

Psychometrika

The Reliability of Test Scores

Kelley

1921

The Journal of Educational Research

Truman L. Kelley

The selection of upper and lower groups for the validation of test items.