International Biometric Society is collaborating with JSTOR to digitize, preserve and extend access to Biometrics.
Summary
The aim of this paper is to relate and extend some recent work on chi-square goodness-of-fit tests. There is no discussion of any problems which are specifically associated with more than one categorical variable. The main topics are the effect of estimation on chi-square and its partitions and their relation to Neyman's smooth goodness-of-fit tests, and the effect of grouping a univariate distribution according to the disposition of the sample on the distribution of the chi-square statistic and on the smooth test statistic.
General Discussion
This paper was prepared as a contribution to a symposium on recent work on chi-square goodness-of-fit tests. The topics covered are therefore but a small selection of those which a complete review would consider, and their choice is a personal matter. Moreover, their interest is largely theoretical although not, it is hoped, of an unpractical nature. For the chi-square test is not only the oldest of the non-trivial significance tests, and one of the most widely used, but it is also basic in statistics: many important concepts arose from its study, and an understanding of it is a necessary background for the study of other statistical problems, pure and applied. In this section a general discussion will be given of the areas touched in the paper, which begins formally in Section 2.
The statistic $X^2 = \sum (\text{observed} - \text{expected})^2 / \text{expected}$, though often decried, continues to be used more than any of its competitors, and for good reasons. For the true multinomial situation, the likelihood ratio statistic $2 \sum (\text{observed}) \log(\text{observed} / \text{expected})$ has some theoretical advantages, but for large samples the two criteria become equivalent, and for small samples their behavior is similar on the evidence available. Lindley has suggested, in the discussion to Watson [1958], that the likelihood ratio statistic lends itself to an analysis of information (in the sense of Shannon) corresponding to the familiar partitioning of $X^2$, and that the relative powers of these two criteria should be investigated, possibly by electronic computation. The most striking fact in this whole area is the satisfactory, but imperfect, way both criteria are approximated by the chi-square distribution despite their irregular and discrete distributions for small samples. Much study has been devoted to the question of when the chi-square approximation is adequate. Alternatively, improvements to the chi-square approximation have been sought, and some references to this work are given at the end of Section 3.
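The two criteria compared above can be illustrated numerically. The following is only a minimal sketch: the counts are hypothetical (120 throws of a die tested against equal cell probabilities), and the function names are introduced here for illustration and do not come from the paper.

```python
import math


def pearson_chi_square(observed, expected):
    """Pearson's X^2: sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def likelihood_ratio(observed, expected):
    """Likelihood ratio statistic: 2 * sum over cells of observed * log(observed / expected).

    Cells with a zero observed count are skipped, since o * log(o) tends to 0 as o -> 0.
    """
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical data (an assumption for illustration): six cells, 120 observations.
observed = [18, 23, 16, 21, 18, 24]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 per cell

x2 = pearson_chi_square(observed, expected)
g2 = likelihood_ratio(observed, expected)
print(f"Pearson X^2 = {x2:.4f}")
print(f"Likelihood ratio statistic = {g2:.4f}")
```

With counts this close to their expectations the two statistics nearly coincide, which is in keeping with the remark that they become equivalent for large samples. Either value would then be referred to the chi-square distribution with five degrees of freedom.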
It is just possible that other statistics, with similar powers, may be better approximated, a problem tha...