The Test of English as a Foreign Language (TOEFL) was administered to a group of 71 native speakers of English enrolled as freshmen at a western state university. As a group they scored low on ACT English: their mean fell at the 29th percentile relative to the ACT norms group of college-bound high school seniors. Nevertheless, their scores on all five parts of TOEFL, as well as on the total, were considerably above the mean of 34,774 foreign applicants to U.S. colleges. Moreover, their score distributions were extremely narrow and highly negatively skewed, indicating that the test was much too easy for them and confirming the hypothesis that while TOEFL discriminates adequately among foreign students, it is, consistent with its design, inadequate for differentiating among American native speakers of English. Further support for this conclusion was found in the low correlation (.64) between TOEFL and ACT English, a test expressly designed to differentiate among native English speakers. Finally, plots of item difficulties for the American and foreign groups revealed a clear item × group interaction, which was interpreted to signify that the items had different "meaning" for the two groups. The plots also showed that all but 17 of the 270 items in the five parts of TOEFL were easier for the American group than for the foreign group, and that 15 of these 17 items represented a type of item content that afforded a ready explanation of their unusual behavior. These data all point in the same direction and, taken together, support the hypothesis that TOEFL avoids the kinds of discriminations it is not intended to make. Consequently, they are taken as clear evidence of the construct validity of the test.
A two-factor analysis of variance with multiple measurements on one factor was conducted on the 40 items of the Vocabulary test of TOEFL for six language groups. All sources of variance were found to be significant beyond the one percent level. Of particular interest was the item × group interaction, which was examined by plotting the item difficulties for each language group against those for a spaced sample of all candidates taking this form of the test at its first formal administration. A measure of the deviation of each item from the central tendency of the plot was developed, expressing the degree to which the item was especially difficult or especially easy for a particular language group relative to the other items. A distribution of these measures is given for each of the six language groups.
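The abstract does not define the deviation measure, so the following is only an illustrative sketch of one way such a measure could be computed. It assumes a delta-plot style approach: proportions correct are converted to a normal-deviate difficulty scale (13 + 4z), the principal axis of the scatter is taken as the "central tendency of the plot", and each item's signed perpendicular distance from that axis is its deviation. The function names, the scale constants, and the sample proportions are all assumptions made for illustration and are not taken from the study.

```python
from math import sqrt
from statistics import NormalDist, mean


def delta(p):
    """Convert a proportion correct (0 < p < 1) to a difficulty index.

    Assumed scale: 13 + 4z, where z is the normal deviate cutting off the
    upper proportion p. Harder items (smaller p) get larger values.
    """
    return 13.0 + 4.0 * NormalDist().inv_cdf(1.0 - p)


def principal_axis(xs, ys):
    """Return (slope, intercept) of the major axis of the point scatter."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("difficulties are uncorrelated; axis is undefined")
    slope = (syy - sxx + sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def item_deviations(p_reference, p_group):
    """Signed perpendicular distance of each item from the principal axis.

    Positive values mark items that are relatively harder for the language
    group than the overall trend predicts; negative values mark items that
    are relatively easier.
    """
    xs = [delta(p) for p in p_reference]
    ys = [delta(p) for p in p_group]
    slope, intercept = principal_axis(xs, ys)
    norm = sqrt(slope ** 2 + 1)
    return [(y - (slope * x + intercept)) / norm for x, y in zip(xs, ys)]


# Hypothetical proportions correct for five items (not data from the study).
p_reference = [0.90, 0.80, 0.70, 0.60, 0.50]
p_group = [0.85, 0.75, 0.30, 0.55, 0.45]
deviations = item_deviations(p_reference, p_group)
```

With these made-up proportions, the third item stands out with the largest positive deviation, which is the kind of item that would be flagged as especially difficult for that language group relative to the other items.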