The SAT has been shown to be both culturally and statistically biased against African Americans, Hispanic Americans, and Asian Americans. In this article, Roy Freedle argues for a corrective scoring method, the Revised-SAT (R-SAT), to address the nonrandom ethnic test bias patterns found in the SAT. The R-SAT, which scores only the "hard" items on the test, is shown to reduce the mean-score difference between African American and White SAT test-takers by one-third. Further, the R-SAT shows an increase in SAT verbal scores by as much as 200 to 300 points for individual minority test-takers. Freedle also argues that low-income White examinees benefit from the revised score as well. He develops several cognitive and cultural hypotheses to explain the ethnic regularities in responses to various test items. Freedle concludes by offering some predictions as to how ethnic populations are likely to be affected by the new designs currently being proposed for the SAT, and describes the implications of the R-SAT for increasing minority admission to select colleges.
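The R-SAT's core move, scoring only the hard items, can be sketched as follows. This is a minimal illustration of the idea, not Freedle's actual scoring procedure; the difficulty values, cutoff, and response matrix are invented for the example.

```python
# Hypothetical sketch of the R-SAT idea: rescore examinees using only
# the "hard" items. Difficulties, cutoff, and responses are invented.
import numpy as np

def revised_score(responses, item_difficulty, hard_cutoff=13.0):
    """Score only items whose difficulty exceeds the cutoff.

    responses: (n_examinees, n_items) array of 0/1 correct indicators.
    item_difficulty: (n_items,) difficulty values (higher = harder).
    Returns proportion correct on the hard-item subset per examinee.
    """
    hard = item_difficulty > hard_cutoff      # boolean mask of hard items
    return responses[:, hard].mean(axis=1)    # proportion correct on hard items

# Toy example: 3 examinees, 5 items with assumed difficulties.
responses = np.array([[1, 0, 1, 1, 0],
                      [1, 1, 0, 1, 1],
                      [0, 1, 1, 0, 1]])
difficulty = np.array([9.0, 11.0, 14.0, 15.5, 16.0])  # last three are "hard"
print(revised_score(responses, difficulty))
```

Under this sketch, each examinee's revised score depends only on the three items above the cutoff; the two easy items are ignored entirely.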
The purpose of this study is to predict the difficulty of a large sample (n = 213) of TOEFL reading comprehension items. A related purpose was to examine whether text and text-by-item interaction variables play a significant role in predicting item difficulty. It was argued that evidence favouring the construct validity of multiple-choice reading test formats requires significant contributions from these particular predictor variables. Details of item predictability and construct validity were explored by evaluating two hypotheses: 1) that multiple-choice reading comprehension tests are sensitive to 12 categories of sentential and/or discourse variables found to influence comprehension processes in the experimental literature; and 2) that many of these categories of variables identified in the first hypothesis contribute significant independent variance in predicting item difficulty. For the first hypothesis, correlational analyses confirmed the importance of 11 out of the 12 categories, while stepwise regression analyses, accounting for up to 58% of the variance, provided some support for the second hypothesis. The pattern of predictors showed that text and text-by-item variables accounted for most of the variance, thereby providing evidence favouring the construct validity of the TOEFL reading items.
The primary goal of this project was to examine the predictability of SAT reading item difficulty (equated delta) for main idea items, and collectively, the predictability of three major reading item types: main idea, inference and explicit statement items. A secondary purpose in predicting item difficulty was to contrast the responses of high verbal and low verbal ability examinees. Primary attention was paid to studying 110 main idea reading items and their associated passages. However, additional results are reported for 285 reading items taken from 34 disclosed SAT forms which represented a wider range of reading item types. The percent variance of main idea item difficulty accounted for varied from 46% to 59% depending upon the particular analysis. The predictability of all three reading item types (n = 285) varied from 21% to 29%, depending upon the particular analysis. Details of item predictability were explored by evaluating several hypotheses. Results indicated that (1) multiple‐choice reading items are sensitive to variables similar to those reported in the experimental literature on comprehension, (2) many of these variables provide significant independent predictive information in regression analyses, (3) the placement (early versus middle of text) of relevant main idea information affects item difficulty, and (4) considerable agreement between SAT and GRE reading predictability was found. Additional results contrast the performance of high and low ability groups.
This study examines the predictability of GRE reading item difficulty (equated delta) for three major reading item types: main idea, inference, and explicit statement items. Each item type is analyzed separately, using 110 GRE reading passages and their associated 244 reading items; selective analyses of 285 SAT reading items are also presented. Stepwise regression analyses indicate that the percentage of GRE delta variance accounted for varied from 20% to 52%, depending upon the item type. Details of item predictability were explored by evaluating several hypotheses. Results indicated that (1) multiple-choice reading items are sensitive to variables similar to those reported in the experimental literature on comprehension, (2) many of these variables provide independent predictive information in regression analyses, and (3) substantial agreement between GRE and SAT reading predictability was found.
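The equated delta mentioned above is ETS's item-difficulty index after equating across test forms. A common way to state the underlying delta scale, assumed here for illustration, maps the proportion of examinees answering incorrectly onto a normal-deviate scale centered at 13 with a spread of 4, so harder items receive higher deltas:

```python
# Minimal sketch of the conventional ETS delta index (assumed mapping:
# delta = 13 + 4 * z, where z is the normal deviate corresponding to
# the proportion answering the item incorrectly).
from statistics import NormalDist

def delta_index(p_correct):
    """Convert proportion correct into the delta difficulty scale."""
    z = NormalDist().inv_cdf(1.0 - p_correct)  # deviate for proportion incorrect
    return 13.0 + 4.0 * z

print(round(delta_index(0.5), 2))  # a mid-difficulty item sits at the scale mean
```

On this scale, an item answered correctly by half the examinees sits at 13, and deltas rise as items get harder, which is why the regression analyses treat delta as the dependent variable.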
The purpose of the current study was to predict the difficulty of a large sample (n = 337) of Test of English as a Foreign Language (TOEFL®) listening comprehension items that dealt with the minitalk (listening) passages. Four item types were examined: Main Idea items (consisting of two subtypes: explicit and implicit gist), Supporting Idea items (also called explicit detail items), and two types of Inference items (one subtype called pure inference items and a second subtype called inference-application items; both of these subtypes are also called implicit detail items). A related purpose was to examine whether particular types of predictors (i.e., text and text-associated variables) play a significant role in predicting item difficulty. We maintain that evidence favoring construct validity in part requires significant contributions from these text and text-associated predictor variables. This paper also explores the hypothesis that multiple-choice listening comprehension tests are sensitive to many sentential and discourse variables found to influence comprehension processes in the experimental language comprehension literature. Earlier work with reading items (Freedle & Kostin, 1993) and the current study of listening (minitalk) items show that the majority of sentential and discourse variables identified in our review of the experimental language literature were significantly related to item difficulty within TOEFL's multiple-choice format. Furthermore, contrary to predictions that we attributed to several critics of multiple-choice tests, the pattern of correlational results showed that there was a significant relationship between item difficulty and the text and text-related variables. We interpreted this pattern of results as modest evidence supporting our claim that multiple-choice TOEFL listening and reading items yield measures that are consistent with one definition of a construct valid test of comprehension.
That is, since critics have pointed out that at least reading items can often be correctly responded to without the need to read the text passage, item variables (and not text or text-related variables) should be the most prominent predictors (correlationally as well as in a regression sense) of item difficulty. Since the contrary relationship was in fact found in several analyses of the minitalk items, it was concluded that this outcome provides some modest evidence favoring the construct validity of the minitalk passages and their associated items. Various stepwise and hierarchical regression analyses showed that many of these text and text-related variables provide independent contributions in predicting listening item difficulty. More specifically, apart from the correlational results, the following stepwise linear regression results were obtained. For the full sample of 337 listening items (containing all four item types) with equated delta (an index of item difficulty) as the dependent variable, we found 35% (p < .0001) of the variance of listening item difficulty could be accounted...
The purpose of the current study is to predict the difficulty (equated delta) of a large sample (n = 213) of TOEFL reading comprehension items. (Only main idea, inference, and supporting statement items were sampled.) A related purpose was to examine whether text and text-related variables play a significant role in predicting item difficulty; we argued that evidence favoring construct validity would require significant contributions from these particular predictor variables. In addition, details of item predictability were explored by evaluating two hypotheses: (1) that multiple-choice reading comprehension tests are sensitive to many sentential and discourse variables found to influence comprehension processes in the experimental literature, and (2) that many of the variables identified in the first hypothesis contribute significant independent variance in predicting item difficulty. The great majority of sentential and discourse variables identified in our review of the experimental literature were found to be significantly related to item difficulty within TOEFL's multiple-choice format. Furthermore, contrary to predictions which we attributed to critics of multiple-choice tests, the pattern of correlational results showed that there is a significant relationship between item difficulty and the text and text-related variables. We took this as evidence supporting our claim that multiple-choice reading items yield construct valid measures of comprehension. That is, since critics have pointed out that reading items can often be correctly answered without reading the text passage, this seems to imply that item variables (not text nor text-related variables) should be prominent predictors of reading item difficulty. Since the contrary relationship was found, we concluded that this provides evidence favoring construct validity.
We found, further, in several stepwise linear regression analyses, that many of these text and text-related variables provide independent contributions in predicting reading item difficulty. This was interpreted as providing additional support for construct validity. More specifically, apart from the correlational results, the following stepwise linear regression results were obtained. For the full sample of 213 items, and where equated delta (an index of item difficulty) is the dependent variable, we found 33 percent (p < .0001) of the variance of item difficulty could be accounted for by eight variables. All eight variables reflected significant and independent contributions due solely to text and text/item overlap variables. This result provided evidence favoring construct validity of the TOEFL reading comprehension items. We also conducted a separate analysis of a subset (n = 98) of the full set of 213 items to examine the possible statistical effect of nesting in the original sample. (Nesting occurs when several items relating to the same passage are analysed together; a non-nested subset is formed when only one item per passage is used.) Eleven variables accounted for 58 per...
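The forward stepwise selection used throughout these studies, adding at each step the predictor that most increases the variance accounted for, can be sketched as follows. The synthetic data and variable counts below are illustrative stand-ins for the text, text-by-item, and item variables; none of the names or numbers come from the studies themselves.

```python
# Hedged sketch of forward stepwise regression: at each step, add the
# candidate predictor that most increases R^2 for item difficulty.
# Synthetic data; columns 0 and 3 are the true (planted) predictors.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_items, n_predictors = 213, 6
X = rng.normal(size=(n_items, n_predictors))          # candidate predictors
delta = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=n_items)

def forward_stepwise(X, y, max_vars=3):
    """Greedy forward selection: grow the model one predictor at a time."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(max_vars):
        r2 = {j: LinearRegression().fit(X[:, selected + [j]], y)
                                   .score(X[:, selected + [j]], y)
              for j in remaining}
        best = max(r2, key=r2.get)                    # biggest R^2 gain
        selected.append(best)
        remaining.remove(best)
    model = LinearRegression().fit(X[:, selected], y)
    return selected, model.score(X[:, selected], y)

chosen, r_squared = forward_stepwise(X, delta)
print(chosen, round(r_squared, 2))
```

Because the planted signal runs through columns 0 and 3, the greedy search recovers them early, and the final R² plays the same role as the "percent variance accounted for" figures reported in the abstracts above.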