This paper summarizes three studies that illustrate how application of the three‐parameter logistic test model helped solve three relatively intractable testing problems. The three problems are designing a multi‐purpose test, evaluating a multi‐level test, and equating a test on the basis of pretest statistics. Examples use information about tests in various College Board testing programs. The first two studies demonstrate the value of the information function and relative efficiency. The last study is an application of basic item characteristic curve theory and shows the potential usefulness of pre‐equating.
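The abstract above turns on two pieces of item response theory machinery: the three‐parameter logistic (3PL) item characteristic curve and the item information function (relative efficiency is the ratio of two tests' information functions at each ability level). As a rough illustration only, and not the paper's own code, a minimal sketch of the standard 3PL curve and Birnbaum's item information, with hypothetical item parameters `a` (discrimination), `b` (difficulty), and `c` (guessing):

```python
import math

D = 1.7  # conventional scaling constant for the logistic approximation to the normal ogive

def icc_3pl(theta, a, b, c):
    """Three-parameter logistic item characteristic curve:
    probability of a correct response at ability theta."""
    return c + (1 - c) / (1 + math.exp(-D * a * (theta - b)))

def item_information(theta, a, b, c):
    """Birnbaum's item information function for the 3PL model.
    Test information is the sum of item informations; relative
    efficiency compares two tests' information at the same theta."""
    p = icc_3pl(theta, a, b, c)
    q = 1 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1 - c)) ** 2
```

At `theta = b` the curve passes through `(1 + c) / 2`, midway between the guessing floor and 1, which is a quick sanity check on any implementation.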
The purpose of this study was to compare five methods of computing school effectiveness indices (SEIs) from longitudinal data. The five methods were within-school regression, within-school regression corrected for the unreliability of measurement, mean difference scores, average individual residual scores (based on the regression of student output scores on student input scores), and school residual scores (based on the regression of school mean output scores on school mean input scores). Further, all of the school effectiveness indices were highly stable across samples, except for the indices for initially high-scoring students. Finally, predictions from nonlongitudinal data furnished reasonable estimates of school effectiveness as measured by one of the indices. The methods should be tried out at other grade levels, and the stabilities of the various indices across years should be studied.

A COMPARISON OF SELECTED SCHOOL EFFECTIVENESS MEASURES BASED ON LONGITUDINAL DATA
Gary L. Marco, Educational Testing Service

With the recent emphasis in education upon program budgeting and cost effectiveness has come a renewed interest in school system evaluation. However, how school effectiveness should be estimated is unclear. The purpose of this study is to compare selected methods of estimating school effectiveness from longitudinal data. Various techniques have been suggested to generate school effectiveness indices. Indices commonly used are the average performance of students in a particular grade in the school and the difference between the performance of students in the school and the performance of a national norm group. Although these two methods have been widely used, they have a fatal flaw: neither takes into account differences in initial status. In some studies partial control over differing input levels has been achieved by holding socioeconomic status (SES) constant.
Schools serving students from low SES families have been compared with one another, as have schools serving students from more advantaged families. The school effectiveness index in such a case is the deviation of performance from the average of the schools serving like children. This index is often employed with data collected at one point in time for a given grade level, such as statewide testing program data. Ability scores have sometimes been partialed out of achievement scores in an attempt to control for initial differences. In this case, the difference between the actual performance and the predicted performance has been used as a measure of school effectiveness. Unfortunately, […] mean three years later) or at the student level. Unless the student group enrolled in a lower grade has remained intact over the interim period, the school data will be based on a group that is somewhat different from the group of students that was present at both data collection points. To distinguish these "unmatched" groups from groups that are composed of the same students, the form...
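One of the five SEI methods named in the abstract, school residual scores, can be sketched directly from its description: regress school mean output scores on school mean input scores, and take each school's residual as its effectiveness index. The following is a minimal illustration under that description, not the study's own procedure, and the variable names are hypothetical:

```python
import numpy as np

def school_residual_seis(mean_input, mean_output):
    """School residual scores as an SEI: fit an ordinary least-squares
    line predicting school mean output from school mean input, then
    return each school's residual (actual minus predicted output)."""
    x = np.asarray(mean_input, dtype=float)
    y = np.asarray(mean_output, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 least-squares fit
    return y - (slope * x + intercept)
```

A school with a positive residual outperforms what its students' mean input level would predict; by construction the residuals sum to zero across the schools in the regression.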
Recently Marco and Abdel‐fattah (1991) reported newly established relationships between scores on the enhanced American College Testing Program (ACT) Assessment and scores on the College Board's Scholastic Aptitude Test (SAT). The current report provides a detailed description of the methodology that was used to develop the “concordance” tables reported in that study. Fourteen large universities provided data on applicants who had taken both the enhanced ACT Assessment and the SAT. The ACT Composite scores and the SAT‐verbal and SAT‐mathematical (SAT‐V + M) scores used in the comparability study came from ACT Assessment test editions administered from October 1989 to June 1990 and SAT editions administered from March 1989 to June 1990. The total sample consisted of 40,492 students. A subsample of 40,051 students who took the enhanced ACT Assessment and the SAT no more than 217 days apart was also used in the comparability study. The equipercentile procedure, a curvilinear procedure, was used to scale the scores. Steps were taken to ensure the accuracy of the conversions by (1) weighting the scores to reduce or eliminate the effect of the time differential between ACT and SAT testings or (2) smoothing the score distributions before scaling. The score conversions derived by applying the weighting procedures provided score adjustments that were in the direction implied by the length of time between testings.
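The equipercentile procedure pairs an ACT score with the SAT score that cuts off the same proportion of its distribution. As a rough sketch of unsmoothed, unweighted equipercentile linking (the study additionally weighted scores and smoothed the distributions, which this simplified version omits):

```python
import numpy as np

def equipercentile_concordance(act_scores, sat_scores, act_points):
    """For each ACT score point, return the SAT score with the same
    percentile rank: a naive, unsmoothed equipercentile linking."""
    act = np.sort(np.asarray(act_scores, dtype=float))
    sat = np.sort(np.asarray(sat_scores, dtype=float))
    table = {}
    for x in act_points:
        # proportion of ACT scores at or below x
        p = np.searchsorted(act, x, side="right") / len(act)
        # SAT score at that same proportion of its distribution
        table[x] = float(np.quantile(sat, min(p, 1.0)))
    return table
```

Because percentile ranks and quantiles are both nondecreasing, the resulting concordance table is monotone, which matches the intent of a score-to-score conversion.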