Do teachers actually follow the grading practices recommended by researchers and textbook authors? Where are the discrepancies? What are possible explanations for such discrepancies? What is an agenda for further research on teachers' actual classroom grading practices?
Reliability is the property of a set of test scores that indicates the amount of measurement error associated with the scores. Teachers need to know about reliability so that they can use test scores to make appropriate decisions about their students. The level of consistency of a set of scores can be estimated by using the methods of internal analysis to compute a reliability coefficient. This coefficient, which can range between 0.0 and +1.0, usually has values around 0.50 for teacher-made tests and around 0.90 for commercially prepared standardized tests. Its magnitude can be affected by such factors as test length, test-item difficulty and discrimination, time limits, and certain characteristics of the group: the extent of their testwiseness, their level of motivation, and their homogeneity in the ability measured by the test.
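As an illustration of an internal-analysis estimate, the sketch below computes Cronbach's alpha, one common internal-consistency coefficient, and uses the Spearman-Brown formula to show how test length can change reliability. The abstract does not say which coefficient it has in mind, so the choice of alpha is an assumption, and the score matrix is hypothetical data invented for the example.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Internal-consistency estimate from a students-by-items score matrix.

    scores: list of rows, one row of item scores per student.
    """
    k = len(scores[0])  # number of items
    # Variance of each item across students.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test is lengthened by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical data: 5 students, 4 items scored right (1) or wrong (0).
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)
doubled = spearman_brown(0.50, 2)  # a 0.50 test doubled in length
```

Under the Spearman-Brown assumptions (the added items are parallel to the existing ones), doubling a test whose reliability is 0.50 gives a predicted reliability of about 0.67, which is one way to see why longer tests tend to be more reliable.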
The purpose of this study was to examine the effect of a read-aloud testing accommodation on students with and without a learning disability in reading. A sample of 260 Midwestern middle school students (24% with a learning disability in reading, and 76% without such a disability) were randomly assigned to two experimental conditions for testing with four tests of the Iowa Tests of Basic Skills. The test conditions were standard administration and reading the tests aloud to the students. Based on a two-way (2 × 2) analysis of variance, with test administration and student status as the two fixed factors, the students with learning disabilities in reading, as well as those without, exhibited statistically significant gains with the read-aloud test administration. Interaction effects were not significant. Implications of these results for the read-aloud accommodation are presented.
The performance of two polytomous item response theory models was compared to that of the dichotomous three-parameter logistic model in the context of equating tests composed of testlets. For the polytomous models, testlet scores were used to eliminate the effect of the dependence among within-testlet items. Traditional equating methods were used as criteria for both. The equating methods based on polytomous models were found to produce results that more closely agreed with the results of traditional methods.