Hierarchical linear modeling (HLM) is typically used in the social sciences to model data from clustered settings, such as students nested within classrooms. However, not all multilevel data are purely hierarchical in nature. For example, students can be nested within the neighborhoods in which they live and within the schools they attend, but students from a given neighborhood most likely do not all attend the same school, nor do students from a given school all reside within the same neighborhood. Because neighborhoods are not nested within schools, nor vice versa, the two factors are said to be cross-classified. Cross-classified random effects modeling (CCREM) is used to model data from these non-hierarchical contexts. While the use of CCREM has increased in various disciplines such as medicine, it is seldom used in educational research. CCREM is mentioned in most multilevel modeling textbooks (for example, Raudenbush & Bryk, 2002; Hox, 2002; Snijders & Bosker, 1999), yet it remains infrequently used, most likely because the models are technically sophisticated and can be somewhat difficult to interpret.

Little research has been conducted assessing when it is necessary to use CCREM, so this dissertation comprised several studies. A Monte Carlo simulation study was conducted to investigate factors affecting the need to use CCREM as well as the impact of ignoring cross-classification. As a follow-up study, CCREM was applied to a large-scale national data set to provide insight into the potential effects of ignoring the cross-classified data structure.

Results of both studies indicated that when HLM was used instead of CCREM, the fixed effect estimates were unaffected, but the standard error estimates associated with the incorrectly modeled variables were biased, as were the estimates of the variance components. The observed bias was related to the proportion of the total variance that lay between the levels of each cross-classified factor, the sample size, and the similarity of the cross-classified factors. Implications and limitations are discussed, and suggestions for future research are presented.
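As a concrete illustration of the kind of model this abstract describes, the sketch below fits crossed school and neighborhood random intercepts to simulated data, using the variance-components workaround in statsmodels (the whole sample is treated as a single group). This is a minimal sketch: all column names, group counts, and effect sizes are hypothetical, not values from the dissertation.

```python
# A minimal sketch of a cross-classified random effects model (CCREM):
# students are crossed by school and neighborhood, neither nested in the
# other. All column names and effect sizes are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "school": rng.integers(0, 20, n),        # 20 schools
    "neighborhood": rng.integers(0, 30, n),  # 30 neighborhoods
})

# Simulate a student outcome with crossed random intercepts.
school_u = rng.normal(0.0, 1.0, 20)
neigh_v = rng.normal(0.0, 0.5, 30)
df["score"] = (
    50.0
    + school_u[df["school"].to_numpy()]
    + neigh_v[df["neighborhood"].to_numpy()]
    + rng.normal(0.0, 2.0, n)
)

# statsmodels fits crossed random effects by treating the whole sample as
# one group and declaring each cross-classified factor as a variance
# component.
df["one"] = 1
vcf = {
    "school": "0 + C(school)",
    "neighborhood": "0 + C(neighborhood)",
}
ccrem = sm.MixedLM.from_formula(
    "score ~ 1", groups="one", vc_formula=vcf, re_formula="0", data=df
)
print(ccrem.fit().summary())  # variance components for both factors
```

Dropping one of the two `vc_formula` entries mimics the misspecified HLM analysis the abstract contrasts with CCREM: the ignored factor's variance is forced into the remaining components, which is the mechanism behind the biased standard errors and variance estimates reported above.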
A reliability generalization (RG) study was conducted for the Marlowe-Crowne Social Desirability Scale (MCSDS). The MCSDS is the most commonly used tool designed to assess social desirability bias (SDB). Several short forms, consisting of items from the original 33-item version, are in use by researchers investigating the potential for SDB in responses to other scales. These forms have been used to measure a wide array of populations. A mixed-effects model analysis yielded a predicted score reliability of .53 for male adolescents, and the reliability of men's responses was lower than that of women's. Suggestions are made concerning the necessity for further psychometric evaluations of the MCSDS.
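A reliability generalization study pools score reliability estimates across administrations of a scale. The hedged sketch below shows a basic random-effects pooling of hypothetical coefficient alphas using statsmodels; the alpha values and sample sizes are invented for illustration, and the moderator analysis the abstract describes (gender, age group) is only indicated in a comment.

```python
# A minimal sketch of random-effects pooling for a reliability
# generalization (RG) study. The alpha values and sample sizes are
# hypothetical, not drawn from the MCSDS literature.
import numpy as np
from statsmodels.stats.meta_analysis import combine_effects

alphas = np.array([0.73, 0.62, 0.81, 0.55, 0.70])  # reported coefficient alphas
ns = np.array([120, 85, 240, 60, 150])             # per-study sample sizes
k = 33  # items in the full-length MCSDS

# Bonett (2002) transformation: ln(1 - alpha) has a simple approximate
# sampling variance, making reliabilities amenable to meta-analysis.
effects = np.log(1.0 - alphas)
variances = 2.0 * k / ((k - 1.0) * (ns - 2.0))

res = combine_effects(effects, variances, method_re="dl")
pooled = res.summary_frame().loc["random effect", "eff"]
print(f"pooled reliability: {1.0 - np.exp(pooled):.2f}")
# Moderators such as gender or age group would enter via meta-regression
# (weighted least squares on the transformed effects).
```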
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change, test level, test content, and item format. As a follow-up to the real data analysis, a simulation study was performed to assess the effect of item position change on equating. Results from this study indicate that item position change significantly affects change in RID. In addition, although the test construction procedures used in the investigated state seem to somewhat mitigate the impact of item position change, equating results might be impacted in testing programs where other test construction practices or equating methods are utilized.
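For readers unfamiliar with the Rasch model referenced here, the sketch below shows how a shift in an item's difficulty estimate translates into a change in the probability of a correct response. The size of the difficulty shift is a made-up value, not a result from the study.

```python
# A minimal sketch of the Rasch model: P(correct) = 1 / (1 + exp(-(theta - b))).
# The 0.10-logit difficulty shift is illustrative only, not a value
# estimated in the study.
import math

def rasch_p(theta: float, b: float) -> float:
    """Probability that a person of ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

b_field_test = 0.00  # Rasch item difficulty (RID) at field-test position
b_live = 0.10        # RID after the item moves later in the live form

for theta in (-1.0, 0.0, 1.0):
    drop = rasch_p(theta, b_field_test) - rasch_p(theta, b_live)
    print(f"theta={theta:+.1f}: p-correct drops by {drop:.3f}")
```

Because equating relies on such items behaving identically across forms, even small position-driven shifts in RID can propagate into the equated scores, which is the concern the simulation study addresses.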
Assessments labeled as formative have been offered as a means to improve student achievement. But labels can be a powerful way to miscommunicate. For an assessment use to be appropriately labeled "formative," both empirical evidence and reasoned arguments must be offered to support the claim that improvements in student achievement can be linked to the use of assessment information. Our goal in this article is to support the construction of such an argument by offering a framework within which to consider evidence-based claims that assessment information can be used to improve student achievement. We describe this framework and then illustrate its use with an example of one-on-one tutoring. Finally, we explore the framework's implications for understanding when the use of assessment information is likely to improve student achievement and for advising test developers on how to develop assessments that are intended to offer information that can be used to improve student achievement.