Although low participation rates have historically been considered problematic in peer nomination research, some researchers have recently argued that small proportions of participants can, in fact, provide adequate sociometric data. The current study used a classical measurement perspective to investigate the internal reliability (Cronbach's α) of peer nomination measures of acceptance, popularity, friendship, prosocial behavior, and overt aggression. Data from 642 participants attending 10 schools were resampled at participation rates ranging from 5 percent to 100 percent of the original samples. Results indicated that (1) the association between participation rate and Cronbach's α was curvilinear across schools and variables; (2) collecting more data for a given variable (by using unlimited vs. limited nominations, or two items vs. one) was significantly related to higher internal reliability; and (3) certain variables (overt aggression, popularity) were more reliable than others (acceptance, friendship). Implications for future research are discussed.
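To make the resampling procedure concrete, here is a minimal sketch, assuming nomination data stored as a receivers-by-nominators 0/1 matrix in which each nominator's column is treated as an "item" for Cronbach's α. The data, function names, and participation rates below are illustrative assumptions, not the study's materials.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_cases, k_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

def alpha_at_rate(nominations, rate, n_reps=500, seed=None):
    """Average alpha over random subsamples of nominators at a given
    participation rate; each nominator's column acts as one 'item'."""
    rng = np.random.default_rng(seed)
    n_nominators = nominations.shape[1]
    n_keep = max(2, round(rate * n_nominators))
    alphas = [cronbach_alpha(nominations[:, rng.choice(n_nominators, n_keep, replace=False)])
              for _ in range(n_reps)]
    return float(np.mean(alphas))

# Synthetic illustration (not the study's data): 60 nominators rate 100 peers.
rng = np.random.default_rng(0)
status = rng.normal(size=100)  # receivers' latent standing
p = 1 / (1 + np.exp(-(status[:, None] + rng.normal(size=(100, 60)))))
noms = rng.binomial(1, p)
for rate in (0.05, 0.25, 0.50, 1.00):
    print(f"participation {rate:4.0%}: alpha ≈ {alpha_at_rate(noms, rate, seed=1):.3f}")
```

Run on data like this, α typically climbs steeply at low participation rates and then flattens, which is the curvilinear pattern the abstract describes.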
This simulation study examined a number of computerized adaptive testing (CAT) termination rules based on the item response theory framework. Results showed that longer CATs yielded more accurate trait estimation, but there were diminishing returns with a very large number of items. Standard error termination performed quite well in terms of both administering a small number of items and having high accuracy of trait estimation if the standard error level used was low enough, but it was sensitive to the item bank information structure. Change in estimated θ performed comparably to standard error termination, but was less sensitive to the bank information structure. Fixed-length CATs performed either slightly worse than or comparably to their variable-length termination counterparts; previous findings stating that variable-length CATs are biased were the result of artifacts, which are discussed. Recommendations for CAT termination are provided.
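The standard error termination rule discussed here is straightforward to sketch. The following is a minimal 2PL example, assuming maximum-information item selection, a grid-search ML trait estimate, and illustrative bank parameters; none of these choices are claimed to match the study's exact design.

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL response probability."""
    return 1 / (1 + np.exp(-a * (theta - b)))

def se_theta(theta, a, b):
    """SE of theta from Fisher information: SE = 1/sqrt(sum a_i^2 P_i (1-P_i))."""
    p = p2pl(theta, a, b)
    return 1 / np.sqrt(np.sum(a**2 * p * (1 - p)))

def mle_theta(resp, a, b):
    """Grid-search maximum-likelihood estimate of theta."""
    grid = np.linspace(-4, 4, 161)
    p = p2pl(grid[:, None], a[None, :], b[None, :])
    return grid[np.argmax((resp * np.log(p) + (1 - resp) * np.log(1 - p)).sum(axis=1))]

def run_cat(true_theta, bank_a, bank_b, se_target=0.3, max_items=50, seed=None):
    """Administer items by maximum information; stop when SE(theta) < se_target
    (the standard error termination rule) or max_items is reached."""
    rng = np.random.default_rng(seed)
    remaining = list(range(len(bank_a)))
    admin, resp, theta = [], [], 0.0
    while len(admin) < max_items:
        p_rem = p2pl(theta, bank_a[remaining], bank_b[remaining])
        info = bank_a[remaining]**2 * p_rem * (1 - p_rem)
        item = remaining.pop(int(np.argmax(info)))
        admin.append(item)
        resp.append(float(rng.random() < p2pl(true_theta, bank_a[item], bank_b[item])))
        theta = mle_theta(np.array(resp), bank_a[admin], bank_b[admin])
        if se_theta(theta, bank_a[admin], bank_b[admin]) < se_target:
            break
    return theta, len(admin)

rng = np.random.default_rng(0)
a_bank, b_bank = rng.uniform(0.8, 2.0, 300), rng.normal(size=300)
theta_hat, n_used = run_cat(true_theta=1.0, bank_a=a_bank, bank_b=b_bank, seed=1)
print(f"theta_hat = {theta_hat:.2f} after {n_used} items")
```

Tightening `se_target` forces more items to be administered, which is why the rule's behavior depends on how much information the bank can supply in a given θ region.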
This article examines a variety of reliability issues related to limited nomination sociometric measures. Peer nomination data were collected from 77 sixth-grade classrooms. Results showed that, although some single-item peer nomination measures were relatively reliable, many single-item measures using limited nominations were quite unreliable. Overt aggression nomination items were the only set of single-item measures whose mean classroom reliability estimates reached .75 or greater. Combining multiple items led to substantially better reliability: combining the two least reliable items in a category into a single measure made the composite more reliable than the most reliable single measure. Having more nominators in the sample also increased reliability. Overall, the limited nomination items tended to be less reliable than comparable unlimited nomination items from other studies. The authors end with recommendations for obtaining the most reliable peer nomination data possible.
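The gain from combining items can be illustrated with the classical Spearman-Brown prophecy formula, which predicts the reliability of a lengthened measure under the assumption of parallel items. The numeric values below are illustrative, not the study's estimates.

```python
def spearman_brown(r, k):
    """Predicted reliability of a measure lengthened by factor k,
    given reliability r of the original: r_k = k*r / (1 + (k-1)*r)."""
    return k * r / (1 + (k - 1) * r)

# Even combining two modest items can beat a stronger single item:
print(f"{spearman_brown(0.55, 2):.2f}")  # two r=.55 items -> composite ≈ .71
print(f"{spearman_brown(0.40, 2):.2f}")  # two r=.40 items -> composite ≈ .57
```

The abstract's composites combine two different items rather than strictly parallel ones, so the formula is only an approximation of the pattern reported, but it shows why doubling the number of items raises reliability faster than intuition might suggest.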
Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm that accepted or rejected new parameters in a bivariate fashion. Results showed that acceptable estimation of the noncompensatory model required a sample size of 4,000 people, six unidimensional items per dimension, and latent traits that are not highly correlated. Although the data requirements to estimate this model are a bit daunting, future advances in methodology could make this model valuable for modeling multidimensional data where the latent traits are not expected to be highly correlated.
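A sketch of the model and of a Metropolis-Hastings within Gibbs update may help. In the noncompensatory model, 2PL terms multiply across dimensions, so a low standing on one trait cannot be offset by a high standing on another. The sampler step below proposes an item's parameters jointly and accepts or rejects them together; it is a sketch under flat priors with assumed function names, not the authors' exact algorithm.

```python
import numpy as np

def p_noncomp(theta, a, b):
    """Noncompensatory 2-parameter MIRT: success probability is the product
    of 2PL terms across dimensions.
    theta: (n, D); a, b: (J, D) -> probabilities of shape (n, J)."""
    z = a[None, :, :] * (theta[:, None, :] - b[None, :, :])
    return (1 / (1 + np.exp(-z))).prod(axis=2)

def loglik_item(x_j, theta, a_j, b_j):
    """Log-likelihood of one item's responses x_j (shape (n,))."""
    p = np.clip(p_noncomp(theta, a_j[None, :], b_j[None, :])[:, 0], 1e-9, 1 - 1e-9)
    return np.sum(x_j * np.log(p) + (1 - x_j) * np.log(1 - p))

def mh_item_step(x, theta, a, b, j, rng, step=0.1):
    """One Metropolis-Hastings step for item j inside a Gibbs sweep:
    propose (a_j, b_j) jointly, accept or reject them as a block."""
    a_prop = np.abs(a[j] + rng.normal(scale=step, size=a.shape[1]))  # reflect at 0
    b_prop = b[j] + rng.normal(scale=step, size=b.shape[1])
    log_ratio = (loglik_item(x[:, j], theta, a_prop, b_prop)
                 - loglik_item(x[:, j], theta, a[j], b[j]))
    if np.log(rng.random()) < log_ratio:
        a[j], b[j] = a_prop, b_prop
```

A full Gibbs sweep would loop this step over all items and apply an analogous block update to each person's θ vector; repeating the sweep many times yields the posterior draws from which item and person parameters are estimated.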
Testing programs often rely on common-item equating to maintain a single measurement scale across multiple test administrations and multiple years. Changes over time in the item parameters and in the latent trait underlying the scale can lead to inaccurate score comparisons and misclassification of examinees. This study examined how instability in a scale and the items composing a scale affects item parameter recovery and classification accuracy. Results showed that a Rasch item response theory scale can maintain near baseline recovery properties if the changes in the latent trait over time are small. The Rasch scale also maintained good recovery of item and person parameters if there was equal item drift in both directions. Under conditions of relatively little item drift and small to moderate periodic changes in the latent trait, a Rasch scale may remain stable for roughly 15 (±3) years. Substantial item drift or large changes in the latent trait can dramatically reduce the longevity of the scale.
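The kind of drift scenario studied here is easy to simulate. Below is a minimal sketch with assumed scenario values (shift sizes, item counts, the crude difficulty estimator) that are illustrative only, not the study's design; it generates Rasch responses under a small yearly trait shift plus balanced item drift and checks how well baseline difficulties are recovered over time.

```python
import numpy as np

def rasch_responses(theta, b, rng):
    """Simulate 0/1 responses under the Rasch model P = 1/(1+exp(-(theta-b)))."""
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(int)

rng = np.random.default_rng(0)
n_persons, n_items, n_years = 2000, 40, 15
b_true = rng.normal(size=n_items)                # baseline item difficulties
trait_shift = 0.02                               # small yearly change in mean theta
item_drift = rng.choice([-0.02, 0.02], n_items)  # balanced drift in both directions

for year in range(0, n_years + 1, 5):
    theta = rng.normal(loc=year * trait_shift, size=n_persons)
    b_now = b_true + year * item_drift           # items drift linearly over time
    x = rasch_responses(theta, b_now, rng)
    # Crude difficulty recovery: invert the observed proportion correct.
    p_obs = x.mean(axis=0).clip(0.01, 0.99)
    b_hat = year * trait_shift - np.log(p_obs / (1 - p_obs))
    print(f"year {year:2d}: corr(b_hat, baseline b) = {np.corrcoef(b_hat, b_true)[0, 1]:.3f}")
```

With balanced drift the recovered difficulties stay highly correlated with the baseline scale for many simulated years; making `item_drift` one-directional or `trait_shift` large degrades recovery much faster, mirroring the abstract's conclusion.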