We illustrate the usefulness of person-fit methodology for personality assessment, using person-fit methods from item response theory. First, we give a nontechnical introduction to existing person-fit statistics. Second, we analyze data from the Self-Perception Profile for Children (Harter, 1985) in a sample of children ranging from 8 to 12 years of age (N = 611) and argue that for some children, the scale scores should be interpreted with caution. Combined information from person-fit indexes and from observation, interviews, and self-concept theory showed that similar score profiles may have different interpretations. For some children in the sample, item scores did not adequately reflect their trait level. Teacher interviews suggested that this was most likely due to a less developed self-concept and/or problems understanding the meaning of the questions. We recommend investigating the scalability of score patterns when using self-report inventories to help the researcher interpret respondents' behavior correctly.
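As a rough illustration of what a person-fit statistic measures, the sketch below counts Guttman errors per respondent. It is a simplified example for dichotomous (0/1) items and is not the statistic used in the study; the Self-Perception Profile for Children uses polytomous items, which would need a generalized version. The function name `guttman_errors` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def guttman_errors(data):
    """Count Guttman errors per respondent for 0/1 item scores.

    A Guttman error is a pair of items where the respondent fails
    the easier (more popular) item but passes the harder one.
    Illustrative sketch only: dichotomous items are assumed.
    """
    x = np.asarray(data)
    # Order items from easiest (highest proportion of 1s) to hardest.
    order = np.argsort(-x.mean(axis=0), kind="stable")
    x = x[:, order]
    n_items = x.shape[1]
    errors = np.zeros(x.shape[0], dtype=int)
    for i in range(n_items):
        for j in range(i + 1, n_items):
            # Item i is easier than item j: 0 on i and 1 on j is an error.
            errors += (x[:, i] == 0) & (x[:, j] == 1)
    return errors

# Hypothetical toy data: rows are respondents, columns are items.
toy = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],  # passes only the hardest item: an unexpected pattern
    [1, 1, 1, 1],
]
print(guttman_errors(toy))
```

Respondents with many errors relative to the rest of the sample have score patterns that the item ordering does not explain well, which is the kind of pattern the abstract suggests following up with other information such as interviews.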
In recent studies, different methods were proposed to investigate invariant item ordering (IIO), but practical IIO research is an unexploited field in questionnaire construction and evaluation. In the present study, the authors explored the usefulness of different IIO methods to analyze personality scales and clinical scales. From the authors' analyses, it was clear that for clinical scales consisting of items that cover a limited range of "symptoms," the IIO property is an unrealistic assumption. For scales that consist of items that cover a broader range of item severity, IIO research can provide useful information. However, removing an item because it violates the assumption of IIO may be problematic because it can affect the construct that is measured. Finally, the authors advise researchers to always use plots of item rest-score regressions to interpret IIO results.
Keywords: invariant item ordering, nonparametric item response theory, test construction
In psychological testing, it is often assumed that the item ordering according to severity (or mean score) established at the group level is the same for persons at different individual trait levels. For example, when measuring psychological distress, an item such as "auditory hallucinations" represents a much higher level of psychological
Purpose: Sample size in Mokken scaling has mostly been studied with simulated data, which is reflected in the lack of consideration given to sample size in most Mokken scaling studies. Recently, [Straat, J. H., van der Ark, L. A., & Sijtsma, K. (2014). Minimum sample size requirements for Mokken scale analysis. Educational and Psychological Measurement, 74, 809-822] provided minimum sample size requirements for Mokken scale analysis based on simulation. Our study uses real data from the Warwick-Edinburgh Mental Well-Being Scale (N = 8463) to assess whether these requirements hold. Methods: We use per element accuracy to evaluate the impact of sample size, together with scalability coefficients and confidence intervals around scale, item, and item-pair scalability coefficients. Results: Per element accuracy, scalability coefficients, and confidence intervals around scalability coefficients are sensitive to sample size. The results from Straat et al. were not replicated; depending on the main goal of the research, sample sizes ranging from > 250 to > 1000 are needed. Conclusions: Using our pragmatic approach, some practical recommendations are made regarding sample sizes for studies of Mokken scaling.
The authors investigated the psychometric properties of the subscales of the Self-Perception Profile for Children with item response theory (IRT) models using a sample of 611 children. Results from a nonparametric Mokken analysis and a parametric IRT approach for boys (n = 268) and girls (n = 343) were compared. The authors found that most scales formed weak scales and that measurement precision was relatively low and only present for latent trait values indicating low self-perception. The subscales Physical Appearance and Global Self-Worth formed one strong scale. Children seem to interpret Global Self-Worth items as if they measure Physical Appearance. Furthermore, the authors found that strong Mokken scales (such as Global Self-Worth) consisted mostly of items that repeat the same item content. They conclude that researchers should be very careful in interpreting the total scores on the different Self-Perception Profile for Children scales. Finally, implications for further research are discussed.
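For readers unfamiliar with the terms "weak" and "strong" Mokken scale, the following minimal sketch computes Loevinger's scalability coefficient H for a set of dichotomous items as the ratio of summed inter-item covariances to their summed maxima given the item marginals. It is an illustration under simplifying assumptions, not the authors' analysis: the items in the study are polytomous, and the function name `scale_h` and the toy data are hypothetical. By the commonly cited rule of thumb, H between 0.3 and 0.4 is called weak, between 0.4 and 0.5 medium, and 0.5 or higher strong.

```python
import numpy as np

def scale_h(data):
    """Loevinger's H for a set of 0/1 items (illustrative sketch).

    H = sum of observed inter-item covariances divided by the sum
    of the maximum covariances attainable given the item marginals.
    """
    x = np.asarray(data, dtype=float)
    p = x.mean(axis=0)  # proportion of 1s per item
    n_items = x.shape[1]
    observed = 0.0
    maximum = 0.0
    for i in range(n_items):
        for j in range(i + 1, n_items):
            observed += (x[:, i] * x[:, j]).mean() - p[i] * p[j]
            # With fixed marginals, P(both = 1) cannot exceed min(p_i, p_j).
            maximum += min(p[i], p[j]) - p[i] * p[j]
    return observed / maximum

# Hypothetical toy data forming a perfect Guttman pattern.
perfect = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(scale_h(perfect))
```

A perfect Guttman pattern contains no Guttman errors, so every observed covariance equals its maximum and H reaches its upper bound of 1; real questionnaire data fall below that.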