The reliability of six Minnesota Multiphasic Personality Inventory-Second edition (MMPI-2) computer-based test interpretation (CBTI) programs was evaluated across a set of 20 commonly appearing MMPI-2 profile codetypes in clinical settings. Evaluation of CBTI reliability comprised examination of (a) interrater reliability, the degree to which raters arrive at similar inferences based on the same CBTI profile and (b) interprogram reliability, the level of agreement across different CBTI systems. Profile inferences drawn by four raters were operationalized using q-sort methodology. Results revealed no significant differences overall with regard to interrater and interprogram reliability. Some specific CBTI/profile combinations (e.g., the CBTI by Automated Assessment Associates on a within normal limits profile) and specific profiles (e.g., the 4/9 profile displayed greater interprogram reliability than the 2/4 profile) were interpreted with variable consensus (α range = .21-.95). In practice, users should consider that certain MMPI-2 profiles are interpreted more or less consensually and that some CBTIs show variable reliability depending on the profile.
Reflecting the common use of the MMPI-2 to provide diagnostic considerations, computer-based test interpretations (CBTIs) also typically offer diagnostic suggestions. However, these diagnostic suggestions can sometimes be shown to vary widely across different CBTI programs even for identical MMPI-2 profiles. The present study evaluated the diagnostic reliability of 6 commercially available CBTIs using a 20-item Q-sort task developed for this study. Four raters each sorted diagnostic classifications based on these 6 CBTI reports for 20 MMPI-2 profiles. Two questions were addressed. First, do users of CBTIs understand the diagnostic information contained within the reports similarly? Overall, diagnostic sorts of the CBTIs showed moderate inter-interpreter diagnostic reliability (mean r = .56), with sorts for the 1/2/3 profile showing the highest inter-interpreter diagnostic reliability (mean r = .67). Second, do different CBTIs programs vary with respect to diagnostic suggestions? It was found that diagnostic sorts of the CBTIs had a mean inter-CBTI diagnostic reliability of r = .56, indicating moderate but not strong agreement across CBTIs in terms of diagnostic suggestions. The strongest inter-CBTI diagnostic agreement was found for sorts of the 1/2/3 profile CBTIs (mean r = .71). Limitations and future directions are discussed.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.