Supervised machine learning (ML) is becoming an influential analytical method in psychology and other social sciences. However, theoretical ML concepts and predictive-modeling techniques are not yet widely taught in psychology programs. This tutorial provides an intuitive but thorough introduction to supervised ML for psychologists in four consecutive modules. After introducing the basic terminology and mindset of supervised ML, Module 1 covers how to use resampling methods to evaluate the performance of ML models (bias-variance trade-off, performance measures, k-fold cross-validation). Module 2 introduces the nonlinear random forest, a type of ML model that is particularly user-friendly and well suited to predicting psychological outcomes. Module 3 is about performing empirical benchmark experiments (comparing the performance of several ML models on multiple data sets). Finally, Module 4 discusses the interpretation of ML models, including permutation variable importance measures, effect plots (partial-dependence plots, individual conditional-expectation profiles), and the concept of model fairness. Throughout the tutorial, intuitive descriptions of theoretical concepts are provided with as few mathematical formulas as possible, followed by code examples using the mlr3 and companion packages in R. Key practical analysis steps are demonstrated on the publicly available PhoneStudy data set (N = 624), which includes more than 1,800 smartphone-sensing variables used to predict Big Five personality trait scores. The article contains a checklist to serve as a reminder of important elements when performing, reporting, or reviewing ML analyses in psychology. Additional examples and more advanced concepts are demonstrated in the online materials (https://osf.io/9273g/).
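The resampling idea from Module 1 is language-agnostic: partition the data into k folds, train on k − 1 of them, score on the held-out fold, and average. The tutorial itself works in R with mlr3; the sketch below illustrates the same procedure in plain Python with a toy mean-predictor (all function names here are illustrative, not from the paper's materials).

```python
import random

def k_fold_cv(xs, ys, k, fit, predict, loss):
    """Estimate out-of-sample loss by k-fold cross-validation:
    shuffle, split into k folds, train on k-1 folds, evaluate on
    the held-out fold, and average the k loss estimates."""
    idx = list(range(len(xs)))
    random.Random(0).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [predict(model, xs[i]) for i in test_idx]
        scores.append(loss(preds, [ys[i] for i in test_idx]))
    return sum(scores) / k

# Toy example: a "mean predictor" whose CV error roughly estimates
# the variance of y (illustration only, not a real model).
fit = lambda X, y: sum(y) / len(y)
predict = lambda m, x: m
mse = lambda p, y: sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(y)

ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
xs = [0] * len(ys)
cv_error = k_fold_cv(xs, ys, k=3, fit=fit, predict=predict, loss=mse)
```

Because every observation is held out exactly once, the averaged score is an honest estimate of out-of-sample error, which is the core of the bias-variance discussion in Module 1.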
In their paper “Vintage factor analysis with varimax performs statistical inference”, Rohe and Zeng (R&Z; 2022) demonstrate the usefulness of principal component analysis with varimax rotation (PCA+VR), a combination they call vintage factor analysis. The authors show that PCA+VR can be used to estimate factor scores and factor loadings, provided a certain leptokurtic condition is fulfilled that can be assessed by simple visual diagnostics. As a side result, they also imply that PCA+VR can estimate factor scores even if the latent factors are correlated. In our commentary, we briefly discuss some implications of these results for psychological research and note that the suggested diagnostics of “radial streaks” might give less clear results in typical psychological applications. The commentary includes extensive electronic supplemental materials, containing a data example and a small simulation on estimating correlated factors, which can be found at https://osf.io/5symf/.
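The PCA+VR pipeline discussed above has two steps: extract principal-component loadings, then rotate them with varimax so that each column has few large and many near-zero loadings. The commentary's own materials are in R; the following is a minimal NumPy sketch of the standard SVD-based varimax iteration, using simulated data (all names and the simulated example are assumptions for illustration).

```python
import numpy as np

def varimax(L, n_iter=100, tol=1e-8):
    """Varimax-rotate a loading matrix L (p x k): iteratively choose
    an orthogonal rotation R maximizing the variance of the squared
    loadings within each column (standard SVD-based formulation)."""
    p, k = L.shape
    R = np.eye(k)
    d_old = 0.0
    for _ in range(n_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p)
        )
        R = u @ vt              # orthogonal by construction
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
        d_old = d
    return L @ R, R

# "Vintage factor analysis": PCA loadings followed by varimax rotation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X -= X.mean(axis=0)
u, s, vt = np.linalg.svd(X, full_matrices=False)
loadings = vt[:2].T * s[:2] / np.sqrt(X.shape[0])   # first 2 PCs
rotated, R = varimax(loadings)
```

On Gaussian data like this simulated example there is no leptokurtic structure, so no “radial streaks” would appear in a scatter plot of the rotated scores, which is precisely the diagnostic situation the commentary cautions about.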
Psychology has seen an increase in the use of machine learning (ML) methods. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that the costs of a misclassification (false positive or false negative) are equal. Because this is often not reasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can take different cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, before demonstrating their application and usefulness on psychological data, namely the drug consumption data set (N = 1,885) from the University of California Irvine ML Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss the necessity for researchers to perform small benchmarks of CSL methods for their own practical application. To this end, our open materials provide R code demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/).
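One of the simplest entries in any CSL taxonomy is threshold moving: with unequal costs, the Bayes-optimal rule predicts "positive" whenever the expected cost of a false negative, p · c_FN, exceeds that of a false positive, (1 − p) · c_FP, i.e., whenever p > c_FP / (c_FP + c_FN). The paper's code is in R/mlr3; this is a self-contained Python sketch of the idea with made-up toy numbers (not the drug consumption data).

```python
def cost_optimal_threshold(cost_fp, cost_fn):
    """Bayes-optimal threshold on p(y = 1 | x) under asymmetric costs:
    predict 'positive' iff p > cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)

def mean_misclassification_cost(probs, labels, cost_fp, cost_fn, threshold):
    """Average cost of thresholded predictions on labeled examples."""
    preds = [int(p > threshold) for p in probs]
    costs = [cost_fp if (yhat == 1 and y == 0) else
             cost_fn if (yhat == 0 and y == 1) else 0.0
             for yhat, y in zip(preds, labels)]
    return sum(costs) / len(costs)

# Toy example: false negatives 4x as costly as false positives,
# so the threshold drops from 0.5 to 0.2 and borderline cases
# are flagged positive.
probs  = [0.10, 0.30, 0.45, 0.60, 0.90]   # predicted p(y = 1 | x)
labels = [0,    1,    1,    0,    1]
t = cost_optimal_threshold(cost_fp=1.0, cost_fn=4.0)          # 0.2
naive = mean_misclassification_cost(probs, labels, 1.0, 4.0, 0.5)
tuned = mean_misclassification_cost(probs, labels, 1.0, 4.0, t)
```

In this toy setting the cost-sensitive threshold yields a lower mean misclassification cost than the default 0.5 cutoff, mirroring the qualitative result the abstract reports for the real data set.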
Psychological assessment often requires concrete decisions, e.g., whether a person is “below the norm” in some psychological domain. It is still common for practitioners to compare the test score directly with some theoretical norm value (e.g., one standard deviation below the mean). In a literature review, we show that all German textbooks on psychological assessment recommend taking the measurement uncertainty of psychological tests into account, for example by using critical differences, hypothesis tests, or confidence intervals. However, these recommendations resemble heuristics without a comprehensible rationale for choosing the necessary parameters (e.g., the appropriate significance or confidence level). Statistical decision theory is a mathematical framework for making rational decisions. Although once en vogue in psychology (cf. Cronbach & Gleser, 1965), it receives little attention today. Viewed from a decision-theoretic perspective, the implicit assumptions of current decision heuristics can be made explicit. For example, using two-sided hypothesis tests and confidence intervals with significance level alpha = 0.05 implies that type I errors are 39 times as costly as type II errors. In this paper, we give a short introduction to decision theory and use this framework to discuss the implications of current assessment practices. We also present a small survey of clinical neuropsychologists, who reported different representations of their internal cost ratios for a fictitious assessment scenario. Although the practitioners’ cost ratios varied, the majority chose less extreme ratios than the common heuristics would imply. We argue that psychological assessment would benefit from explicitly considering decision-theoretic implications in practice, and we outline possible future directions. (A German version of this abstract has been omitted here as a duplicate.)
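The 39:1 figure quoted above can be reproduced under a simple threshold interpretation: a two-sided test at alpha = 0.05 only "acts" when the per-side error probability falls below alpha/2 = 0.025, and a decision rule with error threshold a is cost-optimal exactly when type I errors are (1 − a)/a times as costly as type II errors. Whether the paper derives the number this way is an assumption; the arithmetic below merely checks that this reading yields 39.

```python
def implied_cost_ratio(alpha, two_sided=True):
    """Implicit type I : type II cost ratio of a hypothesis-test
    decision rule with significance level alpha. Under the simple
    threshold interpretation (assumed here), a rule acting only when
    the per-side error probability drops below a is optimal iff
    type I errors are (1 - a) / a times as costly as type II errors."""
    a = alpha / 2 if two_sided else alpha
    return (1 - a) / a

ratio = implied_cost_ratio(0.05)   # (1 - 0.025) / 0.025 = 39
```

Read the other way around, a practitioner who judges the two error types roughly equally costly would be better served by a much more liberal criterion than the conventional alpha = 0.05, which is the tension between surveyed cost ratios and common heuristics that the paper highlights.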
Longitudinal panels include several thousand participants and variables. Traditionally, psychologists analyze only a few variables, partly because common unregularized linear models perform poorly when the number of variables (p) approaches the number of observations (N). Predictive-modeling methods can be used when such N ≈ p situations arise in psychological research. We illustrate these techniques on exemplary variables from the German GESIS Panel, describing the choice of preprocessing, model classes, resampling techniques, hyperparameter tuning, and performance measures. In analyses with about 2,000 subjects and variables each, we predict panelists’ gender, sick days, an evaluation of President Trump, income, life satisfaction, and sleep satisfaction. Elastic net and random forest models were compared to dummy predictions in benchmark experiments. While good performance was achieved, the linear elastic net performed similarly to the nonlinear random forest. Elastic nets were refitted to extract the ten most important predictors. Their interpretation validates our approach, and further modeling options are discussed. Code is available at https://osf.io/zpse3/.
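The benchmark logic described above, comparing a regularized model against a dummy baseline on held-out data, can be sketched compactly. The paper fits elastic nets in R; the sketch below instead uses ridge regression (the L2-only relative of the elastic net, which has a closed form) on simulated data with many predictors, and keeps p < N so the closed-form solve stays stable. All numbers and names here are illustrative assumptions, not the GESIS analyses.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 120, 50                        # many predictors relative to N
w_true = np.zeros(p)
w_true[:5] = 1.0                      # only 5 predictors carry signal
X = rng.normal(size=(n, p))
y = X @ w_true + rng.normal(scale=0.5, size=n)

X_tr, y_tr = X[:80], y[:80]           # simple holdout split
X_te, y_te = X[80:], y[80:]

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# Dummy baseline: always predict the training mean.
dummy_rmse = rmse(np.full_like(y_te, y_tr.mean()), y_te)

# Ridge regression, closed form: w = (X'X + lam*I)^-1 X'y.
lam = 1.0
w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ y_tr)
ridge_rmse = rmse(X_te @ w, y_te)
```

A real benchmark would replace the single holdout split with repeated cross-validation and tune lam on the training folds only; sorting the fitted coefficients by absolute size mimics the paper's step of extracting the most important predictors from refitted models.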