Multilevel structural equation modeling (multilevel SEM) has become an established method for analyzing multilevel multivariate data. The first practical estimation method was the pseudobalanced method, which is approximate because it assumes that all groups have the same size and ignores the imbalance when it exists. Full information maximum likelihood (ML) estimation is now available, often combined with robust chi-squares and standard errors to accommodate unmodeled heterogeneity (MLR). Diagonally weighted least squares (DWLS) estimation methods have also become available. This article compares the pseudobalanced method, ML(R), and two DWLS methods by simulating a multilevel factor model with unbalanced data. The simulations varied the sample sizes at the individual and group levels and the intraclass correlation (ICC). The within-group part of the model posed no problems. In the between-group part of the model, the different ICC sizes had no effect, but there is a clear interaction between the number of groups and the estimation method: ML reaches unbiasedness with the fewest groups, followed by the two DWLS methods, then MLR, and finally the pseudobalanced method (which needs more than 200 groups). We conclude that both ML(R) and DWLS are genuine improvements on the pseudobalanced approximation. With small sample sizes, the robust methods are not recommended.
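For context, the two-level factor model that such simulations typically generate data from can be written in a standard textbook form (this formulation is generic, not quoted from the article):

```latex
% Two-level decomposition: the observed vector for individual i in
% group j splits into independent between- and within-group parts.
\[
  \mathbf{y}_{ij} = \boldsymbol{\mu} + \mathbf{y}_{Bj} + \mathbf{y}_{Wij},
  \qquad
  \boldsymbol{\Sigma}_{\mathrm{total}} = \boldsymbol{\Sigma}_{B} + \boldsymbol{\Sigma}_{W},
\]
% with a factor structure at each level:
\[
  \boldsymbol{\Sigma}_{B} = \boldsymbol{\Lambda}_{B}\boldsymbol{\Psi}_{B}\boldsymbol{\Lambda}_{B}^{\top} + \boldsymbol{\Theta}_{B},
  \qquad
  \boldsymbol{\Sigma}_{W} = \boldsymbol{\Lambda}_{W}\boldsymbol{\Psi}_{W}\boldsymbol{\Lambda}_{W}^{\top} + \boldsymbol{\Theta}_{W}.
\]
% The intraclass correlation of variable k is the between-group share
% of its total variance:
\[
  \rho_{k} = \frac{(\boldsymbol{\Sigma}_{B})_{kk}}{(\boldsymbol{\Sigma}_{B})_{kk} + (\boldsymbol{\Sigma}_{W})_{kk}}.
\]
```

Larger ICCs place more of the variance at the group level; the between-group part is the one estimated from the (often small) number of groups, which is why that number drives the comparison of estimators.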
We investigate the relation between speed and accuracy in problem solving in its simplest non-trivial form. We consider tests with only two items and code the item responses in two binary variables: one indicating response accuracy and one indicating response speed. Despite its simplicity, this setup enables us to study item pairs stemming from a broad range of domains, such as basic arithmetic, first language learning, intelligence-related problems, and chess, with large numbers of observations for every pair of problems under consideration. We carry out a survey over a large number of such item pairs and compare three types of psychometric accuracy–response time models present in the literature: two ‘one-process’ models, the first of which models accuracy and response time as conditionally independent and the second of which models them as conditionally dependent, and a ‘two-process’ model, which models accuracy contingent on response time. We find that the data clearly violate the restrictions imposed by both one-process models and require additional complexity, which is parsimoniously provided by the two-process model. We supplement our survey with an analysis of the erroneous responses for an example item pair and demonstrate that there are marked differences between the types of errors in fast and slow responses.
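As a rough illustration of the kind of accuracy–speed association that separates these model classes, the sketch below tabulates binary accuracy against a binarized speed variable and tests the association with a chi-square. The data are simulated placeholders, not the survey data; the median split is one way to binarize speed (the paper's own coding may differ), and a real analysis would also condition on person and item effects.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

# Placeholder responses for a single item: 1 = correct, RT in seconds.
accuracy = rng.integers(0, 2, size=1000)
rt = rng.exponential(scale=3.0, size=1000)

# Binarize speed with a median split: 1 = fast, 0 = slow.
fast = (rt < np.median(rt)).astype(int)

# 2x2 table: rows = slow/fast, columns = incorrect/correct.
table = np.zeros((2, 2), dtype=int)
for f, a in zip(fast, accuracy):
    table[f, a] += 1

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```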
Recent advances in interpretable Machine Learning (iML) and eXplainable AI (XAI) construct explanations based on the importance of features in classification tasks. However, in a high-dimensional feature space this approach may become infeasible without restricting the set of important features. We propose to exploit the human tendency to ask questions like "Why this output (the fact) instead of that output (the foil)?" in order to reduce the number of features to those that play a main role in the contrast being asked about. Our proposed method uses locally trained one-versus-all decision trees to identify the disjoint set of rules that causes the tree to classify data points as the foil and not as the fact. We illustrate this approach on three benchmark classification tasks.
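A minimal sketch of the fact/foil idea under an sklearn-style workflow; the dataset, neighborhood size, and tree depth below are illustrative choices, not the authors' settings.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
query = X[0]            # the instance whose classification we explain
foil = 1                # "Why the predicted class and not class 1?"

# Local neighborhood: points closest to the query (illustrative rule).
dist = np.linalg.norm(X - query, axis=1)
local = np.argsort(dist)[:60]
X_loc = X[local]
y_loc = (y[local] == foil).astype(int)  # one-versus-all: foil vs. rest

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_loc, y_loc)

# Walk the tree's decision path for the query and print the feature
# tests along it; these form the contrastive rule for foil vs. fact.
node_ids = tree.decision_path(query.reshape(1, -1)).indices
for node in node_ids:
    feat = tree.tree_.feature[node]
    if feat >= 0:  # negative values mark leaf nodes
        op = "<=" if query[feat] <= tree.tree_.threshold[node] else ">"
        print(f"feature[{feat}] {op} {tree.tree_.threshold[node]:.2f}")
```

The point of the construction is that only the features appearing on this path need to be surfaced to the user, which is what keeps the explanation small in a high-dimensional feature space.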
The speed–accuracy trade-off (SAT) suggests that time constraints reduce response accuracy. Its relevance in observational settings, where response time (RT) may not be constrained but respondent speed may still vary, is unclear. Using 29 data sets from cognitive tasks, we apply a flexible method for identifying the SAT (tested in extensive simulation studies) to probe whether the SAT holds. We find inconsistent relationships between time and accuracy; marginal increases in time use for an individual do not necessarily predict increases in accuracy. Additionally, the speed–accuracy relationship may depend on the underlying difficulty of the respondent–item interaction. We also analyze items and individuals; of particular interest is the observation that respondents who exhibit more within-person variation in response speed are typically of lower ability. We further find that RT is typically a weak predictor of response accuracy. Our findings document a range of empirical phenomena that should inform future modeling of RTs collected in observational settings.
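A hedged sketch of one way to probe the within-person speed–accuracy relation (not the paper's identification method, which the abstract does not spell out): regress accuracy on person-centered log RT, so the slope reflects marginal changes in an individual's own time use. All data below are simulated placeholders.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_persons, n_items = 200, 30

# Simulated placeholder data: log response times and binary accuracy.
log_rt = rng.normal(loc=1.0, scale=0.5, size=(n_persons, n_items))
accuracy = rng.integers(0, 2, size=(n_persons, n_items))

# Center log-RT within each person: the coefficient then reflects
# whether an individual's slower-than-usual responses are more or
# less likely to be correct.
rt_centered = log_rt - log_rt.mean(axis=1, keepdims=True)

X = sm.add_constant(rt_centered.ravel())
fit = sm.Logit(accuracy.ravel(), X).fit(disp=0)
print(fit.params)  # [intercept, within-person log-RT slope]
```

On real data one would also include item effects (e.g., fixed effects for items) before interpreting the slope; the sketch omits them for brevity.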
With the advent of computers in education and the ample availability of online learning and practice environments, enormous amounts of data on learning have become available. The purpose of this paper is to present a decade of experience with analyzing and improving an online practice environment for math, which has thus far recorded over a billion responses. We present the methods we use to both steer and analyze this system in real time, using scoring rules on accuracy and response times, a tailored rating system that provides both learners and items with current ability and difficulty ratings, and an adaptive engine that matches learners to items (a minimal sketch of such a scoring and rating update follows the notes below). Moreover, we explore the quality of fit by means of prediction accuracy and parallel item reliability. Limitations and pitfalls are discussed by diagnosing sources of misfit, such as violations of unidimensionality and unforeseen dynamics. Finally, directions for development are discussed, including embedded learning analytics and a focus on online experimentation to evaluate both the system itself and the users' learning gains. Though many challenges remain open, we believe that large steps have been made in providing methods to efficiently manage and research educational big data from a massive online learning system.

Notes for Practice
• We analyzed an online adaptive practice environment for arithmetic, actively used by over 400,000 primary school children in the Netherlands.
• Adaptive practice is achieved by continuously tracking both student abilities and item difficulties and matching students to items.
• A unidimensional adaptive algorithm, employed separately within each domain (e.g., multiplication), takes care of tracking abilities and difficulties.
• We show that the obtained unidimensional ability and difficulty estimates are, to a large extent, reliable and accurate.
• Moreover, we explore the many sources of misfit, or violations of the unidimensionality assumption, including differences in response processes (fast and slow responders) and response strategies (erroneous strategies that work for clusters of items).
• To advance the field of learning analytics, we call for active analytics such as exemplified in this paper: learning analytics must actively help direct students towards their educational objectives by means of embedded analytics that not only analyze the student but also shape their learning path (such as the discussed adaptive algorithm) and include experiments that ensure changes to the system have the desired effect.
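The sketch below shows how a scoring rule on accuracy and response time can drive an Elo-style rating update of the kind the abstract describes. The specific rule (a score that scales with the time remaining before a deadline, signed by correctness) and all constants are illustrative assumptions, not the system's production implementation.

```python
import math

def score(correct: bool, rt: float, deadline: float) -> float:
    """Fast correct answers score near +1, fast errors near -1;
    responses at the deadline score 0 either way."""
    remaining = max(deadline - rt, 0.0) / deadline
    return (1.0 if correct else -1.0) * remaining

def expected_score(ability: float, difficulty: float) -> float:
    """Expected score from the current rating gap (logistic link,
    rescaled from a probability to the score range [-1, 1])."""
    p = 1.0 / (1.0 + math.exp(difficulty - ability))
    return 2.0 * p - 1.0

def update(ability: float, difficulty: float, correct: bool,
           rt: float, deadline: float, k: float = 0.05):
    """Elo-type update: both ratings move in proportion to the
    difference between the observed and the expected score."""
    surprise = score(correct, rt, deadline) - expected_score(ability, difficulty)
    return ability + k * surprise, difficulty - k * surprise

# Example: a learner (rating 0.0) answers an item (rating 0.3)
# correctly after 8 of the 20 allowed seconds.
theta, beta = update(0.0, 0.3, correct=True, rt=8.0, deadline=20.0)
print(round(theta, 3), round(beta, 3))
```

Because each update uses only the most recent response, ability and difficulty ratings can track learners and items in real time, which is what makes adaptive matching of students to items feasible at this scale.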