We propose a method for high-dimensional curve clustering in the presence of interindividual variability. Curve clustering has long been studied, especially using splines to account for functional random effects. However, splines are not appropriate for high-dimensional data and cannot be used to model irregular curves such as peak-like data. Our method is based on a wavelet decomposition of the signal for both fixed and random effects. We propose an efficient dimension reduction step based on wavelet thresholding adapted to multiple curves. Using an appropriate structure for the random effect variance, we ensure that both fixed and random effects lie in the same functional space, even when dealing with irregular functions that belong to Besov spaces. In the wavelet domain our model reduces to a linear mixed-effects model that can be used for a model-based clustering algorithm, and for which we develop an EM algorithm for maximum likelihood estimation. The properties of the overall procedure are validated by an extensive simulation study. We then illustrate our method on mass spectrometry data and propose an original application of functional data analysis to microarray comparative genomic hybridization (CGH) data. Our procedure is available through the R package curvclust (available on CRAN), which is the first publicly available package that performs curve clustering with random effects in the high-dimensional framework.
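The wavelet thresholding idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure, which is adapted to multiple curves and implemented in the R package curvclust. It assumes a single curve of dyadic length, a Haar basis, and the generic universal soft threshold with a median-based noise estimate; the function names `haar_dwt`, `haar_idwt`, `soft_threshold` and `denoise` are hypothetical.

```python
import numpy as np


def haar_dwt(signal):
    """Full orthonormal Haar decomposition of a signal of length 2^J.

    Returns the coarsest approximation coefficient (array of size 1) and a
    list of detail-coefficient arrays ordered from finest to coarsest.
    """
    approx = np.asarray(signal, dtype=float)
    n = approx.size
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details


def haar_idwt(approx, details):
    """Invert haar_dwt: rebuild the signal from its coefficients."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):  # coarsest level first
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx


def soft_threshold(coeffs, t):
    """Shrink coefficients toward zero by t; values within [-t, t] become 0."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)


def denoise(signal):
    """Single-curve denoising with the universal threshold (an assumption,
    not the multi-curve thresholding rule described in the abstract)."""
    approx, details = haar_dwt(signal)
    n = len(signal)
    # Noise scale estimated from the finest-level details (median absolute deviation).
    sigma = np.median(np.abs(details[0])) / 0.6745
    t = sigma * np.sqrt(2.0 * np.log(n))
    return haar_idwt(approx, [soft_threshold(d, t) for d in details])
```

Coefficients set to zero by the threshold can be dropped, which is the sense in which thresholding acts as a dimension reduction step before any model is fitted in the coefficient domain.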
Current theoretical models and empirical research suggest that sensorimotor control and feedback processes may guide time perception and production. In the current study, we investigated the role of motor control and auditory feedback in an interval-production task performed under heightened cognitive load. We hypothesized that general associative learning mechanisms enable the calibration of time against patterns of dynamic change in motor control processes and auditory feedback information. In Experiment 1, we applied a dual-task interference paradigm consisting of a finger-tapping (continuation) task in combination with a working memory task. Participants (nonmusicians) had to either perform or avoid arm movements between successive key presses (continuous vs. discrete). Auditory feedback from a key press (a piano tone) filled either the complete duration of the target interval or only a small part (long vs. short). Results suggested that both continuous movement control and long piano feedback tones contributed to regular timing production. In Experiment 2, we gradually adjusted the duration of the long auditory feedback tones throughout the duration of a trial. The results showed that a gradual shortening of tones throughout time increased the rate at which participants performed tone onsets. Overall, our findings suggest that the human perceptual-motor system may be important in guiding temporal behavior under cognitive load.
We investigate the problem of estimating the baseline signal from multisample noisy curves. We consider the functional mixed-effects model and suppose that the functional fixed effect belongs to a Besov class. This framework allows us to model curves that exhibit strong irregularities, such as peaks or jumps. We provide a lower bound for the L² minimax risk, as well as an upper bound on the minimax rate, which is derived by constructing a wavelet estimator for the functional fixed effect. Our work provides the first theoretical functional results in multisample nonparametric regression. Our approach is illustrated on realistic simulated datasets as well as on experimental data.
In this paper, a suitable and interpretable statistical diagnosis model is proposed to predict Non-Alcoholic Steatohepatitis (NASH) from near-infrared spectrometry data. In this disease, unknown patient profiles are expected to lead to different diagnoses. The model therefore has to take into account both the heterogeneity of the data and the dimension of the spectrometric data. To this end, we propose to fit a mixture on the joint distribution of the binary diagnosis variable and the covariates selected in the spectra. Because of the high dimension of the data, a penalized maximum likelihood estimator is considered. In practice, a twofold penalty is imposed on both the regression coefficients and the covariance parameters. Automatic selection criteria such as the AIC and BIC are used to select the amount of shrinkage and the number of clusters. The performance of the overall procedure is evaluated through a simulation study, and its application to the NASH data set is analysed. The model achieves higher prediction performance than competing methods and provides highly interpretable results.
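The role of the AIC and BIC in choosing the number of clusters can be sketched as follows. This is a generic illustration of the two criteria, not the paper's implementation: the helper names are hypothetical, the log-likelihoods and parameter counts in the usage note are made-up placeholder values, and in a penalized fit the parameter count would typically be replaced by some effective number of nonzero parameters.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k log n - 2 log L (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


def select_model(candidates, n_obs, criterion="bic"):
    """Return the label of the candidate with the smallest criterion value.

    candidates maps a label (e.g. a number of clusters) to a pair
    (maximized log-likelihood, number of free parameters).
    """
    if criterion not in ("aic", "bic"):
        raise ValueError("criterion must be 'aic' or 'bic'")

    def score(item):
        log_lik, n_params = item[1]
        if criterion == "bic":
            return bic(log_lik, n_params, n_obs)
        return aic(log_lik, n_params)

    return min(candidates.items(), key=score)[0]
```

For example, with the placeholder fits `{1: (-520.0, 5), 2: (-480.0, 11), 3: (-473.0, 17)}` and 200 observations, the BIC prefers two clusters while the AIC prefers three, because the BIC penalizes each extra parameter by log(200) rather than 2.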