A framework is developed for inference concerning the covariance operator of a functional random process, where the covariance operator itself is an object of interest for statistical analysis. Distances for comparing positive-definite covariance matrices are either extended or shown to be inapplicable to functional data. In particular, an infinite-dimensional analogue of the Procrustes size-and-shape distance is developed. Convergence of finite-dimensional approximations to the infinite-dimensional distance metrics is also shown. For inference, a Fréchet estimator of both the covariance operator itself and the average covariance operator is introduced. A permutation procedure to test the equality of the covariance operators between two groups is also considered. Additionally, the use of such distances for extrapolation to make predictions is explored. As an example of the proposed methodology, the use of covariance operators has been suggested in a philological study of cross-linguistic dependence as a way to incorporate quantitative phonetic information. It is shown that distances between languages derived from phonetic covariance functions can provide insight into the relationships between the Romance languages.
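The Procrustes size-and-shape distance and the two-sample permutation test mentioned above can be illustrated on finite-dimensional approximations, that is, on curves sampled on a common grid so that covariance operators become matrices. The following is a minimal sketch under that assumption, not the authors' implementation. The function names, the use of symmetric square roots, and the default of 199 permutations are illustrative choices. The distance uses the standard closed form for the minimum over orthogonal matrices R of ||L1 - L2 R||, where Li is a square root of Si.

```python
import numpy as np

def sqrt_psd(S):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T

def procrustes_distance(S1, S2):
    """min over orthogonal R of ||L1 - L2 R||_F, with Li = Si^(1/2)."""
    L1, L2 = sqrt_psd(S1), sqrt_psd(S2)
    # The optimal rotation gives tr(R^T L2^T L1) = sum of singular values.
    sv = np.linalg.svd(L2.T @ L1, compute_uv=False)
    d2 = np.trace(S1) + np.trace(S2) - 2.0 * sv.sum()
    return float(np.sqrt(max(d2, 0.0)))

def sample_cov(X):
    """Sample covariance of curves stored as rows of X (n curves, p grid points)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / (X.shape[0] - 1)

def permutation_test(X1, X2, n_perm=199, seed=0):
    """Permutation p-value for equality of the two group covariances."""
    rng = np.random.default_rng(seed)
    # Centre each group first so that only second-order structure can differ.
    Z = np.vstack([X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)])
    n1 = X1.shape[0]
    observed = procrustes_distance(sample_cov(Z[:n1]), sample_cov(Z[n1:]))
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(Z.shape[0])
        d = procrustes_distance(sample_cov(Z[idx[:n1]]), sample_cov(Z[idx[n1:]]))
        exceed += int(d >= observed)
    return observed, (1 + exceed) / (n_perm + 1)
```

Calling `permutation_test(X1, X2)` on two arrays of sampled curves returns the observed distance and a permutation p-value; a small p-value suggests that the two groups have different covariance structure.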
Summary. Cardiovascular ischaemic diseases are one of the main causes of death all over the world. In this class of pathologies, a quick diagnosis is essential for a good prognosis in reperfusive treatment. In particular, an automatic classification procedure based on statistical analysis of teletransmitted electrocardiograph ('ECG') traces would be very helpful for an early diagnosis. This work presents an analysis of ECG traces, either physiological or pathological, of patients whose 12-lead prehospital ECG has been sent to the 118 Dispatch Center in Milan by life-support personnel. The statistical analysis starts with a preprocessing step, where functional data are reconstructed from noisy observations and biological variability is removed by a non-linear registration procedure. Then, a multivariate functional k-means clustering procedure is carried out on reconstructed and registered ECGs and their first derivatives. Hence, a new semi-automatic diagnostic procedure, based solely on the ECG morphology, is proposed to classify ECG traces; finally, the performance of this classification method is evaluated.
Abstract. The assumption of separability of the covariance operator for a random image or hypersurface can be of substantial use in applications, especially in situations where the accurate estimation of the full covariance structure is unfeasible, either for computational reasons or due to a small sample size. However, inferential tools to verify this assumption are somewhat lacking in high-dimensional or functional data analysis settings, where this assumption is most relevant. We propose here to test separability by focusing on K-dimensional projections of the difference between the covariance operator and a nonparametric separable approximation. The subspace we project onto is one generated by the eigenfunctions of the covariance operator estimated under the separability hypothesis, negating the need to ever estimate the full non-separable covariance. We show that the rescaled difference of the sample covariance operator with its separable approximation is asymptotically Gaussian. As a by-product of this result, we derive asymptotically pivotal tests under Gaussian assumptions, and propose bootstrap methods for approximating the distribution of the test statistics. We probe the finite sample performance through simulation studies, and present an application to log-spectrogram images from a phonetic linguistics dataset.
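To make the idea of a nonparametric separable approximation concrete, the sketch below builds one from marginal covariances for surfaces sampled on a p-by-q grid. It assumes the approximation is the product of the two partial traces of the sample covariance divided by its trace, which is one common choice and is not confirmed by the abstract. The function names are hypothetical. For illustration only, `separability_gap` forms the full covariance explicitly, which the projection-based test described above is designed to avoid.

```python
import numpy as np

def separable_parts(X):
    """Marginal covariances of X with shape (n, p, q): n surfaces on a p-by-q grid."""
    Xc = X - X.mean(axis=0)
    n = X.shape[0]
    C1 = np.einsum('ist,iut->su', Xc, Xc) / n  # partial trace over the second dimension
    C2 = np.einsum('ist,isv->tv', Xc, Xc) / n  # partial trace over the first dimension
    total = np.einsum('ist,ist->', Xc, Xc) / n  # trace of the full covariance
    return C1, C2, total

def separability_gap(X):
    """Frobenius norm of (full covariance - separable approximation)."""
    Xc = X - X.mean(axis=0)
    n = X.shape[0]
    C1, C2, total = separable_parts(X)
    # full[s, t, u, v] is the sample covariance between X[s, t] and X[u, v].
    full = np.einsum('ist,iuv->stuv', Xc, Xc) / n
    # Assumed separable approximation: C1[s, u] * C2[t, v] / trace.
    sep = np.einsum('su,tv->stuv', C1, C2) / total
    return float(np.linalg.norm(full - sep))
```

The gap is zero (up to rounding) when the sample covariance factorises exactly, and positive otherwise; the test statistics in the abstract are based on projections of this difference rather than on its full norm.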
Dialect variation is of considerable interest in linguistics and other social sciences. However, traditionally it has been studied using proxies (transcriptions) rather than acoustic recordings directly. We introduce novel statistical techniques to analyse geolocalised speech recordings and to explore the spatial variation of pronunciations continuously over the region of interest, as opposed to traditional isoglosses, which provide a discrete partition of the region. Data of this type require an explicit modeling of the variation in the mean and the covariance. Usual Euclidean metrics are not appropriate, and we therefore introduce the concept of d-covariance, which allows consistent estimation both in space and at individual locations. We then propose spatial smoothing for these objects which accounts for the possibly non-convex geometry of the domain of interest. We apply the proposed method to data from the spoken part of the British National Corpus, deposited at the British Library, London, and we produce maps of the dialect variation over Great Britain. In addition, the methods allow for acoustic reconstruction across the domain of interest, allowing researchers to listen to the statistical analysis.
The statistical analysis of data belonging to Riemannian manifolds is becoming increasingly important in many applications, such as shape analysis, diffusion tensor imaging and the analysis of covariance matrices. In many cases, data are spatially distributed, but it is not trivial to take into account spatial dependence in the analysis because of the non-linear geometry of the manifold. This work proposes a solution to the problem of spatial prediction for manifold-valued data, with a particular focus on the case of positive definite symmetric matrices. Under the hypothesis that the dispersion of the observations on the manifold is not too large, data can be projected on a suitably chosen tangent space, where an additive model can be used to describe the relationship between response variable and covariates. Thus, we generalize classical kriging prediction, dealing with the spatial dependence in this tangent space, where well-established Euclidean methods can be used. The proposed kriging prediction is applied to the matrix field of covariances between temperature and precipitation in Quebec, Canada.
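The projection onto a tangent space can be sketched for symmetric positive definite matrices as follows. This example assumes the affine-invariant Riemannian geometry, which is only one possible choice and is not specified in the abstract. The function names are hypothetical, and the kriging step itself is not shown. The logarithmic map sends a matrix S to the tangent space at a base point P, where Euclidean methods apply; the exponential map sends a tangent-space prediction back to the manifold.

```python
import numpy as np

def _apply_to_eigenvalues(S, f):
    """Apply a scalar function f to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(S)
    return (V * f(w)) @ V.T

def _halves(P):
    """Return P^(1/2) and P^(-1/2) for a symmetric positive definite P."""
    P_half = _apply_to_eigenvalues(P, np.sqrt)
    P_inv_half = _apply_to_eigenvalues(P, lambda w: 1.0 / np.sqrt(w))
    return P_half, P_inv_half

def log_map(P, S):
    """Tangent vector at P pointing to S (assumed affine-invariant metric)."""
    P_half, P_inv_half = _halves(P)
    M = P_inv_half @ S @ P_inv_half
    M = (M + M.T) / 2.0  # remove rounding asymmetry before the eigendecomposition
    return P_half @ _apply_to_eigenvalues(M, np.log) @ P_half

def exp_map(P, V):
    """Map a symmetric tangent vector V at P back to a positive definite matrix."""
    P_half, P_inv_half = _halves(P)
    M = P_inv_half @ V @ P_inv_half
    M = (M + M.T) / 2.0
    return P_half @ _apply_to_eigenvalues(M, np.exp) @ P_half
```

In a workflow of this kind, each observed matrix would be mapped with `log_map` at a common base point, the resulting symmetric matrices modelled and predicted with Euclidean tools, and the prediction returned to the manifold with `exp_map`.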
Summary. The historical and geographical spread from older to more modern languages has long been studied by examining textual changes and in terms of changes in phonetic transcriptions. However, it is more difficult to analyse language change from an acoustic point of view, although this is usually the dominant mode of transmission. We propose a novel analysis approach for acoustic phonetic data, where the aim will be to model the acoustic properties of spoken words statistically. We explore phonetic variation and change by using a time–frequency representation, namely the log-spectrograms of speech recordings. We identify time and frequency covariance functions as a feature of the language; in contrast, mean spectrograms depend mostly on the particular word that has been uttered. We build models for the mean and covariances (taking into account the restrictions placed on the statistical analysis of such objects) and use these to define a phonetic transformation that models how an individual speaker would sound in a different language, allowing the exploration of phonetic differences between languages. Finally, we map back these transformations to the domain of sound recordings, enabling us to listen to the output of the statistical analysis. The approach proposed is demonstrated by using recordings of the words corresponding to the numbers from 1 to 10 as pronounced by speakers from five different Romance languages.
In this paper, we generalize the metric-based permutation test for the equality of covariance operators proposed by Pigoli et al. (2014) to the case of multiple samples of functional data. To this end, the non-parametric combination methodology of Pesarin and Salmaso (2010) is used to combine all the pairwise comparisons between samples into a global test. Different combining functions and permutation strategies are reviewed and analyzed in detail. The resulting test allows inference on the equality of the covariance operators of multiple groups and, if there is evidence to reject the null hypothesis, identification of the pairs of groups having different covariances. It is shown that, for some combining functions, step-down adjusting procedures are available to control for the multiple testing problem in this setting. The empirical power of this new test is then explored via simulations and compared with that of existing alternative approaches in different scenarios. Finally, the proposed methodology is applied to data from wheel-running activity experiments that used selective breeding to study the evolution of locomotor behavior in mice.
Most recently, much attention has been devoted to inferential procedures for covariance operators of functional data. Panaretos et al. (2010) examined the testing of equality of covariance structures from two groups of functional curves generated from Gaussian processes, and Fremdt et al. (2013) extended their approach to the case of non-Gaussian data. Both methods make use of test statistics based on the Karhunen-Loève expansions of the covariance operators, thus exploiting the embedding of the space of covariance operators in the space of Hilbert-Schmidt operators, which is the infinite-dimensional equivalent of embedding covariance matrices in the space of symmetric matrices. However, Pigoli et al. (2014) show that better results can be achieved by using metrics that take into account the non-Euclidean geometry of the space of covariance operators. The drawback is that explicit analytic distributions are not available for the test statistics based on these metrics, and therefore the authors proposed to use a permutation approach to carry out the test. The aim of this work is to extend this idea to the case of multiple samples of functional data. The testing of equality of several covariance operators was first considered by Boente et al. (2014), who, in order to improve asymptotic approximations, proposed applying a bootstrap procedure to calibrate the critical values of the test statistic obtained from the Hilbert-Schmidt norm of the differences between sample covariance operators. Paparoditis and Sapatinas (2016) then investigated the properties of an empirical bootstrap methodology, applicable to more than two populations, but its consistency has been proven only for test statistics based on the Hilbert-Schmidt norms and on the Karhunen-Loève expansions of the covariance operators. More recently, Kashlak et al. (2016) applied concentration inequalities to the a...