Eigenvalue analysis is an important aspect in many data modeling methods. Unfortunately, the eigenvalues of the sample covariance matrix (sample eigenvalues) are biased estimates of the eigenvalues of the covariance matrix of the data generating process (population eigenvalues). We present a new method based on bootstrapping to reduce the bias in the sample eigenvalues: the eigenvalue estimates are updated in several iterations, where in each iteration synthetic data is generated to determine how to update the population eigenvalue estimates. Comparison of the bootstrap eigenvalue correction with a state of the art correction method by Karoui shows that depending on the type of population eigenvalue distribution, sometimes the Karoui method performs better and sometimes our bootstrap method.
Verification decisions are often based on second order statistics estimated from a set of samples. Ongoing growth of computational resources allows for considering more and more features, increasing the dimensionality of the samples. If the dimensionality is of the same order as the number of samples used in the estimation or even higher, then the accuracy of the estimate decreases significantly. In particular, the eigenvalues of the covariance matrix are estimated with a bias and the estimate of the eigenvectors differ considerably from the real eigenvectors. We show how a classical approach of verification in high dimensions is severely affected by these problems, and we show how bias correction methods can reduce these problems.
Second-order statistics play an important role in data modeling. Nowadays, there is a tendency toward measuring more signals with higher resolution (e.g., high-resolution video), causing a rapid increase of dimensionality of the measured samples, while the number of samples remains more or less the same. As a result the eigenvalue estimates are significantly biased as described by the Marčenko Pastur equation for the limit of both the number of samples and their dimensionality going to infinity. By introducing a smoothness factor, we show that the Marčenko Pastur equation can be used in practical situations where both the number of samples and their dimensionality remain finite. Based on this result we derive methods, one already known and one new to our knowledge, to estimate the sample eigenvalues when the population eigenvalues are known. However, usually the sample eigenvalues are known and the population eigenvalues are required. We therefore applied one of the these methods in a feedback loop, resulting in an eigenvalue bias correction method. We compare this eigenvalue correction method with the state-of-the-art methods and show that our method outperforms other methods particularly in real-life situations often encountered in biometrics: underdetermined configurations, high-dimensional configurations, and configurations where the eigenvalues are exponentially distributed.
Eigenvalue estimation plays an important role in biometrics. However, if the number of samples is limited, estimates are significantly biased. In this article we analyse the influence of this bias on the error rates of PCA/LDA based verification systems, using both synthetic data with realistic parameters and real biometric data. Results of bias correction in the verification systems differ considerable between synthetic data and real data: while the bias is responsible for a large part of classification errors in the synthetic facial data, compensation of the bias in real facial data leads only to marginal improvements.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.