Partial Least Squares (PLS) is a standard statistical method in chemometrics. It can be considered an incomplete, or "partial", version of the Least Squares estimator of regression, applicable when high or perfect multicollinearity is present in the predictor variables. The Least Squares estimator is well known to be an optimal estimator for regression, but only when the error terms are normally distributed. In the absence of normality, and in particular when outliers are in the data set, other, more robust regression estimators have better properties. In this paper a "partial" version of M-regression estimators is defined. If an appropriate weighting scheme is chosen, partial M-estimators become entirely robust to any type of outlying points. It is shown that robust M-regression outperforms existing methods for robust PLS regression in terms of statistical precision and computational speed, while keeping the robustness properties. The method is applied to a data set consisting of EPXMA spectra of archaeological glass vessels. This data set contains several outliers, and the advantages of Partial Robust M-regression are illustrated. Applying Partial Robust M-regression yields much smaller prediction errors for noisy calibration samples than PLS. On the other hand, if the data follow a normal model perfectly well, the loss in efficiency to be paid is very small.
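The idea of a "partial" M-estimator can be illustrated with a minimal sketch: a one-component PLS fit is repeated on case-weighted data, with weights recomputed from the residuals at each step. This is not the authors' algorithm. The function names, the Fair-type weight function with tuning constant c = 4, the MAD scale estimate, and the toy data are all assumptions made for illustration, and the leverage weights that the abstract's "appropriate weighting scheme" would also require are omitted.

```python
from statistics import median


def fair_weight(r, scale, c=4.0):
    # Fair-type weight (assumed choice): close to 1 for small residuals,
    # decreasing towards 0 for large ones.
    return 1.0 / (1.0 + abs(r / (c * scale))) ** 2


def weighted_pls1_one_component(X, y, w):
    """One-component PLS1 fit on case-weighted data; returns (intercept, coefs)."""
    n, p = len(X), len(X[0])
    sw = sum(w)
    x_mean = [sum(w[i] * X[i][j] for i in range(n)) / sw for j in range(p)]
    y_mean = sum(w[i] * y[i] for i in range(n)) / sw
    Xc = [[X[i][j] - x_mean[j] for j in range(p)] for i in range(n)]
    yc = [y[i] - y_mean for i in range(n)]
    # PLS direction: weighted covariance of each predictor with y, normalized.
    a = [sum(w[i] * Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in a) ** 0.5
    a = [v / norm for v in a]
    # Scores along that direction, then weighted regression of y on the scores.
    t = [sum(Xc[i][j] * a[j] for j in range(p)) for i in range(n)]
    q = (sum(w[i] * t[i] * yc[i] for i in range(n))
         / sum(w[i] * t[i] * t[i] for i in range(n)))
    coefs = [q * v for v in a]
    intercept = y_mean - sum(coefs[j] * x_mean[j] for j in range(p))
    return intercept, coefs


def partial_m_sketch(X, y, n_iter=20):
    """Iteratively reweighted one-component PLS (residual weights only)."""
    w = [1.0] * len(y)
    for _ in range(n_iter):
        b0, b = weighted_pls1_one_component(X, y, w)
        res = [y[i] - b0 - sum(bj * xj for bj, xj in zip(b, X[i]))
               for i in range(len(y))]
        scale = 1.4826 * median(abs(r) for r in res)  # MAD-based scale
        if scale == 0:
            break
        w = [fair_weight(r, scale) for r in res]
    return b0, b


def predict(b0, b, row):
    return b0 + sum(bj * xj for bj, xj in zip(b, row))


# Toy data: perfectly collinear predictors (second column = 2 * first),
# y = 3 * x1, with one gross outlier in the response.
X = [[float(x), 2.0 * x] for x in range(1, 11)]
y = [3.0 * x for x in range(1, 11)]
y[4] = 100.0

classical = weighted_pls1_one_component(X, y, [1.0] * len(y))
robust = partial_m_sketch(X, y)
print("classical prediction at x1=10:", predict(*classical, X[9]))
print("robust prediction at x1=10:   ", predict(*robust, X[9]))
```

On this toy data the reweighted fit should land much closer to the clean value of 30 than the unweighted one, because the outlying case is progressively downweighted. Ordinary least squares would not be computable here at all, since the two predictors are perfectly collinear.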
portant analytical technique. Meanwhile, AIPA, in conjunction with statistical analysis programs for data handling and interpretation, is proving to be a powerful technique for studying atmospheric particle chemistry.
ACKNOWLEDGMENT
We thank Michael Palma for his help with sample preparation.
The spatial sign is a multivariate extension of the concept of sign. Recently, multivariate estimators of covariance structures based on spatial signs have been examined by various authors. These new estimators are found to be robust to outlying observations. From a computational point of view, estimators based on spatial signs are very easy to implement, as they boil down to a transformation of the data to their spatial signs, from which the classical estimator is then computed. Hence, one can also consider the transformation to spatial signs to be a preprocessing technique, which ensures that the calibration procedure as a whole is robust. In this paper, we examine the special case of spatial sign preprocessing in combination with partial least squares regression, as the latter technique is frequently applied in the context of chemical data analysis. In a simulation study, we compare the performance of the spatial sign transformation to nontransformed data as well as to two robust counterparts of partial least squares regression. It turns out that the spatial sign transform is fairly efficient but has some undesirable bias properties. The method is applied to a recently published data set in the field of quantitative structure-activity relationships, where it is seen to perform as well as the previously described best linear model for these data.
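The preprocessing step described above can be sketched in a few lines: each observation is centered and then projected onto the unit sphere, so that only its direction from the center is kept. The function name and the use of the coordinate-wise median as the center are assumptions made for illustration; the abstract does not say which location estimate is used.

```python
import math
from statistics import median


def spatial_sign(rows, center=None):
    """Map each row x to (x - center) / ||x - center||, or to 0 if x == center."""
    if center is None:
        # Assumption: coordinate-wise median as a simple robust center.
        center = [median(col) for col in zip(*rows)]
    signs = []
    for row in rows:
        diff = [x - c for x, c in zip(row, center)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0:
            signs.append([d / norm for d in diff])
        else:
            signs.append([0.0] * len(diff))
    return signs


# The last row is a gross outlier; after the transform it has unit length
# like every other row, so it can no longer dominate a classical estimator.
data = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [100.0, -50.0]]
signs = spatial_sign(data)
for row in signs:
    print(row)
```

The transformed rows would then be passed to an ordinary PLS routine in place of the raw data. Because all distance information is discarded, some loss of information on clean data is to be expected, which is consistent with the bias noted in the abstract.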
Several algorithms to calculate the vector of regression coefficients and the Jacobian matrix for partial least squares regression have been published. Whereas many efficient algorithms to calculate the regression coefficients exist, algorithms to calculate the Jacobian matrix are inefficient. Here we introduce a new, efficient algorithm for the Jacobian matrix, thus making the calculation of prediction intervals via a local linearization of the PLS estimator more practicable.
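To make the object in question concrete, the sketch below approximates a Jacobian of the PLS coefficient vector by central finite differences. It assumes the Jacobian is taken with respect to the response vector y and uses a one-component PLS1 fit. Both are illustrative assumptions, as are the function names and the toy data. This brute-force approach needs two refits per observation, so it is an example of the inefficient kind of calculation, not the algorithm the abstract introduces.

```python
def pls1_coefs(X, y):
    """Coefficient vector of a one-component PLS1 fit (centered X and y)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: normalized covariance of each predictor with y.
    a = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in a) ** 0.5
    a = [v / norm for v in a]
    # Scores, then regression of y on the scores.
    t = [sum(Xc[i][j] * a[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return [q * v for v in a]


def numerical_jacobian(X, y, h=1e-6):
    """Central-difference approximation of d b / d y, returned as p rows of n."""
    p, n = len(X[0]), len(y)
    J = [[0.0] * n for _ in range(p)]
    for k in range(n):
        y_plus = list(y)
        y_plus[k] += h
        y_minus = list(y)
        y_minus[k] -= h
        b_plus = pls1_coefs(X, y_plus)
        b_minus = pls1_coefs(X, y_minus)
        for j in range(p):
            J[j][k] = (b_plus[j] - b_minus[j]) / (2 * h)
    return J


X = [[1.0, 0.5], [2.0, 1.5], [3.0, 2.0], [4.0, 4.5], [5.0, 4.0]]
y = [1.2, 2.9, 4.1, 7.8, 8.1]
b = pls1_coefs(X, y)
J = numerical_jacobian(X, y)
print("coefficients:", b)
print("Jacobian shape:", len(J), "x", len(J[0]))
```

Two properties of this particular estimator give a quick sanity check. Each row of J sums to approximately zero, because adding a constant to y leaves the coefficients unchanged. Multiplying J by y approximately reproduces b, because the coefficients scale linearly when y is multiplied by a constant.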