This paper is the first part of a three-paper series on robust partial least squares (RPLS) regression. Motivated by recent research activity in this area, it provides a detailed algorithmic analysis of associated techniques, showing that existing work (i) may not represent a true robust formulation of partial least squares (PLS), (ii) may lead to convergence problems or (iii) may be insensitive to a certain type of outlier. On the basis of this analysis, Part I introduces a new conceptual RPLS algorithm that overcomes the deficiencies of existing work. The second part details this new RPLS technique, compares its performance with existing RPLS methods and provides an analysis of the computational efficiency and sensitivity of these algorithms. Whilst the first two parts discuss algorithmic developments of RPLS, the final part concentrates on practical issues of RPLS implementations. This third part is aimed at practitioners of chemistry and chemical engineering and covers a range of applications involving a calibration experiment, the analysis of recorded data from an industrial debutanizer process and data from a number of Raman spectroscopy experiments.
This paper is the third part of the work on robust partial least squares (RPLS) regression. It focuses on implementation issues for outlier detection and diagnosis. Furthermore, the paper introduces a numerically more efficient algorithm for determining the Stahel-Donoho estimator (SDE). The computational cost of determining the SDE has been identified as a potential drawback of the newly proposed RPLS algorithm, detailed in Part II of this work. Finally, three application studies are presented, which involve data recorded from (i) a calibration experiment (similar number of variables and observations), (ii) a distillation process for purifying benzene (considerably more observations than variables) and (iii) a multi-component concentration determination experiment using Raman spectroscopy (considerably more variables than observations).