SUMMARY: A fast PLS regression algorithm for large data matrices with many variables (K) and fewer objects (N) is presented. For such data matrices the classical algorithm is computer-intensive and memory-demanding. Recently, Lindgren et al. (J. Chemometrics, 7, 45-49 (1993)) developed a quick and efficient kernel algorithm for the case with many objects and few variables. The present paper focuses on the opposite case, i.e. many variables and fewer objects. A kernel algorithm is presented based on eigenvectors of the 'kernel' matrix XX'YY', which is a square, non-symmetric matrix of size N x N, where N is the number of objects. Using the kernel matrix and the association matrices XX' (N x N) and YY' (N x N), it is possible to calculate all score and loading vectors and hence conduct a complete PLS regression, including diagnostics such as R². This is done without returning to the original data matrices X and Y. The algorithm is presented in equation form, with proofs of some new properties, and as MATLAB code.
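The score extraction described in this summary can be illustrated with a minimal sketch. It is written in Python with NumPy rather than MATLAB and is not the paper's code: the function name `kernel_pls_scores`, the use of `np.linalg.eig` to find the dominant eigenvector, and the random test data are all assumptions made for illustration. Each score vector t is taken as the dominant eigenvector of the current XX'YY', after which both association matrices are deflated by projecting out t. The R² of Y is then read from the trace of the deflated YY'.

```python
import numpy as np

def kernel_pls_scores(XXt, YYt, n_components):
    """Hypothetical helper: PLS scores from the N x N association matrices only."""
    N = XXt.shape[0]
    XXt, YYt = XXt.copy(), YYt.copy()
    total_y = np.trace(YYt)                 # total sum of squares of Y
    T = np.zeros((N, n_components))
    U = np.zeros((N, n_components))
    for a in range(n_components):
        # t is the dominant eigenvector of the non-symmetric kernel XX'YY'
        vals, vecs = np.linalg.eig(XXt @ YYt)
        t = np.real(vecs[:, np.argmax(np.real(vals))])
        t = t / np.linalg.norm(t)
        u = YYt @ t                         # Y-score, up to scaling
        # Deflation: project t out of both association matrices
        G = np.eye(N) - np.outer(t, t)
        XXt = G @ XXt @ G
        YYt = G @ YYt @ G
        T[:, a], U[:, a] = t, u
    r2 = 1.0 - np.trace(YYt) / total_y      # explained fraction of Y
    return T, U, r2

# Assumed example data: few objects (N), many variables (K)
rng = np.random.default_rng(0)
N, K, M = 8, 200, 2
X = rng.standard_normal((N, K))
Y = rng.standard_normal((N, M))
XXt, YYt = X @ X.T, Y @ Y.T                 # X and Y are not needed after this line

T, U, r2 = kernel_pls_scores(XXt, YYt, n_components=3)
print(T.shape, U.shape, round(r2, 3))
```

In this sketch every step inside the loop works on N x N matrices, so the cost per component does not depend on K once XX' and YY' have been formed.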
Systemic lupus erythematosus (SLE) is a chronic inflammatory autoimmune disease which can affect most organ systems, including the skin, joints and kidneys. Clinically, SLE is a heterogeneous disease and shares features of several other rheumatic diseases, in particular primary Sjögren's syndrome (pSS) and systemic sclerosis (SSc), which makes it difficult to diagnose. The pathogenesis of SLE is not completely understood, partly due to the heterogeneity of the disease. This study demonstrates that metabolomics can be used as a tool for improved diagnosis of SLE compared to other similar autoimmune diseases. We observed differences in metabolic profiles with a classification specificity above 67% in the comparison of SLE with pSS, SSc and a matched group of healthy individuals. Selected metabolites were also significantly different between the studied diseases. Biochemical pathway analysis was conducted to gain understanding of the underlying pathways involved in SLE pathogenesis. We found an increased oxidative activity in SLE, supported by increased xanthine oxidase activity and an increased turnover in the urea cycle. The most discriminatory metabolite observed was tryptophan, with decreased levels in SLE patients compared to control groups. Changes in tryptophan levels were related to changes in the activity of the aromatic amino acid decarboxylase (AADC) and/or to activation of the kynurenine pathway.
SUMMARY: A modified PLS algorithm is introduced with the goal of achieving improved prediction ability. The method, denoted IVS-PLS, is based on dimension-wise selective reweighting of single elements in the PLS weight vector w. Cross-validation, a criterion for the estimation of predictive quality, is used for guiding the selection procedure in the modelling stage. A threshold that controls the size of the selected values in w is put inside a cross-validation loop. This loop is repeated for each dimension and the results are interpreted graphically. The manipulation of w leads to rotation of the classical PLS solution. The results of IVS-PLS are different from simply selecting X-variables prior to modelling. The theory is explained and the algorithm is demonstrated for a simulated data set with 200 variables and 40 objects, representing a typical spectral calibration situation with four analytes. Improvements of up to 70% in external PRESS over the classical PLS algorithm are shown to be possible.
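The idea of thresholding elements of w can be illustrated with a rough sketch for a single PLS dimension. The summary does not give the exact reweighting rule, so everything specific here is an assumption: the function name `thresholded_weights`, the hard zeroing of elements whose magnitude falls below a fraction of the largest element, and the simulated data. In the method as described, the threshold would be chosen inside a cross-validation loop, which is not shown.

```python
import numpy as np

def thresholded_weights(X, y, threshold):
    """Hypothetical helper: PLS weight vector with small elements set to zero.

    threshold is a fraction (0..1) of the largest absolute element of w.
    Hard zeroing is an assumption; the paper's reweighting may differ.
    """
    w = X.T @ y
    w = w / np.linalg.norm(w)               # classical PLS weight vector
    keep = np.abs(w) >= threshold * np.abs(w).max()
    w_sel = np.where(keep, w, 0.0)          # drop elements below the threshold
    return w_sel / np.linalg.norm(w_sel)    # renormalize to unit length

# Assumed data: 40 objects, 200 variables, only the first 5 are informative
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 200))
y = X[:, :5].sum(axis=1) + 0.1 * rng.standard_normal(40)

w_sel = thresholded_weights(X, y, threshold=0.5)
t = X @ w_sel                               # score vector from the modified weights
print(np.count_nonzero(w_sel), t.shape)
```

Because the score vector t is still computed from the full X with a modified w, this differs from deleting X-variables before modelling: a variable zeroed in one dimension remains available to later dimensions.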