Near-infrared (NIR) and mid-infrared spectroscopy applied to soil compositional analysis started to develop markedly in the 1990s, taking advantage of earlier advances in instrumentation and chemometrics for agricultural products. Today, NIR spectroscopy is envisioned as replacing laboratory analysis in certain applications (e.g., soil-carbon-credit assessment at the farm level). However, its accuracy is still unsatisfactory compared with standard laboratory procedures, leading some authors to think that this challenge will never be met. This article investigates the critical points to be aware of when the accuracy of NIR-based measurements is assessed. The first is the decomposition of the standard error of prediction into bias and variance components, only the latter being reducible by averaging. This decomposition is not used routinely in the soil-science literature. By contrast, a log-normal distribution of reference values, with numerous small or zero values, is very often encountered with soil samples (e.g., for elemental concentrations such as carbon). These highly skewed distributions call for caution when using inverse regression methods (e.g., principal component regression or partial least squares), which force the predictions towards the centre of the calibration set and thereby degrade the standard error of prediction, and therefore prediction accuracy, especially when log-normal distributions are encountered. Such distributions, which are very common for soil components, also make the ratio of performance to deviation (RPD) a useless, even hazardous, tool that leads to erroneous conclusions. We propose a new index based on the quartiles of the empirical distribution, the ratio of performance to inter-quartile distance (RPIQ), to overcome this problem.
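The quantities named in this abstract can be illustrated with a minimal sketch. The function name `prediction_metrics` is hypothetical, and several choices are assumptions rather than the authors' stated definitions: the root mean square error of prediction (RMSEP) is used as the error term in both ratios, the bias-corrected error uses a 1/n normalisation so that the decomposition is exact, and quartiles are computed with the "inclusive" method of Python's `statistics` module.

```python
import math
import statistics


def prediction_metrics(y_ref, y_pred):
    """Return bias, RMSEP, bias-corrected SEP, RPD and RPIQ (illustrative sketch)."""
    n = len(y_ref)
    residuals = [p - r for p, r in zip(y_pred, y_ref)]

    # Systematic part of the error: mean residual.
    bias = sum(residuals) / n
    # Overall error, including bias.
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    # Random part of the error (1/n normalisation, an assumption made here
    # so that rmsep**2 == bias**2 + sep_c**2 holds exactly).
    sep_c = math.sqrt(sum((e - bias) ** 2 for e in residuals) / n)

    # Spread of the reference values: standard deviation versus
    # inter-quartile distance (Q3 - Q1), which is less sensitive to skew.
    sd = statistics.stdev(y_ref)
    q1, _, q3 = statistics.quantiles(y_ref, n=4, method="inclusive")

    return {
        "bias": bias,
        "rmsep": rmsep,
        "sep_c": sep_c,
        "rpd": sd / rmsep,          # ratio of performance to deviation
        "rpiq": (q3 - q1) / rmsep,  # ratio of performance to inter-quartile distance
    }
```

On a strongly right-skewed set of reference values, a few large values inflate the standard deviation, and hence the RPD, without any change in prediction error, whereas the inter-quartile distance reflects the spread of the central half of the samples.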
Nowadays, near-infrared (NIR) technology is being transferred from the laboratory to the industrial world for on-line and portable applications. As a result, new issues are arising, such as the need for increased robustness or the ability to compensate for non-linearities in the calibration or the instrument. Semi-parametric modeling has been suggested as a means of adapting to these complications. In this article, Least-Squares Support Vector Machine (LS-SVM) regression, a semi-parametric modeling technique, is used to predict the acidity of three different grape varieties from NIR spectra. The performance and robustness of LS-SVM regression are compared with those of Partial Least Squares Regression (PLSR) and Multivariate Linear Regression (MLR). LS-SVM regression produces more accurate predictions. However, SNV pretreatment is required to improve the robustness of the model.
Keywords: NIR spectroscopy; robust calibration; LS-SVM; PLSR; MLR; grapes; tartaric and malic acidity.
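For readers unfamiliar with the SNV pretreatment mentioned here, the following is a minimal sketch of the standard normal variate transform as it is commonly defined: each spectrum is centred on its own mean and scaled by its own standard deviation. The function name `snv` and the use of the sample (n - 1) standard deviation are assumptions; the abstract does not specify the implementation used.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (assumed convention)
    if sd == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - mean) / sd for x in spectrum]


# A spectrum and a copy with an additive offset and a positive multiplicative
# gain give the same SNV-corrected result.
raw = [0.10, 0.25, 0.40, 0.30]
shifted = [2.0 * x + 5.0 for x in raw]
corrected_raw = snv(raw)
corrected_shifted = snv(shifted)
```

Because the transform is computed per spectrum, it needs no calibration set and can be applied identically to new measurements.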
To cite this version: J.M. Roger, F. Chauchard, V. Bellon Maurel. EPO-PLS external parameter orthogonalisation of PLS application to temperature-independent measurement of sugar content of intact fruits. Chemometrics and Intelligent Laboratory Systems, Elsevier, 2003, 66 (2), pp. 191-204.
Abstract: NIR spectrometry would present a high potential for online measurement if the robustness of multivariate calibration was improved. The lack of robustness notably appears when an external parameter varies, e.g. the product temperature. This paper presents a preprocessing method which aims at removing from the X space the part mostly influenced by the external parameter variations. This method estimates this parasitic subspace by computing a PCA on a small set of spectra measured on the same objects, while the external parameter is varying. An application to the influence of the fruit temperature on the sugar content measurement of intact apples is presented. Without any preprocessing, the bias in the sugar content prediction was about 8 °Brix for a temperature variation of 20 °C. After EPO preprocessing, the bias is not more than 0.3 °Brix for the same temperature range. The parasitic subspace is studied by analysing the b-coefficients of a PLS between the temperature and the influence spectra. Further work will be carried out to apply this method to the case of multiple external parameters and to the calibration transfer issue.
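The preprocessing idea described in this abstract can be sketched as follows, assuming NumPy is available. This is an illustrative reading, not the authors' implementation: the function name `epo_projector`, the per-object mean-centring used to isolate the variation caused by the external parameter, and the choice of the number of components are all assumptions.

```python
import numpy as np


def epo_projector(influence_spectra, object_ids, n_components):
    """Build a projector orthogonal to the estimated parasitic subspace.

    influence_spectra: (n_spectra, n_wavelengths) spectra of a few objects,
        each measured at several values of the external parameter.
    object_ids: length n_spectra, identifies which object each row belongs to.
    n_components: assumed dimension of the parasitic subspace.
    """
    X = np.asarray(influence_spectra, dtype=float)
    ids = np.asarray(object_ids)

    # Remove each object's mean spectrum so that only the variation caused
    # by the external parameter remains (assumed centring scheme).
    D = np.empty_like(X)
    for obj in np.unique(ids):
        rows = ids == obj
        D[rows] = X[rows] - X[rows].mean(axis=0)

    # PCA via SVD: the rows of vt are the principal directions (loadings).
    _, _, vt = np.linalg.svd(D, full_matrices=False)
    V = vt[:n_components].T

    # Projector onto the orthogonal complement of the parasitic subspace.
    return np.eye(X.shape[1]) - V @ V.T
```

In use, the same projector would be applied to both calibration and new spectra (for example `X_cal @ P` and `X_new @ P`) before fitting or applying the PLS model, so that the directions most affected by the external parameter no longer contribute to the prediction.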
Field measurement using NIR spectroscopy is becoming a popular method for providing in situ, rapid, and inexpensive estimates of soil organic carbon (SOC) content. However, NIR reflectance is quite sensitive to external environmental conditions, such as temperature and soil moisture. In the field, the soil moisture content can be highly variable. It is a challenge to find a chemometric method that predicts SOC from spectra obtained under field conditions and is insensitive to variable moisture content. This paper utilises an external parameter orthogonalisation (EPO) algorithm to remove the effect of soil moisture from NIR spectra for the calibration of SOC content. The algorithm projects all the soil spectra orthogonal to the space of unwanted variation, and thus the variations due to soil moisture can be effectively removed. We designed a protocol with three independent datasets to be used for the calibration of NIR spectra: (1) the calibration dataset, which contains soil samples with spectra and SOC content measured under standard (laboratory) conditions (air-dried); (2) the EPO development dataset, which contains spectra collected under laboratory conditions (air-dried samples) and spectra collected under field conditions (varying soil moisture content); and (3) the validation dataset, which contains spectra collected under field conditions and measured SOC content. We conducted experiments using soils at different moisture contents in laboratory conditions. Using the EPO algorithm, we were able to remove the effect of soil moisture from the spectra, which resulted in improved calibration and prediction of SOC content.