This work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oil and its quality parameters using chemometric methods. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometric modelling. Hence, this work investigates the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling and a single scaling process with Standard Normal Variate (SNV). Combinations of these scaling methods have an impact on exploratory analysis and classification, as seen in Principal Component Analysis (PCA) plots. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectra in absorbance mode over the range 4000 cm⁻¹ to 14000 cm⁻¹. A Savitzky-Golay derivative was applied before developing the classification model. The data were then separated into a training set and a test set using the Duplex method. The number of samples selected from each class was kept equal to 2/3 of the size of the smallest class. The t-statistic was then used as a variable selection method to identify which variables are significant for the classification models. Data pre-processing was evaluated using the modified silhouette width (mSW), PCA and the Percentage Correctly Classified (%CC). The results show that different data pre-processing strategies lead to substantial differences in model performance. The effects of the pre-processing methods, i.e. row scaling, column standardisation and single scaling with Standard Normal Variate, were indicated by mSW and %CC. With a two-PC model, all of the five classifiers except Quadratic Distance Analysis gave high %CC.
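The scaling steps named in this abstract can be illustrated with a minimal sketch. It assumes the usual textbook definitions (SNV as a per-spectrum centring and scaling, column standardisation as per-variable autoscaling, PCA via singular value decomposition); the abstract does not state the exact variants the authors used. The function names and the synthetic `spectra` array are hypothetical and stand in for real FT-NIR data.

```python
import numpy as np

def snv(X):
    """Standard Normal Variate: centre and scale each spectrum (row)
    by its own mean and standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def column_standardise(X):
    """Column standardisation: centre and scale each variable (column)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca_scores(X, n_components=2):
    """PCA scores from the SVD of the column-centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Synthetic placeholder data: 12 "spectra" with 50 variables each.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(12, 50)) + np.linspace(0.0, 1.0, 50)

row_scaled = snv(spectra)
both_scaled = column_standardise(row_scaled)
scores = pca_scores(both_scaled, n_components=2)
print(scores.shape)
```

Applying the scalings in a different order, or only one of them, generally changes the scores, which is the kind of effect the abstract reports through mSW and %CC.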
A computer-aided multivariate water quality index is developed based on partial least squares (PLS) regression. The index is termed the partial least squares water quality index (PLS-WQI). Briefly, a training set was computationally generated based on the guideline of the National Water Quality Standards for Malaysia (NWQS) to predict the water quality. The index is benchmarked against the well-established index developed by the Department of Environment, Malaysia (DOE-WQI). The PLS-WQI is a continuous variable, with a value closer to I indicating good water quality and a value closer to V indicating poor water quality. Unlike conventional indexing methods, the algorithm calculates the index in a multivariate manner. The algorithm allows rapid processing of a large dataset without tedious calculation; it can be an efficient tool for routine spatial and temporal monitoring of water quality. Although the algorithm is designed based on the guideline of NWQS, it can be easily adapted to accommodate other guidelines. The algorithm was evaluated and demonstrated on simulated and real datasets. Results indicate that the algorithm is robust and reliable. Based on six parameters, the overall ratings derived are inversely correlated with DOE-WQI. When the number of parameters is increased, the overall ratings appear to provide better insights into the water quality.
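The general idea of the abstract above, simulating a training set from class limits and regressing a continuous class index on it with PLS, can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the three parameters, their ranges and the scale factors are made-up placeholders (not NWQS values), classes I to V are coded 1 to 5, and the PLS1 routine is a generic NIPALS implementation with hypothetical function names.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Generic PLS1 (NIPALS) on autoscaled X and centred y."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean = y.mean()
    Xk = (X - x_mean) / x_std
    yk = y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        c = yk @ t / tt                 # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)  # regression coefficients
    return x_mean, x_std, y_mean, B

def pls1_predict(model, X):
    x_mean, x_std, y_mean, B = model
    return ((X - x_mean) / x_std) @ B + y_mean

# Simulated training set: hypothetical limits, NOT taken from NWQS.
rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 0.5])      # placeholder parameter scales
X_parts, y_parts = [], []
for k in range(1, 6):                    # classes I..V coded as 1..5
    X_parts.append(rng.uniform(k - 1, k, size=(40, 3)) * scales)
    y_parts.append(np.full(40, float(k)))
X_train = np.vstack(X_parts)
y_train = np.concatenate(y_parts)

model = pls1_fit(X_train, y_train, n_components=2)
index = pls1_predict(model, X_train)     # continuous index, roughly 1..5
print(np.corrcoef(index, y_train)[0, 1])
```

In this toy setup a lower predicted value corresponds to a better class, which mirrors the direction of the index described in the abstract.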