Rationale
Electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI FT‐ICR MS) is an important analytical technique used for the elucidation of crude oil polar compounds at the molecular level, providing thousands of heteroatom compounds in a single analysis. Due to the high resolution, the complexity of data produced, and steps involved in spectra acquisition and processing, it is necessary to estimate its intermediate precision.
Methods
Intermediate precision was estimated for positive‐ and negative‐ion modes (ESI(±)) using Composer® software for two Brazilian crude oil samples. The analytical parameters evaluated were the class distribution histogram, the double bond equivalent (DBE) distribution, and the DBE versus carbon number plot. Intermediate precision was assessed using the average, standard deviation, confidence interval (5% significance level), coefficient of variation (CV), intermediate precision limit (ISO 5725), and principal component analysis (PCA).
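The calculation behind these statistics can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes a balanced design (the same number of replicates on each day), a one-way analysis of variance to separate within-day and between-day variance, and the conventional factor of 2.8 that ISO 5725 uses for precision limits at the 95% level. The function name `intermediate_precision` and the input values are hypothetical.

```python
import math
from statistics import mean

def intermediate_precision(days, k=2.8):
    """Estimate intermediate precision from replicate measurements.

    days: list of lists, one inner list of replicate values per day
          (balanced design assumed: same number of replicates each day).
    k:    multiplier for the precision limit (2.8 is an assumption taken
          from the usual ISO 5725 convention for a 95% limit).
    """
    p = len(days)        # number of days
    n = len(days[0])     # replicates per day
    day_means = [mean(d) for d in days]
    grand_mean = mean(day_means)

    # Within-day (repeatability) mean square
    ms_within = sum(
        (x - m) ** 2 for d, m in zip(days, day_means) for x in d
    ) / (p * (n - 1))

    # Between-day mean square
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (p - 1)

    # Between-day variance component, truncated at zero
    var_between = max((ms_between - ms_within) / n, 0.0)

    # Intermediate precision standard deviation
    s_i = math.sqrt(ms_within + var_between)

    return {
        "mean": grand_mean,
        "s_I": s_i,
        "cv_percent": 100.0 * s_i / grand_mean,
        "limit": k * s_i,
    }

# Hypothetical relative intensities (%) for one class: 2 days x 2 replicates
result = intermediate_precision([[1.0, 3.0], [5.0, 7.0]])
print(result)
```

A confidence interval for the mean would additionally need a Student's t quantile, which is left out here to keep the sketch within the standard library.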
Results
Two crude oil samples (A and B) were analyzed in triplicate on seven consecutive days by ESI(±) FT‐ICR MS. The intermediate precision limit for the assigned classes by ESI(+) was 0.42% for crude oil A (O2S[H] class) and 0.04% for crude oil B (N2O2S[H] class). The intermediate precision limits for DBE intensity in the two crude oils were 0.04% for ESI(+) and 0.013% for ESI(−). In the PCA, ESI(−) gave better precision for crude oil B and ESI(+) for crude oil A.
Conclusions
The most abundant classes and DBE of the majority class (i.e., with the highest intensity) are the parameters produced from the Composer® software that had the highest precision and can be used to estimate crude oil properties. The DBE values presented lower intermediate precision limit values (0.04%) than the assigned class values (0.4%). According to CV and PCA, ESI(+) was more precise for crude oil A (83% precision) and ESI(−) for crude oil B (84% precision).
Severe
acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has
caused the worst global health crisis in living memory. Reverse
transcription quantitative polymerase chain reaction (RT-qPCR) is considered the
gold standard diagnostic method, but it exhibits limitations in the
face of enormous demands. We evaluated a mid-infrared (MIR) data set
of 237 saliva samples obtained from symptomatic patients (138 COVID-19
infections diagnosed via RT-qPCR). MIR spectra were evaluated via
unsupervised random forest (URF) and classification models. Linear
discriminant analysis (LDA) was applied following the genetic algorithm
(GA-LDA), successive projection algorithm (SPA-LDA), partial least
squares (PLS-DA), and a combination of dimension reduction and variable
selection methods by particle swarm optimization (PSO-PLS-DA). Additionally,
a consensus class was used. URF models can identify structures even
in highly complex data. Individual models performed well, but the
consensus class improved the validation performance to 85% accuracy,
93% sensitivity, 83% specificity, and a Matthews correlation
coefficient value of 0.69, with information at different spectral
regions. Therefore, through this unsupervised and supervised framework
methodology, it is possible to better highlight the spectral regions
associated with positive samples, including lipid (∼1700 cm⁻¹),
protein (∼1400 cm⁻¹), and nucleic acid (∼1200–950 cm⁻¹)
regions. This methodology presents an important tool for a fast, noninvasive
diagnostic technique, reducing costs and allowing for risk reduction
strategies.
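The consensus step and the reported validation metrics can be illustrated with a short sketch. The abstract does not say how the consensus class was formed, so a simple majority vote across models is assumed here. The function names and the label vectors are hypothetical and serve only to show the arithmetic behind accuracy, sensitivity, specificity, and the Matthews correlation coefficient.

```python
import math

def consensus(predictions):
    """Majority vote over per-model predictions (1 = positive, 0 = negative).

    predictions: list of label lists, one list per model, all the same length.
    Majority voting is an assumption; the source does not specify the rule.
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and MCC from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 when a rate is undefined (empty denominator)
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }

# Hypothetical reference labels and predictions from three models
y_true = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 1]
model_b = [1, 0, 1, 0, 1, 0]
model_c = [1, 1, 1, 0, 0, 0]

voted = consensus([model_a, model_b, model_c])
print(binary_metrics(y_true, model_a))
print(binary_metrics(y_true, voted))
```

In this toy case each individual error is outvoted by the other two models, which is the mechanism by which a consensus class can outperform its members.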
Here, we combine the angular search algorithm and variance inflation factor (ASA-VIF) with support vector regression (SVR) (ASA-VIF-SVR) to estimate total acid number (TAN), basic nitrogen content (BNC), and sulfur content (SC) in Brazilian crude oils. To prevent interference from outliers, we further developed a strategy for outlier identification based on root mean square error (RMSE) and applied it to nonlinear models. ASA-VIF-SVR was applied to near- and mid-infrared (NIR and MIR) and hydrogen nuclear magnetic resonance (¹H NMR) spectroscopy data available for 93–194 samples. The models were evaluated for accuracy (root mean square error of calibration [RMSEC] and root mean square error of prediction [RMSEP]) and linearity (coefficient of determination, R²). The removal of outliers increased the accuracy and linearity of our models. The ASA-VIF model for TAN, BNC, and SC selected 0.37%, 0.93%, and 0.30% of variables from full NIR spectra; 0.21%, 0.27%, and 0.21% from full MIR; and 0.20%, 0.42%, and 0.15% from full ¹H NMR. In most cases, the best results were obtained with variable selection rather than with the full dataset. Also, ¹H NMR generated more accurate and linear models, with RMSEP and R²p of 0.0071 wt% and 0.86 for BNC and 0.0623 wt% and 0.79 for SC. TAN showed a better MIR result, with RMSEP of 0.1426 mg KOH g⁻¹ and R²p of 0.47. The most important region for ¹H NMR and MIR was the one with the largest quantity of unpaired electrons (aromatic region).
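The two figures of merit used above can be computed as in the following sketch. It assumes the common definition of the coefficient of determination, R² = 1 − SS_res/SS_tot; the abstract does not state which definition was used, and a squared correlation coefficient would give different values. RMSEC and RMSEP share the same formula and differ only in whether it is applied to the calibration or the prediction set. The reference and predicted values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference values and model predictions for a prediction set
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print("RMSEP:", rmse(reference, predicted))
print("R2p:", r_squared(reference, predicted))
```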