Tegillarca granosa samples contaminated artificially by three kinds of toxic heavy metals including zinc (Zn), cadmium (Cd), and lead (Pb) were attempted to be distinguished using laser-induced breakdown spectroscopy (LIBS) technology and pattern recognition methods in this study. The measured spectra were firstly processed by a wavelet transform algorithm (WTA), then the generated characteristic information was subsequently expressed by an information gain algorithm (IGA). As a result, 30 variables obtained were used as input variables for three classifiers: partial least square discriminant analysis (PLS-DA), support vector machine (SVM), and random forest (RF), among which the RF model exhibited the best performance, with 93.3% discrimination accuracy among those classifiers. Besides, the extracted characteristic information was used to reconstruct the original spectra by inverse WTA, and the corresponding attribution of the reconstructed spectra was then discussed. This work indicates that the healthy shellfish samples of Tegillarca granosa could be distinguished from the toxic heavy-metal-contaminated ones by pattern recognition analysis combined with LIBS technology, which only requires minimal pretreatments.
Background: Single-cell RNA-sequencing (scRNA-seq) is fast becoming a powerful tool for profiling genome-scale transcriptomes of individual cells and capturing transcriptome-wide cell-to-cell variability. However, scRNA-seq technologies suffer from high levels of technical noise and variability, hindering reliable quantification of lowly and moderately expressed genes. Since most downstream analyses on scRNA-seq, such as cell type clustering and differential expression analysis, rely on the gene-cell expression matrix, preprocessing of scRNA-seq data is a critical preliminary step in the analysis of scRNA-seq data.Results: We presented scNPF, an integrative scRNA-seq preprocessing framework assisted by network propagation and network fusion, for recovering gene expression loss, correcting gene expression measurements, and learning similarities between cells. scNPF leverages the context-specific topology inherent in the given data and the priori knowledge derived from publicly available molecular gene-gene interaction networks to augment gene-gene relationships in a data driven manner. We have demonstrated the great potential of scNPF in scRNA-seq preprocessing for accurately recovering gene expression values and learning cell similarity networks. Comprehensive evaluation of scNPF across a wide spectrum of scRNA-seq data sets showed that scNPF achieved comparable or higher performance than the competing approaches according to various metrics of internal validation and clustering accuracy. We have made scNPF an easy-to-use R package, which can be used as a versatile preprocessing plug-in for most existing scRNA-seq analysis pipelines or tools. Conclusions: scNPF is a universal tool for preprocessing of scRNA-seq data, which jointly incorporates the global topology of priori interaction networks and the context-specific information encapsulated in the scRNA-seq data to capture both shared and complementary knowledge from diverse data sources. scNPF could be used to recover gene signatures and learn cell-to-cell similarities from emerging scRNA-seq data to facilitate downstream analyses such as dimension reduction, cell type clustering, and visualization.
Motivation Single-cell RNA-sequencing (scRNA-seq) is fast and becoming a powerful technique for studying dynamic gene regulation at unprecedented resolution. However, scRNA-seq data suffer from problems of extremely high dropout rate and cell-to-cell variability, demanding new methods to recover gene expression loss. Despite the availability of various dropout imputation approaches for scRNA-seq, most studies focus on data with a medium or large number of cells, while few studies have explicitly investigated the differential performance across different sample sizes or the applicability of the approach on small or imbalanced data. It is imperative to develop new imputation approaches with higher generalizability for data with various sample sizes. Results We proposed a method called scHinter for imputing dropout events for scRNA-seq with special emphasis on data with limited sample size. scHinter incorporates a voting-based ensemble distance and leverages the synthetic minority oversampling technique for random interpolation. A hierarchical framework is also embedded in scHinter to increase the reliability of the imputation for small samples. We demonstrated the ability of scHinter to recover gene expression measurements across a wide spectrum of scRNA-seq datasets with varied sample sizes. We comprehensively examined the impact of sample size and cluster number on imputation. Comprehensive evaluation of scHinter across diverse scRNA-seq datasets with imbalanced or limited sample size showed that scHinter achieved higher and more robust performance than competing approaches, including MAGIC, scImpute, SAVER and netSmooth. Availability and implementation Freely available for download at https://github.com/BMILAB/scHinter. Supplementary information Supplementary data are available at Bioinformatics online.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.