The currently spreading coronavirus SARS-CoV-2 is highly infectious and pathogenic. In this study, we screened the gene expression of three host receptors of SARS coronaviruses (ACE2, DC-SIGN and L-SIGN) and the status of dendritic cells (DCs) in bulk and single-cell transcriptomic datasets of the upper airway, lung or blood of COVID-19 patients and healthy controls. In COVID-19 patients, DC-SIGN gene expression was decreased in lung DCs but increased in blood DCs. Within DCs, conventional DCs (cDCs) were depleted while plasmacytoid DCs (pDCs) were augmented in the lungs of mild COVID-19 cases. In severe cases, we identified augmented types of immature DCs (CD22+ or ANXA1+ DCs) with MHCII downregulation. These observations indicate that DCs in severe cases stimulate innate immune responses but fail to specifically present SARS-CoV-2. The study provides insights into the profound modulation of DC function in severe COVID-19.
Gene expression in mammalian cells is inherently stochastic and mRNAs are synthesized in discrete bursts. Single-cell transcriptomics provides an unprecedented opportunity to explore the transcriptome-wide kinetics of transcriptional bursting. However, current analysis methods provide limited accuracy in bursting inference due to substantial noise inherent to single-cell transcriptomic data. In this study, we developed BISC, a Bayesian method for inferring bursting parameters from single cell transcriptomic data. Based on a beta-gamma-Poisson model, BISC modeled the mean–variance dependency to achieve accurate estimation of bursting parameters from noisy data. Evaluation based on both simulation and real intron sequential RNA fluorescence in situ hybridization data showed improved accuracy and reliability of BISC over existing methods, especially for genes with low expression values. Further application of BISC found bursting frequency but not bursting size was strongly associated with gene expression regulation. Moreover, our analysis provided new mechanistic insights into the functional role of enhancer and superenhancer by modulating both bursting frequency and size. BISC also formulated a downstream framework to identify differential bursting (in frequency and size separately) genes in samples under different conditions. Applying to multiple datasets (a mouse embryonic cell and fibroblast dataset, a human immune cell dataset and a human pancreatic cell dataset), BISC identified known cell-type signature genes that were missed by differential expression analysis, providing additional insights in understanding the cell-specific stochastic gene transcription. Applying to datasets of human lung and colon cancers, BISC successfully detected tumor signature genes based on alterations in bursting kinetics, which illustrates its value in understanding disease development regarding transcriptional bursting. 
Collectively, BISC provides a new tool for accurately inferring bursting kinetics and detecting differential bursting genes. This study also produced new insights in the role of transcriptional bursting in regulating gene expression, cell identity and tumor progression.
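As a rough illustration of the bursting parameters discussed above, the following minimal sketch uses the standard two-state beta-Poisson model of transcription, not BISC's beta-gamma-Poisson model, whose details are not given here. The parameter names `k_on`, `k_off` and `k_syn`, the function names, and the convention that burst frequency is `k_on` and burst size is `k_syn / k_off` are assumptions made for this example.

```python
import math
import random


def beta_poisson_moments(k_on, k_off, k_syn):
    """Analytic mean and variance of counts x ~ Poisson(k_syn * p), p ~ Beta(k_on, k_off)."""
    total = k_on + k_off
    p_mean = k_on / total
    p_var = (k_on * k_off) / (total ** 2 * (total + 1))
    mean = k_syn * p_mean
    # Law of total variance: Poisson noise plus variance from promoter switching.
    var = mean + (k_syn ** 2) * p_var
    return mean, var


def burst_summary(k_on, k_off, k_syn):
    """Burst frequency and size under the assumed convention (rates in units of mRNA decay)."""
    return {"frequency": k_on, "size": k_syn / k_off}


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(k_on, k_off, k_syn, n_cells, seed=0):
    """Draw one mRNA count per cell from the beta-Poisson model."""
    rng = random.Random(seed)
    return [
        sample_poisson(k_syn * rng.betavariate(k_on, k_off), rng)
        for _ in range(n_cells)
    ]


if __name__ == "__main__":
    mean, var = beta_poisson_moments(k_on=1.0, k_off=1.0, k_syn=10.0)
    counts = simulate_counts(1.0, 1.0, 10.0, n_cells=20000)
    print("analytic mean/variance:", mean, var)
    print("simulated mean:", sum(counts) / len(counts))
    print(burst_summary(1.0, 1.0, 10.0))
```

The variance exceeds the mean whenever promoter switching contributes, which is the mean-variance dependency that a bursting model has to account for; two genes with the same mean can differ in frequency and size, which is why the two parameters are reported separately.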
Motivation: Recent advancements in single-cell RNA sequencing (scRNA-seq) have enabled time-efficient transcriptome profiling in individual cells. To optimize sequencing protocols and develop reliable analysis methods for various application scenarios, solid simulation methods for scRNA-seq data are required. However, due to the noisy nature of scRNA-seq data, currently available simulation methods cannot sufficiently capture and simulate important properties of real data, especially the biological variation. In this study, we developed SCRIP, a novel simulator for scRNA-seq that is accurate and enables simulation of bursting kinetics. Results: Compared to existing simulators, SCRIP showed a significantly higher accuracy in simulating key data features, including mean-variance dependency, in all experiments. SCRIP also outperformed other methods in recovering cell-cell distances. The application of SCRIP in evaluating differential expression analysis methods showed that edgeR outperformed the other examined methods, and ZINB-WaVE improved the AUC at high dropout rates. Collectively, this study provides the research community with a rigorous tool for scRNA-seq data simulation. Availability and implementation: https://CRAN.R-project.org/package=SCRIP. Supplementary information: Supplementary files are available at Bioinformatics online.
Summary: The currently spreading novel coronavirus SARS-CoV-2 is highly infectious and pathogenic. In this study, we screened the gene expression of three SARS-CoV-2 host receptors (ACE2, DC-SIGN and L-SIGN) and DC status in bulk and single-cell transcriptomic datasets of the upper airway, lung or blood of smokers, non-smokers and COVID-19 patients. We found that smoking increased DC-SIGN gene expression and inhibited DC maturation and the ability of DCs to stimulate T cells. In COVID-19, DC-SIGN gene expression was decreased in lung DCs but increased in blood DCs. Strikingly, DCs shifted from cDCs to pDCs in COVID-19, but the shift was trapped in an immature stage (CD22+ or ANXA1+ DC) with MHCII downregulation in severe cases. This observation indicates that DCs in severe cases stimulate innate immune responses but fail to specifically recognize SARS-CoV-2. Our study provides insights into the effect of smoking on COVID-19 risk and the profound modulation of DC function in severe COVID-19.
Highlights: Smoking upregulates the expression of ACE2 and CD209 and inhibits DC maturation in lungs. SARS-CoV-2 modulates the DC proportion and CD209 expression differently in lung and blood. Severe infection is characterized by DCs less capable of maturation, antigen presentation and MHCII expression. DCs shift from cDCs to pDCs with SARS-CoV-2 infection but are trapped in an immature stage in severe cases.
Copy number variation has been identified as a major source of genomic variation associated with disease susceptibility. With the advent of whole-exome sequencing (WES) technology, massive WES data have been generated, allowing for the identification of copy number variants (CNVs) in the protein-coding regions with direct functional interpretation. We have previously shown evidence of the genomic correlation structure in array data and developed a novel chromosomal breakpoint detection algorithm, LDcnv, which showed significantly improved detection power through integrating the correlation structure in a systematic modeling manner. However, it remains unexplored whether the genomic correlation exists in WES data and how such correlation structure integration can improve the CNV detection accuracy. In this study, we first explored the correlation structure of the WES data using the 1000 Genomes Project data. Both real raw read depth and median-normalized data showed strong evidence of the correlation structure. Motivated by this fact, we proposed a correlation-based method, CORRseq, as a novel release of the LDcnv algorithm in profiling WES data. The performance of CORRseq was evaluated in extensive simulation studies and real data analysis from the 1000 Genomes Project. CORRseq outperformed the existing methods in detecting medium and large CNVs. In conclusion, it would be more advantageous to model genomic correlation structure in detecting relatively long CNVs. This study provides great insights for methodology development of CNV detection with NGS data.
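One simple way to look for the kind of correlation structure described above is the lag-1 autocorrelation of a depth signal along consecutive exons. The sketch below is only an assumed illustration of that idea; it is not the diagnostic used for CORRseq, and the function name and toy values are hypothetical.

```python
def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation of a sequence at the given lag.

    Values near 0 suggest neighbouring positions behave independently;
    values well above 0 suggest neighbouring positions are correlated.
    """
    n = len(values)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("sequence is constant; autocorrelation is undefined")
    numerator = sum(deviations[i] * deviations[i + lag] for i in range(n - lag))
    return numerator / denominator


if __name__ == "__main__":
    # Toy normalized read depths for eight consecutive exons (made-up values).
    smooth_signal = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
    alternating_signal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
    print(lag_autocorrelation(smooth_signal))
    print(lag_autocorrelation(alternating_signal))
```

A segmentation method that assumes independent observations implicitly treats this quantity as zero, so a clearly positive value on real data is the motivation for modeling the correlation explicitly.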
Motivation: Copy number variation plays important roles in human complex diseases. Detecting copy number variants (CNVs) amounts to identifying mean shifts in genetic intensities to locate chromosomal breakpoints, a step referred to as chromosomal segmentation. Many segmentation algorithms have been developed under a strong assumption of independent observations across genetic loci, and they assume that each locus has an equal chance of being a breakpoint (i.e., a boundary of a CNV). However, this assumption is violated from a genetic perspective because of correlation among genomic positions, such as linkage disequilibrium (LD). Our study showed that the LD structure is related to the location distribution of CNVs, which indeed presents a non-random pattern on the genome. To generate more accurate CNVs, we proposed a novel algorithm, LDcnv, that models the CNV data with its biological characteristics relating to genetic dependence structure (i.e., LD). Results: We theoretically demonstrated the correlation structure of CNV data in SNP arrays, which further supports the necessity of integrating biological structure into statistical methods for CNV detection. Therefore, we developed LDcnv, which integrates the genomic correlation structure with a local search strategy into statistical modelling of the CNV intensities. To evaluate the performance of LDcnv, we conducted extensive simulations and analyzed large-scale HapMap datasets. We showed that LDcnv presented high accuracy, stability and robustness in CNV detection and higher precision in detecting short CNVs compared to existing methods. This new segmentation algorithm has a wide scope of potential application with data from various high-throughput technology platforms. Availability: https://github.com/FeifeiXiaoUSC/LDcnv. Supplementary information: Supplementary data are available at Bioinformatics online.
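To make the idea of segmentation as mean-shift detection concrete, here is a minimal sketch of a single-breakpoint scan. It deliberately makes the independence assumption that the abstract criticizes and is not the LDcnv algorithm; the function name, the `min_size` parameter and the scoring statistic (a scaled difference of segment means) are choices made for this illustration.

```python
import math


def best_breakpoint(intensities, min_size=2):
    """Scan every split point and return (index, score) of the strongest mean shift.

    The score is |mean(left) - mean(right)| * sqrt(k * (n - k) / n), where k is
    the size of the left segment. It treats observations as independent with
    equal variance, which is the simplifying assumption under discussion.
    """
    n = len(intensities)
    total = sum(intensities)
    best_index, best_score = None, 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += intensities[k - 1]
        if k < min_size or n - k < min_size:
            continue  # skip splits that leave a segment that is too short
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        score = abs(left_mean - right_mean) * math.sqrt(k * (n - k) / n)
        if score > best_score:
            best_index, best_score = k, score
    return best_index, best_score


if __name__ == "__main__":
    # Toy intensities: ten loci at baseline followed by ten loci with a gain.
    signal = [0.0] * 10 + [1.0] * 10
    index, score = best_breakpoint(signal)
    print("breakpoint before locus", index, "with score", round(score, 3))
```

Every locus is an equally likely breakpoint in this scan. A correlation-aware method would instead adjust the statistic or the search for dependence between neighbouring loci.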