Identifying genes that display spatial expression patterns in spatially resolved transcriptomic studies is an important first step towards characterizing the spatial transcriptomic landscape of complex tissues. Here, we developed a statistical method, SPARK, for identifying such spatially expressed genes in data generated from various spatially resolved transcriptomic techniques. SPARK directly models spatial count data through generalized linear spatial models. It relies on newly developed statistical formulas for hypothesis testing, providing effective type I error control and yielding high statistical power. With a computationally efficient algorithm based on penalized quasi-likelihood, SPARK is also scalable to data sets with tens of thousands of genes measured on tens of thousands of samples. In four published spatially resolved transcriptomic data sets, we show that SPARK can be up to ten times more powerful than existing methods, revealing new biology in the data that otherwise cannot be revealed by existing approaches.
Spatial transcriptomic studies are becoming increasingly common and large, posing important statistical and computational challenges for many analytic tasks. Here, we present SPARK-X, a non-parametric method for rapid and effective detection of spatially expressed genes in large spatial transcriptomic studies. SPARK-X not only produces effective type I error control and high power but also brings orders of magnitude computational savings. We apply SPARK-X to analyze three large datasets, one of which is only analyzable by SPARK-X. In these data, SPARK-X identifies many spatially expressed genes including those that are spatially expressed within the same cell type, revealing new biological insights.
Background: Dimensionality reduction is an indispensable analytic component for many areas of single-cell RNA sequencing (scRNA-seq) data analysis. Proper dimensionality reduction can allow for effective noise removal and facilitate many downstream analyses that include cell clustering and lineage reconstruction. Unfortunately, despite the critical importance of dimensionality reduction in scRNA-seq analysis and the vast number of dimensionality reduction methods developed for scRNA-seq studies, few comprehensive comparison studies have been performed to evaluate the effectiveness of different dimensionality reduction methods in scRNA-seq. Results: We aim to fill this critical knowledge gap by providing a comparative evaluation of a variety of commonly used dimensionality reduction methods for scRNA-seq studies. Specifically, we compare 18 different dimensionality reduction methods on 30 publicly available scRNA-seq datasets that cover a range of sequencing techniques and sample sizes. We evaluate the performance of different dimensionality reduction methods for neighborhood preserving in terms of their ability to recover features of the original expression matrix, and for cell clustering and lineage reconstruction in terms of their accuracy and robustness. We also evaluate the computational scalability of different dimensionality reduction methods by recording their computational cost. Conclusions: Based on the comprehensive evaluation results, we provide important guidelines for choosing dimensionality reduction methods for scRNA-seq data analysis. We also provide all analysis scripts used in the present study at www.xzlab.org/reproduce.html.
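As a rough illustration of what a neighborhood-preservation score can look like, the sketch below compares each cell's k nearest neighbours in the original space with its k nearest neighbours in a reduced space and averages the Jaccard overlap. The abstract does not state which metric the study uses, so the Jaccard index, the function names `knn_sets` and `mean_jaccard`, the simulated data, and the coordinate truncation standing in for a real dimensionality reduction method are all assumptions made for this example.

```python
import math
import random


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        # Euclidean distance to every other point, sorted ascending.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def mean_jaccard(original, reduced, k):
    """Average Jaccard overlap of kNN sets before and after reduction."""
    before = knn_sets(original, k)
    after = knn_sets(reduced, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(before, after)]
    return sum(scores) / len(scores)


# Hypothetical data: 50 "cells" with 10 simulated features each.
random.seed(0)
cells = [[random.gauss(0, 1) for _ in range(10)] for _ in range(50)]

# Two toy "reductions": an exact copy, and keeping only the first 2 features.
# Neither is one of the methods compared in the study.
identity = [row[:] for row in cells]
truncated = [row[:2] for row in cells]

score_same = mean_jaccard(cells, identity, k=5)
score_trunc = mean_jaccard(cells, truncated, k=5)
print(score_same, score_trunc)
```

A score of 1 means every cell keeps exactly the same neighbours after reduction; lower values indicate that local structure was distorted.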
Summary: Multi‐functional microRNAs (miRNAs) are emerging as key modulators of plant–pathogen interactions. Although the involvement of some miRNAs in plant–insect interactions has been revealed, the underlying mechanisms are still elusive. The brown planthopper (BPH) is the most notorious rice (Oryza sativa)‐specific insect that causes severe yield losses each year and requires urgent biological control. To reveal the miRNAs involved in rice–BPH interactions, we performed miRNA sequencing and identified BPH‐responsive OsmiR396. Sequestering OsmiR396 by overexpressing target mimicry (MIM396) in three genetic backgrounds indicated that OsmiR396 negatively regulated BPH resistance. Overexpression of one BPH‐responsive target gene of OsmiR396, growth regulating factor 8 (OsGRF8), showed resistance to BPH. Furthermore, the flavonoid contents increased in both the OsmiR396‐sequestered and the OsGRF8 overexpressing plants. By analysing 39 natural rice varieties, the elevated flavonoid contents were found to correlate with enhanced BPH resistance. Artificial applications of flavonoids to wild type (WT) plants also increased resistance to BPH. A BPH‐responsive flavanone 3‐hydroxylase (OsF3H) gene in the flavonoid biosynthetic pathway was proved to be directly regulated by OsGRF8. A genetic functional analysis of OsF3H revealed its positive role in mediating both the flavonoid contents and BPH resistance. And analysis of the genetic correlation between OsmiR396 and OsF3H showed that down‐regulation of OsF3H complemented the BPH resistance characteristic and simultaneously decreased the flavonoid contents of the MIM396 plants. Thus, we revealed a new BPH resistance mechanism mediated by the OsmiR396–OsGRF8–OsF3H–flavonoid pathway. Our study suggests potential applications of miRNAs in BPH resistance breeding.
Integrating results from genome-wide association studies (GWASs) and gene expression studies through transcriptome-wide association study (TWAS) has the potential to shed light on the causal molecular mechanisms underlying disease etiology. Here, we present a probabilistic Mendelian randomization (MR) method, PMR-Egger, for TWAS applications. PMR-Egger relies on an MR likelihood framework that unifies many existing TWAS and MR methods, accommodates multiple correlated instruments, tests the causal effect of gene on trait in the presence of horizontal pleiotropy, and is scalable to hundreds of thousands of individuals. In simulations, PMR-Egger provides calibrated type I error control for causal effect testing in the presence of horizontal pleiotropic effects, is reasonably robust under various types of model misspecifications, is more powerful than existing TWAS/MR approaches, and can directly test for horizontal pleiotropy. We illustrate the benefits of PMR-Egger in applications to 39 diseases and complex traits obtained from three GWASs including the UK Biobank.
Identifying differentially expressed (DE) genes from RNA sequencing (RNAseq) studies is among the most common analyses in genomics. However, RNAseq DE analysis presents several statistical and computational challenges, including over-dispersed read counts and, in some settings, sample non-independence. Previous count-based methods rely on simple hierarchical Poisson models (e.g. negative binomial) to model independent over-dispersion, but do not account for sample non-independence due to relatedness, population structure and/or hidden confounders. Here, we present a Poisson mixed model with two random effects terms that account for both independent over-dispersion and sample non-independence. We also develop a scalable sampling-based inference algorithm using a latent variable representation of the Poisson distribution. With simulations, we show that our method properly controls for type I error and is generally more powerful than other widely used approaches, except in small samples (n < 15) with other unfavorable properties (e.g. small effect sizes). We also apply our method to three real datasets that contain related individuals, population stratification or hidden confounders. Our results show that our method increases power in all three datasets compared to other approaches, though the power gain is smallest in the smallest sample (n = 6). Our method is implemented in MACAU, freely available at www.xzlab.org/software.html.
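For orientation, a Poisson mixed model with two random effects can be written generically as below. This is a sketch of the general model class only; the abstract does not give the parameterization, so every symbol here is an assumption: $y_i$ is the read count for sample $i$, $N_i$ its total read depth, $x_i$ the predictor of interest with effect $\beta$, $\mathbf{w}_i$ covariates with effects $\boldsymbol{\alpha}$, $\mathbf{K}$ a known sample-relatedness matrix, and $\sigma_g^2$, $\sigma_e^2$ variance components.

```latex
\begin{aligned}
y_i \mid \lambda_i &\sim \mathrm{Poisson}(N_i \lambda_i), \\
\log \lambda_i &= \mathbf{w}_i^{\top} \boldsymbol{\alpha} + x_i \beta + g_i + e_i, \\
\mathbf{g} = (g_1, \dots, g_n)^{\top} &\sim \mathcal{N}(\mathbf{0},\, \sigma_g^2 \mathbf{K}), \\
\mathbf{e} = (e_1, \dots, e_n)^{\top} &\sim \mathcal{N}(\mathbf{0},\, \sigma_e^2 \mathbf{I}).
\end{aligned}
```

In this form, $\mathbf{g}$ captures sample non-independence through the off-diagonal entries of $\mathbf{K}$, while $\mathbf{e}$ captures independent over-dispersion; the DE test corresponds to $H_0\colon \beta = 0$.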
Pseudomonas aeruginosa strain PA1201 is a newly identified rhizobacterium that produces high levels of the secondary metabolite phenazine-1-carboxylic acid (PCA), the newly registered biopesticide Shenqinmycin. PCA production in liquid batch cultures utilizing a specialized PCA-promoting medium (PPM) typically occurs after the period of most rapid growth, and production is regulated in a quorum sensing (QS)-dependent manner. PA1201 contains two PCA biosynthetic gene clusters phz1 and phz2; both clusters contribute to PCA production, with phz2 making a greater contribution. PA1201 also contains a complete set of genes for four QS systems (LasI/LasR, RhlI/RhlR, PQS/MvfR, and IQS). By using several methods including gene deletion, the construction of promoter-lacZ fusion reporter strains, and RNA-Seq analysis, this study investigated the effects of the four QS systems on bacterial growth, QS signal production, the expression of phz1 and phz2, and PCA production. The possible mechanisms for the strain- and condition-dependent expression of phz1 and phz2 were discussed, and a schematic model was proposed. These findings provide a basis for further genetic engineering of the QS systems to improve PCA production.