Background Recently, pioneering expression quantitative trait loci (eQTL) studies on single cell RNA sequencing (scRNA-seq) data have revealed new and cell-specific regulatory single nucleotide variants (SNVs). Here, we present an alternative QTL-related approach applicable to transcribed SNV loci from scRNA-seq data: scReQTL. ScReQTL uses Variant Allele Fraction (VAFRNA) at expressed biallelic loci, and corelates it to gene expression from the corresponding cell. Results Our approach employs the advantage that, when estimated from multiple cells, VAFRNA can be used to assess effects of SNVs in a single sample or individual. In this setting scReQTL operates in the context of identical genotypes, where it is likely to capture RNA-mediated genetic interactions with cell-specific and transient effects. Applying scReQTL on scRNA-seq data generated on the 10 × Genomics Chromium platform using 26,640 mesenchymal cells derived from adipose tissue obtained from three healthy female donors, we identified 1272 unique scReQTLs. ScReQTLs common between individuals or cell types were consistent in terms of the directionality of the relationship and the effect size. Comparative assessment with eQTLs from bulk sequencing data showed that scReQTL analysis identifies a distinct set of SNV-gene correlations, that are substantially enriched in known gene-gene interactions and significant genome-wide association studies (GWAS) loci. Conclusion ScReQTL is relevant to the rapidly growing source of scRNA-seq data and can be applied to outline SNVs potentially contributing to cell type-specific and/or dynamic genetic interactions from an individual scRNA-seq dataset. Availability:https://github.com/HorvathLab/NGS/tree/master/scReQTL
With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10x Genomics platform. We include in the analysis 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), with an average sequencing reads over 120K/cell (more than 4 billion scRNA-seq reads total). High quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimate the expressed Variant Allele Fraction (VAFRNA) from SNV-aware alignments and analyze its variance and distribution (mono-and bi-allelic) at different cutoffs for required minimal number of sequencing reads. Our analysis shows that when assessing SNV loci covered by a minimum of 3 unique sequencing reads, over 50% of the heterozygous SNVs show biallelic expression, while at minimum of 10 reads, nearly 90% of the SNVs are bi-allelic. Consistent with single cell studies on RNA velocity and models of transcriptional burst kinetics, we observe a substantially higher rate of monoallelic expression among intronic SNVs, signifying the usefulness of scVAFRNA to assess dynamic cellular processes. Our analysis demonstrates the feasibility of scVAFRNA estimation from current scRNA-seq datasets and shows that the 3'-based library generation protocol of 10x Genomics scRNA-seq data can be highly informative in SNV-based analyses.
Motivation By testing for associations between DNA genotypes and gene expression levels, expression quantitative trait locus (eQTL) analyses have been instrumental in understanding how thousands of single nucleotide variants (SNVs) may affect gene expression. As compared to DNA genotypes, RNA genetic variation represents a phenotypic trait that reflects the actual allele content of the studied system. RNA genetic variation at expressed SNV loci can be estimated using the proportion of alleles bearing the variant nucleotide (variant allele fraction, VAFRNA). VAFRNA is a continuous measure which allows for precise allele quantitation in loci where the RNA alleles do not scale with the genotype count. We describe a method to correlate VAFRNA with gene expression and assess its ability to identify genetically regulated expression solely from RNA-sequencing (RNA-seq) datasets. Results We introduce ReQTL, an eQTL modification which substitutes the DNA allele count for the variant allele fraction at expressed SNV loci in the transcriptome (VAFRNA). We exemplify the method on sets of RNA-seq data from human tissues obtained though the Genotype-Tissue Expression (GTEx) project and demonstrate that ReQTL analyses are computationally feasible and can identify a subset of expressed eQTL loci. Availability and implementation A toolkit to perform ReQTL analyses is available at https://github.com/HorvathLab/ReQTL. Supplementary information Supplementary data are available at Bioinformatics online.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.