Motivation: By testing for associations between DNA genotypes and gene expression levels, expression quantitative trait locus (eQTL) analyses have been instrumental in understanding how thousands of single nucleotide variants (SNVs) may affect gene expression. Compared to DNA genotypes, RNA genetic variation represents a phenotypic trait that reflects the actual allele content of the studied system. RNA genetic variation at expressed SNV loci can be estimated using the proportion of alleles bearing the variant nucleotide (variant allele fraction, VAFRNA). VAFRNA is a continuous measure which allows for precise allele quantitation in loci where the RNA alleles do not scale with the genotype count. We describe a method to correlate VAFRNA with gene expression and assess its ability to identify genetically regulated expression solely from RNA-sequencing (RNA-seq) datasets. Results: We introduce ReQTL, an eQTL modification which replaces the DNA allele count with the variant allele fraction at expressed SNV loci in the transcriptome (VAFRNA). We exemplify the method on sets of RNA-seq data from human tissues obtained through the Genotype-Tissue Expression (GTEx) project and demonstrate that ReQTL analyses are computationally feasible and can identify a subset of expressed eQTL loci. Availability and implementation: A toolkit to perform ReQTL analyses is available at https://github.com/HorvathLab/ReQTL. Supplementary information: Supplementary data are available at Bioinformatics online.
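The core idea above, computing VAFRNA from read counts and correlating it with gene expression, can be illustrated with a minimal sketch. It is not the ReQTL toolkit's implementation: the function names, the `min_reads` threshold, the toy counts, and the use of a plain Pearson correlation as the association measure are all assumptions made for illustration.

```python
from math import sqrt

def vaf_rna(n_var, n_ref, min_reads=10):
    """Variant allele fraction: variant reads / (variant + reference reads).

    Returns None when total coverage is below min_reads
    (min_reads is a hypothetical threshold, not taken from the abstract).
    """
    total = n_var + n_ref
    if total < min_reads:
        return None
    return n_var / total

def pearson_r(xs, ys):
    """Pearson correlation coefficient; None if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

# Toy data (made up): (variant reads, reference reads) at one SNV locus
# in five samples, and the expression of one gene in the same samples.
counts = [(0, 40), (10, 30), (20, 20), (30, 10), (40, 0)]
expression = [1.0, 2.0, 3.0, 4.0, 5.0]

vafs = [vaf_rna(v, r) for v, r in counts]
# Keep only samples where the locus has enough coverage.
pairs = [(v, e) for v, e in zip(vafs, expression) if v is not None]
r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
print(vafs, r)
```

Because VAFRNA is continuous, it can take any value between 0 and 1 in each sample, unlike a DNA genotype coded as 0, 1, or 2 variant alleles.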
With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10x Genomics platform. We include in the analysis 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), with an average of over 120K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimate the expressed Variant Allele Fraction (VAFRNA) from SNV-aware alignments and analyze its variance and distribution (mono- and bi-allelic) at different cutoffs for the minimum required number of sequencing reads. Our analysis shows that when assessing SNV loci covered by a minimum of 3 unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a minimum of 10 reads, nearly 90% of the SNVs are bi-allelic. Consistent with single-cell studies on RNA velocity and models of transcriptional burst kinetics, we observe a substantially higher rate of monoallelic expression among intronic SNVs, signifying the usefulness of scVAFRNA to assess dynamic cellular processes. Our analysis demonstrates the feasibility of scVAFRNA estimation from current scRNA-seq datasets and shows that the 3'-based library generation protocol of 10x Genomics scRNA-seq data can be highly informative in SNV-based analyses.
With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10× Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAFRNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAFRNA estimation from current scRNA-seq datasets and shows that the 3′-based library generation protocol of 10× Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
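The dependence of the bi-allelic fraction on the minimum read threshold can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the counts are made up and assumed to be unique (deduplicated) reads per cell and locus, and a locus is called monoallelic only when all covering reads carry the same allele (VAFRNA of exactly 0 or 1).

```python
def classify_locus(n_var, n_ref, min_reads):
    """Label one heterozygous SNV locus in one cell.

    Assumed rule: 'monoallelic' if VAF is exactly 0 or 1,
    'biallelic' otherwise, 'insufficient' below the read threshold.
    """
    total = n_var + n_ref
    if total < min_reads:
        return "insufficient"
    vaf = n_var / total
    if vaf == 0.0 or vaf == 1.0:
        return "monoallelic"
    return "biallelic"

def biallelic_fraction(loci, min_reads):
    """Share of bi-allelic calls among loci that pass the threshold."""
    calls = [classify_locus(v, r, min_reads) for v, r in loci]
    assessed = [c for c in calls if c != "insufficient"]
    if not assessed:
        return None
    return assessed.count("biallelic") / len(assessed)

# Toy (variant, reference) read counts; invented for illustration only.
loci = [(3, 0), (1, 2), (0, 4), (6, 5), (12, 0), (7, 8)]

for threshold in (3, 10):
    print(threshold, biallelic_fraction(loci, threshold))
```

With few reads, a truly bi-allelic locus can appear monoallelic simply because only one allele happened to be sampled, which is one reason a stricter threshold tends to raise the bi-allelic fraction.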
RsQTL is a tool for identification of splicing quantitative trait loci (sQTLs) from RNA-sequencing (RNA-seq) data by correlating the variant allele fraction at expressed SNV loci in the transcriptome (VAFRNA) with the proportion of molecules spanning local exon-exon junctions at loci with differential intron excision (percent spliced in, PSI). We exemplify the method on sets of RNA-seq data from human tissues obtained through the Genotype-Tissue Expression Project (GTEx). RsQTL does not require matched DNA and can identify a subset of expressed sQTL loci. Due to the dynamic nature of VAFRNA, RsQTL is applicable for the assessment of conditional and dynamic variation-splicing relationships.
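As a rough sketch of the quantities involved, PSI can be treated as the share of junction-spanning reads in a local cluster that support a given junction, and then regressed on VAFRNA across samples. The abstract does not specify how RsQTL computes PSI or tests the association, so the cluster-based definition, the ordinary least-squares slope, the function names, and the toy data below are all assumptions.

```python
def psi_per_junction(junction_counts):
    """PSI per junction within one cluster of overlapping junctions.

    junction_counts maps a junction id to the number of reads spanning it.
    Assumed definition: junction reads / all junction reads in the cluster.
    """
    total = sum(junction_counts.values())
    if total == 0:
        return {j: None for j in junction_counts}
    return {j: c / total for j, c in junction_counts.items()}

def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b*x; None if x is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy data (made up): VAF_RNA at one SNV and junction read counts
# for one two-junction cluster in three samples.
vafs = [0.0, 0.5, 1.0]
clusters = [
    {"j1": 10, "j2": 30},
    {"j1": 20, "j2": 20},
    {"j1": 30, "j2": 10},
]

psi_j1 = [psi_per_junction(c)["j1"] for c in clusters]
slope = ols_slope(vafs, psi_j1)
print(psi_j1, slope)
```

In this toy example the usage of junction `j1` rises with the variant allele fraction, which is the kind of relationship an sQTL analysis looks for.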