Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
RNA sequencing is an increasingly popular technology for genome-wide analysis of transcript sequence and abundance. However, understanding of the sources of technical and interlaboratory variation is still limited. To address this, the GEUVADIS consortium sequenced mRNAs and small RNAs of lymphoblastoid cell lines of 465 individuals in seven sequencing centers, with a large number of replicates. The variation between laboratories appeared to be considerably smaller than the already limited biological variation. Laboratory effects were mainly seen in differences in insert size and GC content and could be adequately corrected for. In small-RNA sequencing, the microRNA (miRNA) content differed widely between samples owing to competitive sequencing of rRNA fragments. This did not affect relative quantification of miRNAs. We conclude that distributing RNA sequencing among different laboratories is feasible, given proper standardization and randomization procedures. We provide a set of quality measures and guidelines for assessing technical biases in RNA-seq data.
Spondyloarthritis encompasses a group of common inflammatory diseases thought to be driven by IL-17A-secreting type-17 lymphocytes. Here we show increased numbers of GM-CSF-producing CD4 and CD8 lymphocytes in the blood and joints of patients with spondyloarthritis, and increased numbers of IL-17A+GM-CSF+ double-producing CD4, CD8, γδ and NK cells. GM-CSF production in CD4 T cells occurs both independently and in combination with classical Th1 and Th17 cytokines. Type 3 innate lymphoid cells producing predominantly GM-CSF are expanded in synovial tissues from patients with spondyloarthritis. GM-CSF+CD4+ cells, isolated using a triple cytokine capture approach, have a specific transcriptional signature. Both GM-CSF+ and IL-17A+GM-CSF+ double-producing CD4 T cells express increased levels of GPR65, a proton-sensing receptor associated with spondyloarthritis in genome-wide association studies and pathogenicity in murine inflammatory disease models. Silencing GPR65 in primary CD4 T cells reduces GM-CSF production. GM-CSF and GPR65 may thus serve as targets for therapeutic intervention of spondyloarthritis.
The dystrophin protein-encoding DMD gene is the longest human gene. The 2.2 Mb long human dystrophin transcript takes 16 hours to be transcribed and is co-transcriptionally spliced. It contains long introns (24 over 10 kb long, 5 over 100 kb long), and the heterogeneity in intron size makes it an ideal transcript to study different aspects of the human splicing process. Splicing is a complex process and much is unknown regarding the splicing of long introns in human genes. Here, we used ultra-deep transcript sequencing to characterize splicing of the dystrophin transcripts in 3 different human skeletal muscle cell lines, and explored the order of intron removal and multi-step splicing. Coverage and read pair analyses showed that around 40% of the introns were not always removed sequentially. Additionally, for the first time, we report that non-consecutive intron removal resulted in 3 or more joined exons flanked by unspliced introns; we define these joined exons as an exon block. Lastly, computational and experimental data revealed that, for the majority of dystrophin introns, multi-step splicing events are used to splice out a single intron. Overall, our data show for the first time in a human transcript that multi-step intron removal is a general feature of mRNA splicing.
AS Th17 cells have a specific miR signature and upregulate miR-10b in vitro. Our data suggest that miR-10b is upregulated by proinflammatory cytokines and may act as a feedback loop to suppress IL-17A by targeting MAP3K7. miR-10b is a potential therapeutic candidate to suppress pathogenic Th17 cell function in patients with AS.
We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analyzing k-mer frequencies. We show that kPAL can detect technical artefacts such as high duplication rates, library chimeras, contamination and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0555-3) contains supplementary material, which is available to authorized users.
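The idea of comparing datasets through k-mer frequencies, as described in the kPAL abstract above, can be illustrated with a minimal sketch. This is not kPAL's code or API: the function names `kmer_profile`, `normalized_profile`, and `profile_distance` are hypothetical, and the choice of a Euclidean distance on normalized frequencies is an assumption made only for illustration.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"


def kmer_profile(sequences, k):
    """Count every overlapping k-mer made only of A/C/G/T in the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set(ALPHABET):
                counts[kmer] += 1
    return counts


def normalized_profile(counts, k):
    """Return frequencies for all 4**k k-mers in a fixed lexicographic order."""
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return [counts.get(kmer, 0) / total if total else 0.0 for kmer in all_kmers]


def profile_distance(counts_a, counts_b, k):
    """Euclidean distance between two normalized k-mer profiles (an assumed metric)."""
    freq_a = normalized_profile(counts_a, k)
    freq_b = normalized_profile(counts_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(freq_a, freq_b)))
```

Because the profiles are normalized, two libraries of different sequencing depth can be compared directly; a large distance between libraries expected to be similar would hint at a technical difference worth inspecting.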
Objective Multiple single-nucleotide polymorphisms (SNPs) conferring susceptibility to osteoarthritis (OA) mark imbalanced expression of positional genes in articular cartilage, reflected by unequally expressed alleles among heterozygotes (allelic imbalance [AI]). We undertook this study to explore the articular cartilage transcriptome from OA patients for AI events to identify putative disease-driving genetic variation. Methods AI was assessed in 42 preserved and 5 lesioned OA cartilage samples (from the Research Arthritis and Articular Cartilage study) for which RNA sequencing data were available. The count fraction of the alternative alleles among the alternative and reference alleles together (φ) was determined for heterozygous individuals. A meta-analysis was performed to generate a meta-φ and P value for each SNP with a false discovery rate (FDR) correction for multiple comparisons. To further validate AI events, we explored them as a function of multiple additional OA features. Results We observed a total of 2,070 SNPs that consistently marked AI of 1,031 unique genes in articular cartilage. Of these genes, 49 were found to be significantly differentially expressed (fold change <0.5 or >2, FDR <0.05) between preserved and paired lesioned cartilage, and 18 had previously been reported to confer susceptibility to OA and/or related phenotypes. Moreover, we identified notable highly significant AI SNPs in the CRLF1, WWP2, and RPS3 genes that were related to multiple OA features. Conclusion We present a framework and resulting data set for researchers in the OA research field to probe for disease-relevant genetic variation that affects gene expression in pivotal disease-affected tissue. This likely includes putative novel compelling OA risk genes such as CRLF1, WWP2, and RPS3.
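The allelic fraction φ defined in the Methods above (alternative-allele read count divided by alternative plus reference counts) can be computed as in the following sketch. The function names are hypothetical, and the exact two-sided binomial test against φ = 0.5 is an assumption added for illustration; the abstract does not state which per-sample test or meta-analysis method the authors used.

```python
import math


def allelic_fraction(alt_count, ref_count):
    """phi = alt / (alt + ref) for one heterozygous individual at one SNP."""
    total = alt_count + ref_count
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return alt_count / total


def binomial_two_sided_p(alt_count, ref_count, p_null=0.5):
    """Exact two-sided binomial P value under balanced expression (assumed test).

    Sums the probabilities of all outcomes that are no more likely than
    the observed alternative-allele count.
    """
    n = alt_count + ref_count
    pmf = [
        math.comb(n, i) * p_null ** i * (1 - p_null) ** (n - i)
        for i in range(n + 1)
    ]
    observed = pmf[alt_count]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)
```

For example, 30 alternative and 10 reference reads give φ = 0.75, whereas balanced expression would give φ close to 0.5.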