Here, we confirm LBX1 as a susceptibility gene for idiopathic scoliosis in a Scandinavian population and report that we are unable to find evidence of other genes of similar or stronger effect.
RNA polymerase II (RNAPII) transcription converts the DNA sequence of a single gene into multiple transcript isoforms that may carry alternative functions. Gene isoforms result from variable transcription start sites (TSSs) at the beginning and polyadenylation sites (PASs) at the end of transcripts. How alternative TSSs relate to variable PASs is poorly understood. Here, we identify both ends of RNA molecules in Arabidopsis thaliana by transcription isoform sequencing (TIF-seq) and report four transcript isoforms per expressed gene. While intragenic initiation represents a large source of regulated isoform diversity, we observe that ~14% of expressed genes generate relatively unstable short promoter-proximal RNAs (sppRNAs) from nascent transcript cleavage and polyadenylation shortly after initiation. The location of sppRNAs correlates with the position of promoter-proximal RNAPII stalling, indicating that large pools of promoter-stalled RNAPII may engage in transcriptional termination. We propose that promoter-proximal RNAPII stalling linked to premature transcriptional termination may represent a checkpoint that governs plant gene expression.
Cryptic transcription is widespread and generates a heterogeneous group of RNA molecules of unknown function. To improve our understanding of cryptic transcripts, we investigated their transcription start site (TSS) usage, chromatin organization, and posttranscriptional consequences in Saccharomyces cerevisiae. We show that TSSs of chromatin-sensitive internal cryptic transcripts retain features comparable to those of canonical TSSs in terms of DNA sequence, directionality, and chromatin accessibility. We define the 5′ and 3′ boundaries of cryptic transcripts and show that, contrary to RNA degradation–sensitive ones, they often overlap with the end of the gene, thereby using the canonical polyadenylation site, and associate with polyribosomes. We show that chromatin-sensitive cryptic transcripts can be recognized by ribosomes and may produce truncated polypeptides from downstream, in-frame start codons. Finally, we confirm the presence of the predicted polypeptides by reanalyzing N-terminal proteomic data sets. Our work suggests that a fraction of chromatin-sensitive internal cryptic promoters initiates the transcription of alternative truncated mRNA isoforms. The expression of these chromatin-sensitive isoforms is conserved from yeast to human, expanding the functional consequences of cryptic transcription and proteome complexity.
High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Two major questions concerning the pooled sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) data and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants on the Illumina array had identical MAFs in the sequencing and individual genotyping data, and 26% differed by one allele. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies.
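As a rough illustration of the comparison described in this abstract, the sketch below estimates a MAF from pooled read counts, computes the MAF from individual genotypes, expresses their difference in whole alleles, and computes a Pearson correlation. It is a minimal sketch and not the study's pipeline: the function names, the read-fraction estimator, and the reading of "one allele difference" as a difference of one chromosome in a diploid pool are all assumptions made for illustration.

```python
from math import sqrt


def pooled_maf(alt_reads, total_reads):
    """Estimate a MAF from a pool as the fraction of reads carrying the
    alternate allele, folded so the result is the minor allele (assumed estimator)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    freq = alt_reads / total_reads
    return min(freq, 1.0 - freq)


def genotype_maf(alt_allele_count, n_individuals):
    """MAF from individual genotypes of diploid samples (2 alleles per person)."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    freq = alt_allele_count / (2 * n_individuals)
    return min(freq, 1.0 - freq)


def allele_count_difference(maf_pooled, maf_genotyped, n_individuals):
    """Difference between two MAFs in whole alleles for a pool of
    n_individuals diploid samples (assumed meaning of 'one allele difference')."""
    chromosomes = 2 * n_individuals
    return abs(round(maf_pooled * chromosomes) - round(maf_genotyped * chromosomes))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

With a hypothetical pool of 20 individuals, 5 alternate reads out of 100 and 2 alternate alleles among 40 genotyped chromosomes both correspond to a MAF of 0.05, so the difference in alleles is zero. In practice, read-fraction estimates are also affected by unequal DNA contribution per sample and by sequencing error, which this sketch ignores.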
A Swedish pedigree with an autosomal dominant inheritance of idiopathic scoliosis was initially studied by genetic linkage analysis, prioritising genomic regions for further analysis. This revealed a locus on chromosome 1 with a putative risk haplotype shared by all affected individuals. Two affected individuals were subsequently exome-sequenced, identifying a rare, non-synonymous variant in the CELSR2 gene. This variant is rs141489111, a c.G6859A change in exon 21 (NM_001408), leading to a predicted p.V2287I (NP_001399.1) change. This variant was found in all affected members of the pedigree, but showed reduced penetrance. Analysis of tagging variants in CELSR1-3 in a set of 1739 Swedish-Danish scoliosis cases and 1812 controls revealed significant association (p = 0.0001) with rs2281894, a common synonymous variant in CELSR2. This association was not replicated in case-control cohorts from Japan and the US. No association was found with variants in CELSR1 or CELSR3. Our findings suggest a rare variant in CELSR2 as causative for idiopathic scoliosis in a family with dominant segregation and further highlight common variation in CELSR2 in general susceptibility to idiopathic scoliosis in the Swedish-Danish population. Both variants are located in the highly conserved GAIN protein domain, which is necessary for the auto-proteolysis of CELSR2, suggesting its functional importance.
Pre-eclampsia is a common pregnancy disorder that is a major cause of maternal and perinatal mortality and morbidity. Variants predisposing to pre-eclampsia might be under negative evolutionary selection that is likely to keep their population frequencies low. We exome sequenced samples from a hundred Finnish pre-eclamptic women in pools of ten to screen for low-frequency, large-effect risk variants for pre-eclampsia. After filtering and additional genotyping steps, we selected 28 low-frequency missense, nonsense and splice site variants that were enriched in the pre-eclampsia pools compared to reference data, and genotyped the variants in 1353 pre-eclamptic and 699 non-pre-eclamptic women to test their association with pre-eclampsia and with quantitative traits relevant to the disease. Genotypes from the SISu project (n = 6118 exome sequenced Finnish samples) were included in the binary trait association analysis as a population reference to increase statistical power. In these analyses, none of the variants tested reached genome-wide significance. In conclusion, the genetic risk for pre-eclampsia is likely complex even in a population isolate like Finland, and larger sample sizes will be necessary to detect risk variants.
The ribonucleolytic exosome complex is central to nuclear RNA degradation, primarily targeting non-coding RNAs. Still, the nuclear exosome could have protein-coding (pc) gene-specific regulatory activities. By depleting an exosome core component, or components of exosome adaptor complexes, we identify ∼2900 transcription start sites (TSSs) from within pc genes that produce exosome-sensitive transcripts. At least 1000 of these overlap with annotated mRNA TSSs and a considerable portion of their transcripts share the annotated mRNA 3′ end. We identify two types of pc genes, both employing a single, annotated TSS across cells, but the first type primarily produces full-length, exosome-sensitive transcripts, whereas the second primarily produces prematurely terminated transcripts. Genes within the former type often encode immediate early response transcription factors, while genes within the latter are likely transcribed as a consequence of their proximity to upstream TSSs on the opposite strand. Conversely, when genes have multiple active TSSs, alternative TSSs that produce exosome-sensitive transcripts typically do not contribute substantially to overall gene expression, and most such transcripts are prematurely terminated. Our results reveal a complex landscape of sense transcription within pc genes and imply a direct role for nuclear RNA turnover in the regulation of a subset of pc genes.