Alternative polyadenylation (APA) has been shown to play an important role in gene expression regulation in animals and plants. However, the extent of sense and antisense APA at the genome level is not known. We developed a deep-sequencing protocol that queries the junctions of 3′UTR and poly(A) tails and confidently maps the poly(A) tags to the annotated genome. The results of this mapping show that 70% of Arabidopsis genes use more than one poly(A) site, excluding microheterogeneity. Analysis of the poly(A) tags reveals extensive APA in introns and coding sequences, which can significantly alter transcript sequences and the proteins they encode. Although the interplay of intron splicing and polyadenylation potentially defines poly(A) site use in introns, the polyadenylation signals leading to the use of poly(A) sites in protein-coding regions (CDS) are distinct from the rest of the genome. Interestingly, a large number of poly(A) sites correspond to putative antisense transcripts that overlap with the promoter of the associated sense transcript, a mode previously demonstrated to regulate sense gene expression. Our results suggest that APA plays a far greater role in gene expression in plants than previously expected. Keywords: alternative processing | antisense transcription | nonstop mRNAs
We provide convincing evidence for a novel breast cancer locus at the APOBEC3 genes. This CNV is one of the strongest common genetic risk variants identified so far for breast cancer.
We performed a genome scan at an average resolution of 8 cM in 719 Finnish sib pairs with type 2 diabetes. Our strongest results are for chromosome 20, where we observe a weighted maximum LOD score (MLS) of 2.15 at map position 69.5 cM from pter and secondary weighted LOD-score peaks of 2.04 at 56.5 cM and 1.99 at 17.5 cM. Our next largest MLS is for chromosome 11 (MLS = 1.75 at 84.0 cM), followed by chromosomes 2 (MLS = 0.87 at 5.5 cM), 10 (MLS = 0.77 at 75.0 cM), and 6 (MLS = 0.61 at 112.5 cM), all under an additive model. When we condition on chromosome 2 at 8.5 cM, the MLS for chromosome 20 increases to 5.50 at 69.0 cM (P=.0014). An ordered-subsets analysis based on families with high or low diabetes-related quantitative traits yielded results that support the possible existence of disease-predisposing genes on chromosomes 6 and 10. Genomewide linkage-disequilibrium analysis using microsatellite marker data revealed strong evidence of association for D22S423 (P=.00007). Further analyses are being carried out to confirm and to refine the location of these putative diabetes-predisposing genes.
Natural products (secondary metabolites) are a rich source of compounds with important biological activities. Eliciting pathway expression is always challenging but extremely important in natural product discovery, because each pathway is tightly controlled through its own regulatory mechanism and hence often remains silent under routine culturing conditions. To overcome the drawback of the traditional approaches, which lack general applicability, we developed a simple synthetic biology approach that decouples pathway expression from complex native regulation. Briefly, the entire silent biosynthetic pathway is refactored using a plug-and-play scaffold and a set of heterologous promoters that are functional in a heterologous host under the target culturing condition. Using this strategy, we successfully awakened the silent spectinabilin pathway from Streptomyces orinoci. This strategy bypasses the traditional laborious processes to elicit pathway expression and represents a new platform for discovering novel natural products.
Genetic factors play an important role in the etiology of breast cancer. We carried out a multi-stage genome-wide association (GWA) study in over 28,000 cases and controls recruited from 12 studies conducted in Asian and European American women to identify genetic susceptibility loci for breast cancer. After analyzing 684,457 SNPs in 2,073 cases and 2,084 controls in Chinese women, we evaluated 53 SNPs for fast-track replication in an independent set of 4,425 cases and 1,915 controls of Chinese origin. Four replicated SNPs were further investigated in an independent set of 6,173 cases and 6,340 controls from seven other studies conducted in Asian women. SNP rs4784227 was consistently associated with breast cancer risk across all studies, with an adjusted odds ratio (95% confidence interval) of 1.25 (1.20−1.31) per allele (P = 3.2×10⁻²⁵) in the pooled analysis of all Asian samples. This SNP was also associated with breast cancer risk among European Americans (per allele OR = 1.19, 95% CI = 1.09−1.31, P = 1.3×10⁻⁴, 2,797 cases and 2,662 controls). SNP rs4784227 is located at 16q12.1, a region identified previously for breast cancer risk among Europeans. The association of this SNP with breast cancer risk remained highly statistically significant in Asians after adjusting for previously reported SNPs in this region. In vitro experiments using both luciferase reporter and electrophoretic mobility shift assays demonstrated functional significance of this SNP. These results provide strong evidence implicating rs4784227 as a functional causal variant for breast cancer at the 16q12.1 locus and demonstrate the utility of conducting genetic association studies in populations with different genetic architectures.
This large-scale epidemiological survey provides an estimate of the burden of rheumatic diseases in China.
Background: The red flour beetle Tribolium castaneum has emerged as an important model organism for the study of gene function in development and physiology, for ecological and evolutionary genomics, for pest control and a plethora of other topics. RNA interference (RNAi), transgenesis and genome editing are well established, and the resources for genome-wide RNAi screening have become available in this model. All these techniques depend on a high-quality genome assembly and precise gene models. However, the first version of the genome assembly was generated by Sanger sequencing, and only a small set of RNA sequence data was available, which limited annotation quality. Results: Here, we present an improved genome assembly (Tcas5.2) and an enhanced genome annotation resulting in a new official gene set (OGS3) for Tribolium castaneum, which significantly increase the quality of the genomic resources. By adding large-distance jumping library DNA sequencing to join scaffolds and fill small gaps, the gaps in the genome assembly were reduced and the N50 increased to 4753 kbp. The precision of the gene models was enhanced by the use of a large body of RNA-Seq reads from different life history stages and tissue types, leading to the discovery of 1452 novel gene sequences. We also added new features such as alternative splicing, well-defined UTRs and microRNA target predictions. For quality control, 399 gene models were evaluated by manual inspection. The current gene set was submitted to GenBank and accepted as a RefSeq genome by NCBI. Conclusions: The new genome assembly (Tcas5.2) and the official gene set (OGS3) provide enhanced genomic resources for genetic work in Tribolium castaneum. The much improved information on transcription start sites supports transgenic and gene editing approaches. Further, novel types of information such as splice variants and microRNA target genes open additional possibilities for analysis.
To understand nuclear mRNA polyadenylation mechanisms in the model alga Chlamydomonas reinhardtii, we generated a data set of 16,952 in silico-verified poly(A) sites from EST sequencing traces based on Chlamydomonas Genome Assembly v.3.1. Analysis of this data set revealed a unique and complex polyadenylation signal profile that sets Chlamydomonas apart from other organisms. In contrast to the high-AU content in the 3′-UTRs of other organisms, Chlamydomonas shows a high-guanylate content that transitions to high-cytidylate around the poly(A) site. The average length of the 3′-UTR is 595 nucleotides (nt), significantly longer than that of Arabidopsis and rice. The dominant poly(A) signal, UGUAA, was found in 52% of the near-upstream elements, and its occurrence may be positively correlated with higher gene expression levels. The UGUAA signal also exists in Arabidopsis and in some mammalian genes but mainly in the far-upstream elements, suggesting a shift in function. The C-rich region after poly(A) sites with unique signal elements is a characteristic downstream element that is lacking in higher plants. We also found a high level of alternative polyadenylation in the Chlamydomonas genome, with a range of up to 33% of the 4057 genes analyzed having at least two unique poly(A) sites and $1% of these genes having poly(A) sites residing in predicted coding sequences, introns, and 5′-UTRs. These potentially contribute to transcriptome diversity and gene expression regulation.