Summary: Protein-RNA interactions play critical roles in all aspects of gene expression. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova2 revealed extremely reproducible RNA binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3′ UTRs, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.
Nova proteins are neuron-specific alternative splicing factors. We have combined bioinformatics, biochemistry and genetics to derive an RNA map describing the rules by which Nova proteins regulate alternative splicing. This map revealed that the position of Nova binding sites (YCAY clusters) in a pre-messenger RNA determines the outcome of splicing. The map correctly predicted whether Nova inhibits or enhances exon inclusion, which led us to examine the relationship between the map and Nova's mechanism of action. Nova binding to an exonic YCAY cluster changed the protein complexes assembled on pre-mRNA, blocking U1 snRNP (small nuclear ribonucleoprotein) binding and exon inclusion, whereas Nova binding to an intronic YCAY cluster enhanced spliceosome assembly and exon inclusion. Assays of splicing intermediates of Nova-regulated transcripts in mouse brain revealed that Nova preferentially regulates removal of introns harbouring (or closest to) YCAY clusters. These results define a genome-wide map relating the position of a cis-acting element to its regulation by an RNA binding protein, namely that Nova binding to YCAY clusters results in a local and asymmetric action to regulate spliceosome assembly and alternative splicing in neurons.
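The notion of a YCAY cluster can be made concrete with a small scan. The sketch below is only an illustration, not the authors' method: it treats Y as the IUPAC pyrimidine code (C or U, with T accepted for DNA notation), and the `window` and `min_sites` thresholds are hypothetical values chosen for the example.

```python
import re

# Y is the IUPAC code for a pyrimidine: C or U (T in DNA notation).
# The lookahead lets overlapping matches such as UCAUCAU be reported.
YCAY = re.compile(r"(?=([CUT]CA[CUT]))")


def ycay_positions(seq):
    """Return 0-based start positions of every YCAY match, overlaps included."""
    return [m.start() for m in YCAY.finditer(seq.upper())]


def ycay_clusters(seq, window=30, min_sites=3):
    """Return (start, end, n_sites) for regions dense in YCAY sites.

    A region qualifies when min_sites consecutive sites fit inside `window`
    nucleotides; overlapping regions are merged. Both thresholds are
    illustrative assumptions, not values taken from the abstract.
    """
    pos = ycay_positions(seq)
    merged = []
    for i in range(len(pos) - min_sites + 1):
        start = pos[i]
        end = pos[i + min_sites - 1] + 4  # end is exclusive
        if end - start > window:
            continue
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return [
        (s, e, sum(1 for p in pos if s <= p and p + 4 <= e))
        for s, e in merged
    ]


if __name__ == "__main__":
    print(ycay_clusters("GGGUCAUUCAUUCAUGGG"))
```

Whether a reported cluster lies in an exon or a flanking intron would then be decided against an annotation, which is outside this sketch.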
Unusually for a eukaryote, genes transcribed by RNA polymerase II (pol II) in Trypanosoma brucei are arranged in polycistronic transcription units. With one exception, no pol II promoter motifs have been identified, and how transcription is initiated remains an enigma. T. brucei has four histone variants: H2AZ, H2BV, H3V, and H4V. Using chromatin immunoprecipitation (ChIP) and sequencing (ChIP-seq) to examine the genome-wide distribution of chromatin components, we show that histones H4K10ac, H2AZ, H2BV, and the bromodomain factor BDF3 are enriched up to 300-fold at probable pol II transcription start sites (TSSs). We also show that nucleosomes containing H2AZ and H2BV are less stable than canonical nucleosomes. Our analysis also identifies >60 unexpected TSS candidates and reveals the presence of long guanine runs at probable TSSs. Apparently unique to trypanosomes, additional histone variants H3V and H4V are enriched at probable pol II transcription termination sites. Our findings suggest that histone modifications and histone variants play crucial roles in transcription initiation and termination in trypanosomes and that destabilization of nucleosomes by histone variants is an evolutionarily ancient and general mechanism of transcription initiation, demonstrated in an organism in which general pol II transcription factors have been elusive. The protozoan parasite Trypanosoma brucei branched early in eukaryotic evolution and is probably the most divergent well-studied eukaryote. Many discoveries of general interest have been made in T. brucei, emphasizing its value for understanding the evolution of core molecular processes. Transcription of protein-coding genes by RNA polymerase II (pol II) in T. brucei differs in two important aspects from most other eukaryotes. First, transcription is polycistronic: Arrays of sometimes >100 genes are transcribed in polycistronic transcription units (PTUs).
This organization is reminiscent of operons in prokaryotes except that there is no evidence to suggest clustering of functionally related genes in T. brucei. Second, mRNAs are separated post-transcriptionally by coupled splicing and polyadenylation reactions that add a 39-nucleotide (nt) "spliced-leader" to every mRNA (for review, see Liang et al. 2003). Within a PTU, genes are transcribed from the same strand, but transcription of two neighboring PTUs can either be convergent or divergent. The regions between PTUs are referred to as strand switch regions (SSRs). Strand-specific nuclear run-on assays performed in Leishmania (Martinez-Calvillo et al. 2003), a genus related to T. brucei, have shown that pol II transcription starts at SSRs between two transcriptionally divergent PTUs (divergent SSRs) and ends at SSRs between two transcriptionally convergent PTUs (convergent SSRs). Because 75% of all Leishmania major genes can be found in the same genomic context in T. brucei (El-Sayed et al. 2005), indicating a high degree of synteny, it is reasonable to hypothesize that divergent SSRs in T. brucei are transcription start s...
Transcription of protein-coding genes in trypanosomes is polycistronic and gene expression is primarily regulated by post-transcriptional mechanisms. Sequence motifs in the untranslated regions regulate mRNA trans-splicing and RNA stability, yet where UTRs begin and end is known for very few genes. We used high-throughput RNA-sequencing to determine the genome-wide steady-state mRNA levels (‘transcriptomes’) for ∼90% of the genome in two stages of the Trypanosoma brucei life cycle cultured in vitro. Almost 6% of genes were differentially expressed between the two life-cycle stages. We identified 5′ splice-acceptor sites (SAS) and polyadenylation sites (PAS) for 6959 and 5948 genes, respectively. Most genes have between one and three alternative SAS, but PAS are more dispersed. For 488 genes, SAS were identified downstream of the originally assigned initiator ATG, so a subsequent in-frame ATG presumably designates the start of the true coding sequence. In some cases, alternative SAS would give rise to mRNAs encoding proteins with different N-terminal sequences. We could identify the introns in two genes known to contain them, but found no additional genes with introns. Our study demonstrates the usefulness of the RNA-seq technology to study the transcriptional landscape of an organism whose genome has not been fully annotated.
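The re-assignment of start codons described above (a splice-acceptor site downstream of the annotated ATG implies a later in-frame ATG) can be sketched as a simple codon walk. This is a minimal illustration under stated assumptions, not the study's pipeline: the function name, the 0-based coordinates, and the choice to stop at the first in-frame stop codon are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def next_in_frame_atg(seq, annotated_atg, sas):
    """Return the 0-based index of the start codon implied by a splice site.

    seq           -- sense-strand DNA containing the gene (assumption)
    annotated_atg -- index of the A of the originally annotated ATG
    sas           -- index of the first nucleotide retained in the mature
                     mRNA, i.e. just after the splice-acceptor site
    If the splice site lies at or upstream of the annotated ATG, the
    annotation stands. Otherwise walk codon by codon, in the annotated
    frame, from the first codon that starts at or after the splice site.
    Returns None if a stop codon or the sequence end comes first.
    """
    seq = seq.upper()
    if sas <= annotated_atg:
        return annotated_atg
    # Round the distance up to the next multiple of three to stay in frame.
    offset = (sas - annotated_atg + 2) // 3 * 3
    for i in range(annotated_atg + offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return None
        if codon == "ATG":
            return i
    return None


if __name__ == "__main__":
    # Splice site at index 4 falls inside the annotated ORF,
    # so the start moves to the next in-frame ATG.
    print(next_in_frame_atg("ATGAAACCCATGGGGTAA", annotated_atg=0, sas=4))
```

For genes with several alternative splice-acceptor sites, calling the function once per site would show which sites leave the annotated N-terminus intact and which would shorten it.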
Purpose: Colorectal cancer (CRC) is one of the most common malignant tumors worldwide. This study aimed to explore the prognostic value of lncRNAs in CRC. Material and methods: We performed gene expression profiling to identify differentially expressed lncRNAs between 51 normal and 646 tumor tissues from The Cancer Genome Atlas database. Cox regression and robust likelihood-based survival models were used to find prognosis-related lncRNAs. A lncRNA signature was developed to predict the overall survival of patients with CRC. In addition, a receiver operating characteristic curve analysis was performed to identify the optimal cutoff with the best Youden index to divide patients into different groups based on risk level. Results: Eighty survival-related lncRNAs were identified and a 15-lncRNA signature was developed on the basis of a risk score to comprehensively predict the overall survival of patients with CRC. The prognostic value of the 15-lncRNA risk score was validated using the internal testing set and total set. The risk indicator was shown to be an independent prognostic factor (hazard ratio = 2.92; 95% CI: 1.73–4.94; P<0.001). Notably, all 15 lncRNAs (AC024581.1, FOXD3-AS1, AC012531.1, AC003101.2, LINC01219, AC083967.1, AL590483.1, AC105118.1, AC010789.1, AC067930.5, AC105219.2, LINC01354, LINC02474, LINC02257, and AC079612.1) were newly found to correlate with the prognosis of patients with CRC. Furthermore, the function of the 15 lncRNAs was explored through the ceRNA network. These lncRNAs regulated coding genes that were involved in many key cancer pathways. Conclusion: A 15-lncRNA expression signature was discovered as a prognostic indicator for patients with CRC; these lncRNAs may act as competing endogenous RNAs (ceRNAs) to play a crucial role in the modulation of cancer-related pathways. These findings may allow a better understanding of the prognostic value of lncRNAs.
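The cutoff selection by Youden index mentioned in the methods can be illustrated with a few lines of plain Python. This is a sketch under assumptions rather than the study's code: the abstract does not give the risk-score formula, so the weighted sum in `risk_score` (coefficients times expression values) is the conventional form and is assumed here, and all gene names and numbers in the usage example are made up.

```python
def risk_score(coefficients, expression):
    """Weighted sum of expression values (assumed, conventional form).

    coefficients -- dict mapping a gene name to its model coefficient
    expression   -- dict mapping a gene name to the patient's expression
    """
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    scores -- one risk score per patient
    labels -- 1 if the patient had the event, 0 otherwise
    A patient is called high risk when score >= cutoff. Every observed
    score is tried as a candidate cutoff; ties keep the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Made-up scores and outcomes, for illustration only.
    cutoff, j = youden_cutoff([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
    print(cutoff, j)
```

Patients whose score reaches the returned cutoff would form the high-risk group and the rest the low-risk group.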
The role of C-X-C motif chemokine 10 (CXCL10), a pro-inflammatory factor, in the development of acute respiratory distress syndrome (ARDS) remains unclear. In this study, we explored the role of CXCL10 and the effect of CXCL10 neutralization in lipopolysaccharide (LPS)-induced ARDS in rats. The expression of CXCL10 and its receptor, chemokine receptor 3 (CXCR3), increased after LPS induction. Moreover, neutralization of CXCL10 ameliorated the severity of ARDS by reducing pulmonary edema, inhibiting the release of inflammatory mediators (IFN-γ, IL-6 and ICAM-1) and limiting the influx of inflammatory cells (neutrophils, macrophages, CD8+ T cells) into the lung, with a reduction in CXCR3 expression in neutrophils and macrophages. Therefore, CXCL10 could be a potential therapeutic target in LPS-induced ARDS.
We use methods from Data Mining and Knowledge Discovery to design an algorithm for detecting motifs in protein sequences. The algorithm assumes that a motif is constituted by the presence of a "good" combination of residues in appropriate locations of the motif. The algorithm attempts to compile such good combinations into a "pattern dictionary" by processing an aligned training set of protein sequences. The dictionary is subsequently used to detect motifs in new protein sequences. Statistical significance of the detection results is ensured by statistically determining the various parameters of the algorithm. Based on this approach, we have implemented a program called GYM. The Helix-Turn-Helix motif was used as a model system on which to test our program. The program was also extended to detect Homeodomain motifs. The detection results for the two motifs compare favorably with existing programs. In addition, the GYM program provides a lot of useful information about a given protein sequence.
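The idea of a pattern dictionary can be sketched in a simplified form. The code below is not the GYM algorithm, whose details the abstract does not give; it assumes, purely for illustration, that a "good combination" is a pair of (position, residue) items seen together in at least `min_support` aligned training sequences, and that a window is reported when it contains at least `min_score` such pairs. All function names and thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_dictionary(aligned, min_support=2):
    """Collect pairs of (position, residue) items frequent in the training set.

    aligned -- equal-length, pre-aligned motif sequences
    A pair is kept when it occurs in at least min_support sequences.
    """
    counts = Counter()
    for seq in aligned:
        items = list(enumerate(seq))  # (position, residue), in position order
        counts.update(combinations(items, 2))
    return {pair for pair, n in counts.items() if n >= min_support}


def score_window(window, dictionary):
    """Count the dictionary pairs that are fully present in the window."""
    items = set(enumerate(window))
    return sum(1 for a, b in dictionary if a in items and b in items)


def scan(seq, dictionary, length, min_score):
    """Return start indices of windows whose score reaches min_score."""
    return [
        i
        for i in range(len(seq) - length + 1)
        if score_window(seq[i:i + length], dictionary) >= min_score
    ]


if __name__ == "__main__":
    # Toy three-residue "motif"; real motifs are far longer.
    patterns = build_dictionary(["ACD", "ACE", "GCD"])
    print(scan("KKACDKK", patterns, length=3, min_score=2))
```

In a realistic setting the support and score thresholds would be set from the statistics of the training data rather than picked by hand, which is the role the abstract assigns to its statistically determined parameters.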