The first step in gene expression, transcription, is modulated by the interaction of transcription factors with their corresponding binding sites on the DNA sequence. Pscan is a software tool that scans a set of sequences (e.g. promoters) from co-regulated or co-expressed genes with motifs describing the binding specificity of known transcription factors and assesses which motifs are significantly over- or under-represented, thus providing hints about which transcription factors could be common regulators of the genes studied, together with the location of their candidate binding sites in the sequences. Pscan does not resort to comparisons with orthologous sequences, and experimental results show that it compares favorably to other tools for the same task in terms of false positive predictions and computation time. The website is free and open to all users and there is no login requirement. Address: http://www.beaconlab.it/pscan.
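The scanning-and-assessment idea in the abstract above can be illustrated with a minimal sketch. It is not Pscan's implementation: it assumes a log-odds scoring matrix built from hypothetical base counts, scans only the forward strand of uppercase ACGT sequences, and uses a z-test on the mean best-match score against background values supplied by the caller. All function names and parameters are illustrative.

```python
import math
from statistics import mean

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background.

    `counts` is a list of dicts, one per motif position, mapping each base to a count.
    """
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append(
            {b: math.log2(((column[b] + pseudocount) / total) / background) for b in BASES}
        )
    return matrix


def best_match(sequence, matrix):
    """Return (score, offset) of the best-scoring window on the forward strand."""
    width = len(matrix)
    best = (float("-inf"), -1)
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(column[base] for column, base in zip(matrix, window))
        if score > best[0]:
            best = (score, offset)
    return best


def z_score(sample_scores, background_mean, background_sd):
    """z statistic for the sample mean; positive means over-represented,
    negative means under-represented (background values are assumed known)."""
    n = len(sample_scores)
    return (mean(sample_scores) - background_mean) / (background_sd / math.sqrt(n))


# Hypothetical motif with consensus ACGT, built from made-up counts.
toy_counts = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]
toy_matrix = log_odds_matrix(toy_counts)
toy_score, toy_offset = best_match("TTACGTTT", toy_matrix)
```

In this sketch the best-match offset doubles as the candidate binding-site location in each sequence, and the sign of the z statistic distinguishes over- from under-representation. A real analysis would also scan the reverse strand and estimate the background mean and standard deviation from a large reference set of promoters.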
Summary: The SOX2 transcription factor is critical for neural stem cell (NSC) maintenance and brain development. Through chromatin immunoprecipitation (ChIP) and chromatin interaction analysis (ChIA-PET), we determined genome-wide SOX2-bound regions and Pol II-mediated long-range chromatin interactions in brain-derived NSCs. SOX2-bound DNA was highly enriched in distal chromatin regions interacting with promoters and carrying epigenetic enhancer marks. Sox2 deletion caused widespread reduction of Pol II-mediated long-range interactions and decreased gene expression. Genes showing reduced expression in Sox2-deleted cells were significantly enriched in interactions between promoters and SOX2-bound distal enhancers. Expression of one such gene, Suppressor of Cytokine Signaling 3 (Socs3), rescued the self-renewal defect of Sox2-ablated NSCs. Our work identifies SOX2 as a major regulator of gene expression through connections to the enhancer network in NSCs. Through the definition of such a connectivity network, our study shows the way to the identification of genes and enhancers involved in NSC maintenance and neurodevelopmental disorders.
Various next-generation sequencing (NGS) based strategies have been successfully used in the recent past for tracing origins and understanding the evolution of infectious agents, investigating the spread and transmission chains of outbreaks, as well as facilitating the development of effective and rapid molecular diagnostic tests and contributing to the hunt for treatments and vaccines. The ongoing COVID-19 pandemic poses one of the greatest global threats in modern history and has already caused severe social and economic costs. The development of efficient and rapid sequencing methods to reconstruct the genomic sequence of SARS-CoV-2, the etiological agent of COVID-19, has been fundamental for designing diagnostic molecular tests and devising effective measures and strategies to mitigate the spread of the pandemic. Diverse approaches and sequencing methods can, as testified by the number of available sequences, be applied to SARS-CoV-2 genomes. However, each technology and sequencing approach has its own advantages and limitations. In the current review, we will provide a brief, but hopefully comprehensive, account of currently available platforms and methodological approaches for the sequencing of SARS-CoV-2 genomes. We also present an outline of current repositories and databases that provide access to SARS-CoV-2 genomic data and associated metadata. Finally, we offer general advice and guidelines for the appropriate sharing and deposition of SARS-CoV-2 data and metadata, and suggest that more efficient and standardized integration of current and future SARS-CoV-2-related data would greatly facilitate the struggle against this new pathogen. We hope that our ‘vademecum’ for the production and handling of SARS-CoV-2-related sequencing data will contribute to this objective.
The CCAAT box is an important promoter element regulated by NF-Y, a conserved trimer with histone-like features. We describe a new Position Specific Frequency Matrix (PSFM): we derived the p-CCAAT from 328 NF-Y promoters from the literature, and refined it by analysing ChIP on chip data (g-CCAAT). Interestingly, g-CCAAT has distinct features, such as variations within the CCAAT pentanucleotide. We validated the NF-Y-dependency of several promoters with functional assays. We examined the presence of these PSFMs in all human promoters and detail a number of parameters of CCAAT boxes: position, orientation, distance from TSS, presence of TATA, CpG islands and enrichments of nearby TF elements. The CCAAT genes fall into different GO categories, with cell cycle and chromatin/transcription specifically enriched. Additional findings surfaced: (1) the CCAAT-TATA combination, often mentioned in textbooks, is an exception rather than the rule, and CCAAT promoters are less precise in terms of TSS; (2) there is a good correlation between CCAAT and CpG islands; (3) selective TF sites are enriched in CCAAT promoters, with precise stereoalignments of some of them. In conclusion, the new features of the CCAAT box and the link with the neighbouring elements will help in the functional classification of promoters.
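As a small illustration of what a Position Specific Frequency Matrix is, the sketch below builds one from a set of aligned, equal-length sites by computing per-position base frequencies. The four example sites are made up for the illustration and are not taken from the 328 promoters mentioned above; the function names are hypothetical.

```python
from collections import Counter


def build_psfm(sites, alphabet="ACGT"):
    """Return a list of per-position frequency dicts from aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    psfm = []
    for position in range(width):
        counts = Counter(site[position] for site in sites)
        psfm.append({base: counts[base] / len(sites) for base in alphabet})
    return psfm


def consensus(psfm):
    """Most frequent base at each position."""
    return "".join(max(column, key=column.get) for column in psfm)


# Made-up aligned sites, one of which varies inside the pentanucleotide.
example_sites = ["CCAAT", "CCAAT", "CCAAT", "CCGAT"]
example_psfm = build_psfm(example_sites)
```

Each column of the matrix sums to 1, so a position that tolerates variation (here the third one) shows up as a column whose frequency is split across bases.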
Background: MADS-domain transcription factors play important roles during plant development. The Arabidopsis MADS-box gene SHORT VEGETATIVE PHASE (SVP) is a key regulator of two developmental phases. It functions as a repressor of the floral transition during the vegetative phase and later it contributes to the specification of floral meristems. How these distinct activities are conferred by a single transcription factor is unclear, but interactions with other MADS-domain proteins that specify binding to different genomic regions are likely one mechanism. Results: To compare the genome-wide DNA binding profile of SVP during vegetative and reproductive development, we performed ChIP-seq analyses. These ChIP-seq data were combined with tiling array expression analysis, induction experiments and qRT-PCR to identify biologically relevant binding sites. In addition, we compared genome-wide target genes of SVP with those published for the MADS-domain transcription factors FLC and AP1, which interact with SVP during the vegetative and reproductive phases, respectively. Conclusions: Our analyses resulted in the identification of pathways that are regulated by SVP, including those controlling meristem development during vegetative growth and flower development, whereas floral transition pathways and hormonal signaling were regulated predominantly during the vegetative phase. Thus, SVP regulates many developmental pathways, some of which are common to both of its developmental roles whereas others are specific to only one of them.
Motif discovery has been one of the most widely studied problems in bioinformatics ever since genomic and protein sequences became available. In particular, its application to the de novo prediction of putative over-represented transcription factor binding sites in nucleotide sequences has been, and still is, one of the most challenging flavors of the problem. Recently, novel experimental techniques like chromatin immunoprecipitation (ChIP) have been introduced, permitting the genome-wide identification of protein–DNA interactions. ChIP, applied to transcription factors and coupled with genome tiling arrays (ChIP on Chip) or next-generation sequencing technologies (ChIP-Seq), has opened new avenues in research, as well as posed new challenges to bioinformaticians developing algorithms and methods for motif discovery.
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR, the European infrastructure for biological information, that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.
A comprehensive knowledge of all the factors involved in splicing, both proteins and RNAs, and of their interaction network is crucial for reaching a better understanding of this process and its functions. A large part of relevant information is buried in the literature or collected in various different databases. By hand-curated screenings of literature and databases, we retrieved experimentally validated data on 71 human RNA-binding splicing regulatory proteins and organized them into a database called ‘SpliceAid-F’ (http://www.caspur.it/SpliceAidF/). For each splicing factor (SF), the database reports its functional domains, its protein and chemical interactors and its expression data. Furthermore, we collected experimentally validated RNA–SF interactions, including relevant information on the RNA-binding sites, such as the genes where these sites lie, their genomic coordinates, the splicing effects, the experimental procedures used, as well as the corresponding bibliographic references. We also collected information from experiments showing no RNA–SF binding, at least in the assayed conditions. In total, SpliceAid-F contains 4227 interactions, 2590 RNA-binding sites and 1141 ‘no-binding’ sites, including information on cellular contexts and conditions where binding was tested. The data collected in SpliceAid-F can provide significant information to explain an observed splicing pattern as well as the effect of mutations in functional regulatory elements.