Background: Transcription factors (TFs) are important regulatory proteins that govern transcriptional regulation. It is now known that in higher organisms different TFs have to cooperate, rather than act individually, in order to control complex genetic programs. Identifying these interactions is an important challenge for understanding the molecular mechanisms that regulate biological processes. In this study, we present a new method based on pointwise mutual information, PC-TraFF, which treats the genome as a document, the sequences as sentences, and TF binding sites (TFBSs) as words in order to identify interacting TFs in a set of sequences.
Results: To demonstrate the effectiveness of PC-TraFF, we performed a genome-wide analysis and an analysis of breast cancer-associated sequence sets for protein-coding and miRNA genes. Our results show that in each of these sequence sets PC-TraFF is able to identify important interacting TF pairs, for most of which we found support in previously published experimental results. Further, we made a pairwise comparison between PC-TraFF and three conventional methods. The outcome of this comparison strongly suggests that all these methods focus on different important aspects of the interaction between TFs, and thus the pairwise overlap between any two of them is only marginal.
Conclusions: In this study, adopting an idea from linguistics for use in bioinformatics, we developed a new information-theoretic method, PC-TraFF, for the identification of potentially collaborating transcription factors based on the idiosyncrasy of their binding site distributions on the genome. The results of our study show that PC-TraFF can successfully identify known interacting TF pairs, and thus its currently biologically unconfirmed predictions could provide new hypotheses for further experimental validation. Additionally, the comparison of the results of PC-TraFF with those of previous methods demonstrates that different methods, each with its specific scope, can supplement each other well. Overall, our analyses indicate that PC-TraFF is a time-efficient method whose algorithm has tractable computation time and memory consumption. The PC-TraFF server is freely accessible at http://pctraff.bioinf.med.uni-goettingen.de/
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0827-2) contains supplementary material, which is available to authorized users.
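The document/sentence/word analogy above can be illustrated with a minimal Python sketch of pointwise mutual information, PMI(a, b) = log2(P(a, b) / (P(a) P(b))), computed over a set of sequences. This is a simplified presence/absence illustration and not the PC-TraFF implementation: the function name `pmi_scores`, the toy TF labels, and the choice to count a TF at most once per sequence are all assumptions made here, and any further constraints or score transformations used by the published method are omitted.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sequences):
    """Return PMI for every TF pair that co-occurs in at least one sequence.

    `sequences` is a list of sets; each set holds the (hypothetical) names of
    TFs with at least one predicted binding site in that sequence.
    """
    n = len(sequences)
    single = Counter()  # number of sequences containing each TF
    pair = Counter()    # number of sequences containing both TFs of a pair
    for tfs in sequences:
        tfs = set(tfs)
        single.update(tfs)
        pair.update(combinations(sorted(tfs), 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        p_ab = n_ab / n
        p_a = single[a] / n
        p_b = single[b] / n
        # PMI > 0: the pair co-occurs more often than expected under independence.
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy data (assumed, for illustration only): four sequences, three TF labels.
toy_sequences = [{"TF_A", "TF_B"}, {"TF_A", "TF_B"}, {"TF_C"}, {"TF_C"}]
toy_scores = pmi_scores(toy_sequences)
print(toy_scores)
```

In this toy set, TF_A and TF_B each occur in half of the sequences and always together, so their PMI is log2(0.5 / 0.25) = 1. Pairs that never co-occur are simply absent from the result instead of being assigned a score of negative infinity.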
It is now well known that in eukaryotic cells the complex interplay of transcription factors (TFs) bound to the DNA of promoters and enhancers is the basis for precise and specific control of transcription. Computational methods have been developed for the identification of potentially cooperating TFs through the co-occurrence of their binding sites (TFBSs). One challenge for these methods is distinguishing TFBS pairs that are specific to a given sequence set from those that appear ubiquitously, which makes the results highly dependent on the choice of a proper background set. Here, we present an extension of our previous PC-TraFF approach that estimates the background co-occurrence of any TF pair while preserving the (oligo-)nucleotide composition and, thus, the core of the TFBSs in the sequences of interest. Applying our approach to a simulated data set with implanted TFBS pairs, we could successfully identify them as sequence-set specific under a variety of conditions. When we analyzed the gene expression data sets of five breast cancer-associated subtypes, the number of overlapping pairs could be dramatically reduced in comparison to our previous approach. As a result, we could identify potentially cooperating transcriptional regulators that are characteristic of each of the five breast cancer subtypes. This indicates that our approach is able to discriminate specific potential TF cooperations from ubiquitously occurring combinations. The results obtained with our method may help to understand the genetic programs governing specific biological processes such as the development of different tumor types.
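The idea of pairs that are characteristic of one sequence set, as opposed to pairs shared across sets, can be sketched as a simple set comparison. The Python snippet below is only an illustration under stated assumptions: the function name `set_specific_pairs`, the subtype labels, and the TF labels are hypothetical, and the per-subtype pair lists are assumed to have been produced beforehand by some pair-detection method. It does not reproduce the background estimation described above.

```python
def set_specific_pairs(pairs_by_set):
    """Return, for each sequence set, the TF pairs found in no other set.

    `pairs_by_set` maps a set label to a collection of unordered TF pairs,
    each represented as a frozenset of two (hypothetical) TF names.
    """
    specific = {}
    for label, pairs in pairs_by_set.items():
        # Union of the pairs reported for every other sequence set.
        others = set().union(
            *(p for other, p in pairs_by_set.items() if other != label)
        )
        specific[label] = set(pairs) - others
    return specific


# Toy input (assumed, for illustration only).
toy_pairs = {
    "subtype_1": {frozenset({"TF_A", "TF_B"}), frozenset({"TF_C", "TF_D"})},
    "subtype_2": {frozenset({"TF_C", "TF_D"}), frozenset({"TF_E", "TF_F"})},
}
toy_specific = set_specific_pairs(toy_pairs)
for label, pairs in toy_specific.items():
    print(label, sorted(sorted(p) for p in pairs))
```

Here the pair TF_C/TF_D is reported for both toy subtypes and is therefore dropped, leaving one characteristic pair per subtype. Representing pairs as frozensets makes the comparison independent of the order in which the two TFs are listed.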
Transcription factors (TFs) are a special class of DNA-binding proteins that orchestrate gene transcription by recruiting other TFs, co-activators or co-repressors. Their combinatorial interplay in higher organisms maintains homeostasis and governs cell identity by finely controlling and regulating tissue-specific gene expression. Despite the rich literature on the importance of cooperative TFs for deciphering the mechanisms of individual regulatory programs that control tissue specificity in several organisms such as human, mouse, or Drosophila melanogaster, there is still a need for a comprehensive study to detect specific TF cooperations in the regulatory processes of cattle tissues. To address this lack of knowledge about specific combinatorial gene regulation in cattle tissues, we made use of three publicly available RNA-seq datasets and obtained tissue-specific gene (TSG) sets for ten tissues (heart, lung, liver, kidney, duodenum, muscle tissue, adipose tissue, colon, spleen and testis). By analyzing these TSG sets, we identified tissue-specific TF cooperations for each tissue. The results reveal that, similar to the combinatorial regulatory events of model organisms, TFs change their partners depending on their biological functions in different tissues. This phenomenon is particularly evident in the preferential partner choice of the transcription factors STAT3 and NR2C2, each of which has five different specific cooperation partners in multiple tissues. The information about cooperative TFs could be promising: i) to understand the molecular mechanisms of regulatory processes; and ii) to extend the existing knowledge on the importance of single TFs in cattle tissues.