2015
DOI: 10.1186/s12859-015-0797-4

Inferring intra-motif dependencies of DNA binding sites from ChIP-seq data

Abstract: Background: Statistical modeling of transcription factor binding sites is one of the classical fields in bioinformatics. The position weight matrix (PWM) model, which assumes statistical independence among all nucleotides in a binding site, used to be the standard model for this task for more than three decades but its simple assumptions are increasingly put into question. Recent high-throughput sequencing methods have provided data sets of sufficient size and quality for studying the benefits of more complex mo…
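The independence assumption the abstract attributes to the PWM model can be illustrated with a minimal sketch. The toy binding sites, the pseudocount, and the function names `estimate_pwm` and `pwm_log_likelihood` below are assumptions made for illustration and are not taken from the paper.

```python
import math

ALPHABET = "ACGT"
# Hypothetical aligned binding sites of equal length (illustrative only).
TOY_SITES = ["ACGT", "ACGA", "TCGT", "ACGT"]

def estimate_pwm(sites, pseudocount=1.0):
    """Estimate one nucleotide distribution per position, with a pseudocount."""
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm

def pwm_log_likelihood(pwm, seq):
    """Sum per-position log-probabilities: positions are treated as independent."""
    return sum(math.log(column[base]) for column, base in zip(pwm, seq))

pwm = estimate_pwm(TOY_SITES)
```

Because the log-likelihood is a plain sum over positions, a PWM cannot express that a nucleotide at one position makes a particular nucleotide at another position more or less likely, which is the limitation that models of intra-motif dependencies address.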

Cited by 38 publications (28 citation statements)
References 44 publications (59 reference statements)
“…There too, the likelihood is decomposed into factors depending on the configurations of other variables, and each part in the partitioning is modelled independently using the multinomial NML formula. The authors demonstrate that the fNML-style criterion they propose leads to parsimonious models with good predictive accuracy for a wide range of different scenarios, whereas the corresponding Bayesian scores are sensitive to the choice of the prior hyperparameters, which is important in the application where parsimonious Markov chains are used to model DNA binding sites [Eggeling et al, 2015].…”
Section: Factorized NML and Variants (mentioning)
confidence: 99%
“…Seifert et al (2012) used PCTs for augmenting higher-order Hidden Markov models to improve Array-CGH analysis. Another well-studied application models DNA sequence patterns that are of importance for gene regulation (Eggeling et al 2014a, 2015b). Here, PCTs augment an inhomogeneous Markov model that can be viewed as a Bayesian network of fixed structure where the parents of each variable are the direct predecessors in the sequence.…”
Section: Introduction (mentioning)
confidence: 99%
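The inhomogeneous Markov model described in the statement above can be sketched for the simplest case of order 1, where each position has its own conditional distribution given its direct predecessor. This is a plain first-order model on hypothetical toy data; it does not implement parsimonious context trees (PCTs), and the names `fit_first_order` and `markov_log_likelihood` are assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"
# Hypothetical aligned binding sites of equal length (illustrative only).
TOY_SITES = ["ACGT", "ACGA", "TCGT", "ACGT"]

def fit_first_order(sites, pseudocount=1.0):
    """Fit position-specific conditionals P(x_pos | x_{pos-1}).

    Position 0 has no predecessor, so its only context is the empty string.
    """
    width = len(sites[0])
    model = []
    for pos in range(width):
        contexts = [""] if pos == 0 else list(ALPHABET)
        counts = {c: {b: pseudocount for b in ALPHABET} for c in contexts}
        for site in sites:
            context = site[pos - 1] if pos > 0 else ""
            counts[context][site[pos]] += 1
        table = {}
        for context, row in counts.items():
            total = sum(row.values())
            table[context] = {b: n / total for b, n in row.items()}
        model.append(table)
    return model

def markov_log_likelihood(model, seq):
    """Sum log P(x_pos | x_{pos-1}) over positions, as in a chain-structured network."""
    total = 0.0
    for pos, base in enumerate(seq):
        context = seq[pos - 1] if pos > 0 else ""
        total += math.log(model[pos][context][base])
    return total

model = fit_first_order(TOY_SITES)
```

Each position after the first needs a separate distribution for every context, so the parameter count grows exponentially with the model order; that growth is the motivation usually given for merging contexts, which is the role the statement assigns to PCTs.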
“…7.8 in this article. Second, it can be used for unsupervised learning tasks, such as de novo motif discovery (Eggeling et al 2014a, 2015b, 2017) or as a component of a mixture model (Eggeling et al 2017; Eggeling 2018), where learning is possible only through an iterative approach such as the EM algorithm (Dempster et al 1977) or variants thereof (Nielsen 2000; Fujimaki and Morinaga 2012). Third, it allows for an intuitive model visualization through a conditional sequence logo (Eggeling et al 2017) that is a direct generalization of the popular sequence logo (Schneider and Stephens 1990).…”
Section: Introduction (mentioning)
confidence: 99%
“…It has been found that taking into consideration the intra-motif dependencies will substantially improve the accuracy of de novo motif discovery [9]. Therefore, many new statistical methods have been developed to characterize the intra-motif dependencies, which include the generalized weight matrix model [10], sparse local inhomogeneous mixture model (Slim) [11], transcription factor flexible model based on hidden Markov models (TFFMs) [12], the binding energy model (BEM) [13], and the inhomogeneous parsimonious Markov model (PMM) [14]. However, the most commonly used visualization tools such as WebLogo [3], Seq2Logo [15], and pLogo [4] can only display individual symbol stacks and ignore the dependencies within the motif.…”
Section: Introduction (mentioning)
confidence: 99%