2010
DOI: 10.1371/journal.pone.0009722

Dinucleotide Weight Matrices for Predicting Transcription Factor Binding Sites: Generalizing the Position Weight Matrix

Abstract: Background: Identifying transcription factor binding sites (TFBS) in silico is key in understanding gene regulation. TFBS are string patterns that exhibit some variability, commonly modelled as “position weight matrices” (PWMs). Though convenient, the PWM has significant limitations, in particular the assumed independence of positions within the binding motif; and predictions based on PWMs are usually not very specific to known functional sites. Analysis here on binding sites in yeast suggests that correlation o…
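The independence assumption mentioned in the abstract can be made concrete with a small sketch. The code below builds a log-odds PWM from aligned sites and scores a sequence by summing per-position terms, then contrasts it with a simplified matrix over adjacent base pairs. This is an illustration under stated assumptions, not the paper's formulation: the function names, the pseudocount of 1, the uniform background, and the restriction to adjacent positions are all choices made here for the example.

```python
import math
from collections import Counter

BASES = "ACGT"
PAIRS = [a + b for a in BASES for b in BASES]


def pwm_from_sites(sites, pseudocount=1.0):
    """Build a log-odds PWM (one dict per position) from equal-length sites.

    Assumes a uniform background of 0.25 per base (illustrative choice).
    """
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / 0.25)
            for b in BASES
        })
    return pwm


def pwm_score(pwm, seq):
    """Sum per-position log-odds; the plain sum is the independence assumption."""
    return sum(col[base] for col, base in zip(pwm, seq))


def dinuc_from_sites(sites, pseudocount=1.0):
    """Build log-odds over adjacent base pairs (uniform 1/16 background).

    Hypothetical simplification: only neighbouring positions are paired.
    """
    total = len(sites) + pseudocount * len(PAIRS)
    matrix = []
    for pos in range(len(sites[0]) - 1):
        counts = Counter(site[pos:pos + 2] for site in sites)
        matrix.append({
            p: math.log2(((counts[p] + pseudocount) / total) / (1 / 16))
            for p in PAIRS
        })
    return matrix


def dinuc_score(matrix, seq):
    """Sum log-odds of each adjacent pair in seq."""
    return sum(col[seq[i:i + 2]] for i, col in enumerate(matrix))
```

With toy sites such as "AC" and "GT" in equal numbers, the PWM cannot distinguish the observed "AC" from the never-observed "AT", because each column is scored separately; the pair-based matrix can.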

Cited by 87 publications (88 citation statements)
References 33 publications
“…We then determined the corresponding rank sum p-value between the random sets. We found that the p-value of the rank sum test between the DC and DU model fell well beyond the right tail of the random sampling distribution (shown in Figure 2), indicating that the median 30]bp window, which we calculated to have 9.1 bits information relative to the background (uniform distribution of bases), and logo E has the added flanking sites to the 'uncooperative' class. Logo C is the CB motif with 9.6 bits of information relative to the background, which looks similar to the 'uncooperative' class at position 6 due to there being many more sites that prefer A to a T at this position amongst all the Dorsal sites in the network.…”
Section: The Conditional and Unconditional PWMs Are Significantly Dif…
mentioning
confidence: 85%
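The statement above quotes motif information in bits "relative to the background (uniform distribution of bases)". A common way to compute such a figure is the relative entropy of each column against the background, summed over positions. The sketch below assumes that definition; the quoted excerpt does not say whether the citing authors applied any small-sample correction, and the function name and parameters are hypothetical.

```python
import math
from collections import Counter


def information_content(sites, background=None, pseudocount=0.0):
    """Total motif information in bits relative to a background distribution.

    For each position, adds sum_b p_b * log2(p_b / q_b), where p_b is the
    observed base frequency and q_b the background frequency. The background
    defaults to uniform (0.25 per base), an assumption matching the quote.
    """
    bases = "ACGT"
    if background is None:
        background = {b: 0.25 for b in bases}
    denom = len(sites) + pseudocount * len(bases)
    total_bits = 0.0
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        for b in bases:
            p = (counts[b] + pseudocount) / denom
            if p > 0:  # treat 0 * log(0) as 0
                total_bits += p * math.log2(p / background[b])
    return total_bits
```

Under a uniform background, a fully conserved position contributes 2 bits and a position with all four bases equally frequent contributes 0 bits, so a motif's total is bounded by twice its length.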
“…The OR gate is based on the DC detector built from the data set D_DC, which contains Dorsal loci from D_CB that were tagged with class labels from the optimal spacer window of [0,30]bp with the 5'-CAYATG motif, and similarly, the DU detector is built from the data set D_DU, which contains the remaining Dorsal loci from D_CB that did not have the Twist sites in the spacer window.…”
Section: Performance Of Optimal Classifiers (Detectors)
mentioning
confidence: 99%
“…Consequently, a key question in the analysis of gene regulatory networks is to find a proper mathematical representation of the sequence-specificities of TFs. That is, for each TF, we want to determine an energy function E(s) that calculates, for any given DNA segment s, the binding free energy of the TF binding to s. The segment s is generally of fixed length for a given TF, which typically ranges from 6 to 30 base pairs. Although there have been some attempts to use direct structural and biophysical modeling of the sequence-specificity of TFs, e.g.…”
mentioning
confidence: 99%