2015
DOI: 10.1093/bioinformatics/btv645

No Promoter Left Behind (NPLB): learn de novo promoter architectures from genome-wide transcription start sites

Abstract: Summary: Promoters have diverse regulatory architectures and thus activate genes differently. For example, some have a TATA-box; many others do not. Even those with one can differ in its position relative to the transcription start site (TSS). No Promoter Left Behind (NPLB) is an efficient, organism-independent method for characterizing such diverse architectures directly from experimentally identified genome-wide TSSs, without relying on known promoter elements. As a test case, we show its application in id…

Cited by 7 publications (12 citation statements)
References 10 publications
“…We were able to replicate past results of clustering on both these datasets. Our algorithm on the fly dataset achieved a speedup of 14.23x over the best existing algorithm [2], which also uses unsupervised learning with binary search (see table below).…”
Section: Results
confidence: 93%
“…Binary Search Optimization. Instead of iterating over all K cluster candidates as done in findKOptimal, we use binary search, similar to previous work [2]. In each iteration we reduce the search space for k_opt by half, depending on W_k(H) values.…”
Section: Procedure findKOptimal
confidence: 99%
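The halving strategy described in the quote above can be sketched generically: if the quality score W_k(H) is assumed unimodal in k, comparing the scores at adjacent candidates tells you which half of the range contains the optimum. This is a minimal illustrative sketch, not the cited implementation; the function name `find_k_opt` and the `score` callable are hypothetical stand-ins for the paper's findKOptimal procedure and W_k(H).

```python
def find_k_opt(score, k_min=1, k_max=64):
    """Binary search for the k maximizing a unimodal score function.

    Assumes score(k) rises to a single peak and then falls, so
    comparing score(mid) with score(mid + 1) reveals which half
    of [k_min, k_max] contains the optimum.
    """
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi) // 2
        if score(mid) < score(mid + 1):
            lo = mid + 1  # still ascending: optimum lies to the right
        else:
            hi = mid      # descending (or at peak): optimum is mid or left
    return lo
```

Each iteration discards half of the remaining candidates, giving O(log K) score evaluations instead of the O(K) full sweep, which is the source of the reported speedup.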
“…As a pilot study, we consider the simple task of finding the optimal number of PWMs for each of the 158 JASPAR data sets. We compare the FIC-based learning approach of Disentangler with two other tools that can be used to solve this task, namely DIVERSITY (22) and NPLB (46); see Supplementary Section S2 for details about all tools used in the case studies. Results (Supplementary Section S4.1) show that for the majority of data sets all three methods predict more than one PWM to be optimal.…”
Section: Results
confidence: 99%
“…This, it turns out, constitutes an effective and fast implementation of an ab initio motif finder on large ChIP-seq datasets, in addition to detecting the variations in motif and sequence context alluded to in the previous point. THiCweed can also be used on sequences that have been previously aligned by a ‘feature’ (motif) to discover additional motifs/complexities, by disabling shifts and reverse complements, similar to the program No Promoter Left Behind (20, 21), but we do not discuss this use here.…”
Section: Methods
confidence: 99%