2017
DOI: 10.1093/nar/gkx314

RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections

Abstract: Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif di…

Citations: cited by 104 publications (92 citation statements)
References: 69 publications
“…We first filtered this file for confidence scores greater than 0. Next, we collapsed TFs into clusters based on RSAT matrix clustering of JASPAR CORE vertebrate TFs (http://jaspar.genereg.net/matrix-clusters/vertebrates/) [51]. We named each cluster using the most common TF family within the cluster.…”
Section: Differential Signal Analysis and Regulatory Motif Enrichment (mentioning)
confidence: 99%
“…variation-scan performance was assessed by randomly selecting a variant from the 1000 genomes project [21] and a motif from the RSAT non-redundant motifs collection [24]. The randomly selected variant was used to create sets with different numbers of replicates, ranging from one thousand to nine millions, to estimate the relation between running time and the amount of evaluated variants.…”
Section: Computing Efficiency (mentioning)
confidence: 99%
“…According to the original study [25], binding sites for the following TFs were enriched in the sequences of interest: GATA1, KLF1, DHS, TAL1, ETS, FLI1 and AP-1. Therefore, a total of 48 PSSMs annotated as related to these TFs were retrieved from the non-redundant RSAT motif collection [24], and given as input to variation-scan.…”
Section: Evaluation Of Variation-scan (mentioning)
confidence: 99%