2017
DOI: 10.1093/bioinformatics/btx115

Computational modeling of in vivo and in vitro protein-DNA interactions by multiple instance learning

Abstract: Supplementary data are available at Bioinformatics online.

Cited by 19 publications (14 citation statements)
References: 41 publications
“…For example, in k-mer-based SVM models, there can be a large number of very similar k-mer features that are all significant for the prediction task (Ghandi et al., 2014). To deal with such difficulties, SeqGL (Setty and Leslie, 2015) and MIL (Gao and Ruan, 2017) similarly adopt a DML method (HOMER) to interpret their outputs, while gkmSVM (Ghandi et al., 2014) would cluster k-mers into PWMs for further analysis, which could be viewed as a simplified version of motif learning methods such as (Liu et al., 2016).…”
Section: Discussion (mentioning)
confidence: 99%
“…Considering the weakly supervised information of DNA sequences, thus it is reasonable to use the concepts of MIL to deal with DNA sequences. Therefore we divided them into multiple overlapping instances following the works [18, 20], which ensures that (1) the weakly supervised information can be retained, and that (2) a large amount of instances containing TFBS are generated. This method is defined as a sliding window of length c, which divides DNA sequences of length l into multiple overlapping instances by a stride s.…”
Section: Methods (mentioning)
confidence: 99%
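
The sliding-window step described in the statement above can be sketched in a few lines of Python. This is an illustrative sketch only, not the cited authors' implementation: the function name `sliding_window_instances` is hypothetical, and the statement does not say how trailing bases or sequences shorter than c are handled, so the choices made here (drop any incomplete final window; return a too-short sequence as a single instance) are assumptions.

```python
def sliding_window_instances(sequence, c, s):
    """Split a DNA sequence into overlapping instances.

    sequence: string of length l
    c: window (instance) length
    s: stride between the starts of consecutive windows
    """
    if c <= 0 or s <= 0:
        raise ValueError("c and s must be positive integers")

    l = len(sequence)
    if l < c:
        # Assumption: a sequence shorter than the window is kept as one instance.
        return [sequence]

    # Window starts are 0, s, 2s, ... as long as a full window of length c fits.
    # Assumption: an incomplete window at the end of the sequence is dropped.
    return [sequence[i:i + c] for i in range(0, l - c + 1, s)]


if __name__ == "__main__":
    for instance in sliding_window_instances("ACGTACGTAC", c=4, s=2):
        print(instance)
```

Under these assumptions, a sequence of length l with l >= c yields floor((l - c) / s) + 1 instances, and consecutive instances overlap by c - s bases whenever s < c.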
“…In consideration of this information, Gao et al. [18] developed a multiple-instance learning (MIL) based algorithm, which combines MIL with TeamD [19], for modeling protein-DNA binding, and recently Zhang et al. [20] also developed a weakly supervised convolutional neural network (WSCNN), which combines MIL with CNN, for modeling protein-DNA binding.…”
Section: Introduction (mentioning)
confidence: 99%
“…In addition, protein binding microarrays (PBM) can be used to measure in vitro transcription factor binding through the array of exhaustive short amino acid sequences on microarrays [12]. Since the common confounding factor was eliminated in the ChIP-Seq experiment [13], PBM data conveyed perfect information in a more direct manner for the modeling of transcription factor binding sites [14].…”
Section: Introduction (mentioning)
confidence: 99%