2017
DOI: 10.1371/journal.pone.0188129
|View full text |Cite
|
Sign up to set email alerts
|

On the prediction of DNA-binding proteins only from primary sequences: A deep learning approach

Abstract: DNA-binding proteins play pivotal roles in alternative splicing, RNA editing, methylating and many other biological functions for both eukaryotic and prokaryotic proteomes. Predicting the functions of these proteins from primary amino acids sequences is becoming one of the major challenges in functional annotations of genomes. Traditional prediction methods often devote themselves to extracting physiochemical features from sequences but ignoring motif information and location information between motifs. Meanwh… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
5

Citation Types

0
43
0

Year Published

2018
2018
2022
2022

Publication Types

Select...
10

Relationship

2
8

Authors

Journals

citations
Cited by 54 publications
(43 citation statements)
references
References 32 publications
0
43
0
Order By: Relevance
“…These methods primarily focus on the following two aspects: (1) the construction of encoding schemes for protein sequences and (2) the application of classification algorithms. Many machine learning techniques have been adopted to perform the prediction of DBPs, including support vector machine (SVM) [5][6][7], random forest (RF) [8][9][10], naive Bayes classifier [4], ensemble classifiers [11][12][13], and deep learning [14][15][16]. Among these algorithms, SVM and RF have been widely used because of their excellent performance.…”
Section: Introductionmentioning
confidence: 99%
“…These methods primarily focus on the following two aspects: (1) the construction of encoding schemes for protein sequences and (2) the application of classification algorithms. Many machine learning techniques have been adopted to perform the prediction of DBPs, including support vector machine (SVM) [5][6][7], random forest (RF) [8][9][10], naive Bayes classifier [4], ensemble classifiers [11][12][13], and deep learning [14][15][16]. Among these algorithms, SVM and RF have been widely used because of their excellent performance.…”
Section: Introductionmentioning
confidence: 99%
“…Prediction of DBPs is usually formulated as a supervised learning problem. In recent years, many classification algorithms have been adopted to solve this problem, including support vector machine (SVM) [24][25][26], random forest (RF) [27,28], naive Bayes classifier [3], ensemble classifiers [29][30][31], and deep learning [32][33][34]. Among these models, stacked generalization (or stacking) is an ensemble learning technique that takes the outputs of base classifiers as input and attempts to find the optimal combination of the base learners to make a better prediction [35].…”
Section: Introductionmentioning
confidence: 99%
“…Pan et al adopted a CNN and deep belief network (DBN) to discover RNA–protein binding motifs using different groups of features, that is, sequence, structure, clip-cobinding, region type, and motif features [ 33 ]. In our previous work, we developed a deep learning model for predicting DNA binding proteins only from sequences, in which the features were automatically learned by the networks themselves [ 34 ]. In this study, we further investigated the capability of auto feature engineering in the deep learning framework for predicting PPIs.…”
Section: Introductionmentioning
confidence: 99%