2019
DOI: 10.3389/fmicb.2019.00507

Identification of Phage Viral Proteins With Hybrid Sequence Features

Abstract: The uniqueness of bacteriophages plays an important role in bioinformatics research. In real applications, the function of the bacteriophage virion proteins is the main area of interest. Therefore, it is very important to classify bacteriophage virion proteins and non-phage virion proteins accurately. Extracting comprehensive and effective sequence features from proteins plays a vital role in protein classification. In order to more fully represent protein information, this paper is more comprehensive and effe…

citations
Cited by 12 publications
(18 citation statements)
references
References 120 publications
“…Redundant or irrelevant features will decrease the accuracy of prediction and increase computational time. In order to remove redundant or irrelevant features, a variety of feature selection techniques have been proposed: the analysis of variance (ANOVA) (Tan et al., 2018; Li et al., 2019; Zhang et al., 2020a), Max-Relevance-Max-Distance algorithms (MRMD) (Zou et al., 2016; Wan et al., 2017; Ru et al., 2019; Kwon et al., 2020), and Minimal-Redundancy-Maximal-Relevance (MRMR) (Jiao and Du, 2016; Xu et al., 2016; Wang et al., 2018b; Kabir et al., 2020) are the representative feature selection algorithms. In this study, we selected features using the F-score algorithm; the F-score algorithm was proposed by Yi-Wei (Chen and Lin, 2006).…”
Section: Methods (mentioning)
confidence: 99%
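The F-score ranking mentioned in the statement above can be sketched as follows. This is a minimal illustration of the two-class F-score as defined by Chen and Lin (2006), not the citing authors' code: the function names `f_scores` and `top_k_features`, the 1/0 label encoding, and the handling of zero within-class variance are all assumptions made for the example.

```python
from typing import List, Sequence


def f_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Return one F-score per feature column for a two-class dataset.

    Assumption: label 1 marks the positive class, anything else the negative class.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label != 1]
    if len(pos) < 2 or len(neg) < 2:
        raise ValueError("need at least two samples in each class")

    scores = []
    for j in range(len(X[0])):
        all_col = [row[j] for row in X]
        pos_col = [row[j] for row in pos]
        neg_col = [row[j] for row in neg]

        mean_all = sum(all_col) / len(all_col)
        mean_pos = sum(pos_col) / len(pos_col)
        mean_neg = sum(neg_col) / len(neg_col)

        # Numerator: how far each class mean lies from the overall mean.
        between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2

        # Denominator: sum of the two within-class sample variances.
        var_pos = sum((v - mean_pos) ** 2 for v in pos_col) / (len(pos_col) - 1)
        var_neg = sum((v - mean_neg) ** 2 for v in neg_col) / (len(neg_col) - 1)
        within = var_pos + var_neg

        if within == 0:
            # Convention chosen for this sketch only: a feature that is constant
            # within both classes scores inf if the classes differ, else 0.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def top_k_features(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

A larger score means the feature separates the two classes better relative to its spread within each class; how many top-ranked features to keep is a separate choice that this sketch leaves to the caller.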
“…It could be noticed that the property of side-chain [47] of amino acid had a high negative correlation (R = −0.516), indicating that PVPs favor small amino acids. As seen in Table 8, five top-ranked important amino acids present in PVPs were Ala, Thr, Val, Gly, and Ser, while according to the property of side-chain [50], such five amino acids (PS, side-chain) were ranked at (1,19), (2,15), (3,16), (4,20), and (5,18), respectively. Our analysis result was well consistent with the work of a previous study [45].…”
Section: Analysis of PVPs Using Informative Physicochemical Properties (mentioning)
confidence: 99%
“…Recently, many researchers have exploited various types of machine learning (ML) algorithms using sequence features to directly predict PVPs including Seguritan et al's method [8], Feng et al's method [9], PVPred [10], Zhang et al's method [11], PVP-SVM [12], PhagePred [13], Tan et al's method [14], Ru et al's method [15], and Pred-BVP-Unb [16], as summarized in Table 1. In 2012, Seguritan et al [8] proposed the first predictor to identify viral structural proteins using an artificial neural network cooperating with a feature combination of amino acid composition (AAC) and protein isoelectric points.…”
Section: Introduction (mentioning)
confidence: 99%
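Amino acid composition (AAC), one of the sequence features named in the statement above, is the fraction of each of the 20 standard residues in a protein. The sketch below shows one common way to compute it. It is not the implementation of any predictor listed there; the function name `amino_acid_composition` and the choice to skip non-standard characters are assumptions, and the isoelectric-point feature and the classifier are left out.

```python
from typing import Dict

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Assumption for this sketch: characters outside the 20 standard residues
    (for example 'X' or gaps) are ignored rather than counted.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return {aa: residues.count(aa) / total for aa in AMINO_ACIDS}


# Usage: a 20-dimensional feature vector in a fixed residue order.
composition = amino_acid_composition("AAGT")
feature_vector = [composition[aa] for aa in AMINO_ACIDS]
```

Keeping the residue order fixed means every protein maps to a vector of the same length and layout, which is what a downstream classifier would need.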
“…However, the diversity of PVPs is much higher than that of the enzymes encoded in the phage genome, which makes the identification of PVPs much more difficult (11). To overcome the difficulty of PVP identification, several de novo algorithms for PVP identification have been proposed (8, 11-23). Most of these tools are two-class classifiers to distinguish whether the given phage protein is a PVP.…”
Section: Introduction (mentioning)
confidence: 99%