2023
DOI: 10.3389/fgene.2023.1165765

ACP-GBDT: An improved anticancer peptide identification method with gradient boosting decision tree

Abstract: Cancer is one of the most dangerous diseases in the world, killing millions of people every year. Drugs composed of anticancer peptides have been used to treat cancer with low side effects in recent years. Therefore, identifying anticancer peptides has become a focus of research. In this study, an improved anticancer peptide predictor named ACP-GBDT, based on gradient boosting decision tree (GBDT) and sequence information, is proposed. To encode the peptide sequences included in the anticancer peptide dataset,…

Cited by 3 publications (4 citation statements)
References 57 publications
“…GBDT is a popular ensemble learning method for regression and classification, which excels at generating highly accurate predictions by combining multiple decision trees [37]. GBDT has been used to classify the anticancer and non-anticancer peptides [38]. Experiments on independent test sets and 10-fold cross-validation sets demonstrated that the GBDT classifier outperformed the other methods.…”
Section: Methods (mentioning)
confidence: 99%
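The idea of combining multiple decision trees by boosting can be illustrated with a minimal sketch. It is not the ACP-GBDT implementation: it assumes depth-1 trees (stumps), logistic loss, and a plain mean-residual leaf value, and the two-feature toy data (`X_toy`, `y_toy`) and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Fit a depth-1 regression tree (stump) to the residuals by least squares."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:  # no feature can be split: fall back to a constant stump
        mean = sum(residuals) / len(residuals)
        return 0, float("inf"), mean, mean
    return best[1:]

def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_gbdt(X, y, n_rounds=20, learning_rate=0.5):
    """Boost stumps on the negative gradient of the logistic loss."""
    pos = sum(y) / len(y)
    f0 = math.log(pos / (1.0 - pos))  # initial log-odds; needs both classes present
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss: label minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return f0, learning_rate, stumps

def predict_proba(model, row):
    f0, learning_rate, stumps = model
    return sigmoid(f0 + learning_rate * sum(stump_predict(s, row) for s in stumps))

def predict(model, row):
    return 1 if predict_proba(model, row) >= 0.5 else 0

# Hypothetical toy data: two made-up numeric features per peptide, label 1 = anticancer.
X_toy = [[0.1, 1.0], [0.2, 0.8], [0.3, 0.9], [0.8, 0.2], [0.9, 0.1], [0.7, 0.3]]
y_toy = [0, 0, 0, 1, 1, 1]
model = fit_gbdt(X_toy, y_toy)
```

Each round fits one small tree to what the current ensemble still gets wrong, so the final score is a sum of many weak corrections. In practice a library implementation such as scikit-learn's `GradientBoostingClassifier` would normally be preferred over a hand-written loop like this one.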
“…In order to better validate the capability of the model, we adopted four metrics that are most commonly used in the field of bioinformatics [22]: accuracy (ACC), specificity (SP), sensitivity (SN) and area under the curve (AUC), Matthew's correlation coefficient (MCC). These measures are defined as follows:…”
Section: Measurement (mentioning)
confidence: 99%
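The quoted statement breaks off before its definitions, so the sketch below uses the standard confusion-matrix formulas for ACC, SN, SP, and MCC and a pairwise-ranking formulation of AUC. These are assumed to match what the citing paper intends, but its exact notation is not shown here; the function names and example labels are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    acc = (tp + tn) / (tp + tn + fp + fn)
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity (recall on positives)
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # specificity (recall on negatives)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 4 positives and 4 negatives, one error in each class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

ACC, SN, SP, and MCC depend on thresholded predictions, whereas AUC is computed from the underlying scores, which is why it takes a separate input here.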
“…The Amino Acid Index (AAindex) is a numerical index database that provides information on the physicochemical and biological properties of the 20 amino acids [30]. It provides three categories of lists for delineating amino acid properties.…”
Section: Introduction (mentioning)
confidence: 99%
“…These lists delve into the biological and chemical attributes of single or paired amino acids, encompassing aspects like charge, polarity, mutability, and contact potential. Indexing amino acid features based on the amino acid index list has become a common method in bioinformatics [30]. Unified representation (UniRep) is a method that transforms any protein sequence into a fixed-length vector representation, addressing the scarcity of protein informatics data by leveraging full utilization of the original sequence [31,32].…”
Section: Introduction (mentioning)
confidence: 99%
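Index-based encoding of the kind these statements describe can be sketched as a lookup from each residue to a numeric property value, followed by a summary over the sequence. The example assumes the Kyte-Doolittle hydropathy scale as a stand-in for a single amino acid index; the summary statistics (mean, minimum, maximum) and the function name `encode_peptide` are illustrative choices, not the encoding used by any of the cited papers.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode_peptide(sequence, index=HYDROPATHY):
    """Map a peptide to a fixed-length vector [mean, min, max] of one property index."""
    if not sequence:
        raise ValueError("empty sequence")
    values = []
    for residue in sequence.upper():
        if residue not in index:
            raise ValueError(f"unknown residue: {residue!r}")
        values.append(index[residue])
    return [sum(values) / len(values), min(values), max(values)]

# Hypothetical peptide, used only to show the call.
features = encode_peptide("GLFKIV")
```

Because the output length does not depend on the peptide length, vectors like this can be fed directly to a classifier; using several indices would simply concatenate several such summaries.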