2021
DOI: 10.3389/fgene.2021.658078

Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning

Abstract: Understanding the substrate specificity of HIV-1 protease plays an essential role in the prevention of HIV infection. A variety of computational models have thus been developed to predict substrate sites that are cleaved by HIV-1 protease, but most of them follow a supervised learning scheme, building classifiers that treat experimentally verified cleavable sites as positive samples and unknown sites as negative samples. However, the negative set may contain noise, as false negative…

Citations: cited by 12 publications (20 citation statements)
References: 31 publications
“…Furthermore, using diverse crossvalidated performance metrics is considered good practice and can objectively reveal the true performance of a model rather than depending on just one metric that could be biased towards a subset of the dataset. This approach reduces the risk of overfitting [6,18,35]. Our models performance was also evaluated on an independent testing set which was not previously exposed to the models to give a true account of the models predictive strength.…”
Section: Results (mentioning)
confidence: 99%
“…As a result, feature selection is critical in classification tasks [35]. The number and type of peptide feature descriptors selected determine to a great extent the performance of a model [6]. Amino acids are the building blocks of peptides and proteins.…”
Section: Feature Extraction/vector Construction (mentioning)
confidence: 99%
“…As a popular metric for binary classification problems, F-measure indicates the harmonic mean of Precision and Recall. The details of computing F-measure can be found in [31].…”
Section: Methods (mentioning)
confidence: 99%
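The F-measure mentioned in the statement above can be illustrated with a minimal sketch. It assumes the balanced (F1) form and takes raw confusion-matrix counts as input. The function name `f_measure` and its parameters are hypothetical and are not taken from the cited paper or from [31].

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    tp, fp, fn are true-positive, false-positive and false-negative counts.
    Illustrative helper only; not the cited authors' implementation.
    """
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: fraction of actual positives that were recovered.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # By convention, return 0 when both precision and recall are 0.
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 6 cleavage sites found, 2 false alarms, 6 missed.
    print(round(f_measure(tp=6, fp=2, fn=6), 3))
```

With these hypothetical counts, precision is 0.75 and recall is 0.5. The harmonic mean (0.6) sits below the arithmetic mean (0.625), so the score is pulled toward the weaker of the two.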
“…Combining the knowledge from experimental studies, a multitask learning model is developed recently based on multi-kernel [30], and it utilizes the dependencies among various related tasks to build a stronger predictive model for HIV-1 protease cleavage sites prediction. Since certain noisy can be contained by mislabeling cleavable octamers as negative instances, PU-HIV [31] considers unknown substrate sites as unlabeled samples, and makes use of positive-unlabeled learning to effectively predict HIV-1 protease cleavage sites. …”
Section: Related Work (mentioning)
confidence: 99%