2020
DOI: 10.1101/2020.10.15.341149
Preprint

Predicting Cell-Penetrating Peptides: Building and Interpreting Random Forest based prediction Models

Abstract: Targeting intracellular pathways with peptide drugs is becoming increasingly desirable but is often limited in application due to their poor cell permeability. Understanding cellular permeability of peptides remains a major challenge with very little structure-activity relationship known. Fortunately, there exists a class of peptides called Cell-Penetrating Peptides (CPPs), which have the ability to cross cell membranes and are also capable of delivering biologically active cargo into cells. Discovering patterns t…

Cited by 4 publications (6 citation statements)
References 42 publications
“…Previous studies have suggested that uptake efficiency of CPPs are correlated with sequence length and basic residue (arginine or lysine) positions (Futaki et al, 2007; Liu et al, 2016; Yadahalli & Verma, 2020). To further evaluate whether peptide P1 is sensitive to changes in amino acid sequences, peptide truncation (Figure 2(A)) and single mutation (Figure 2(B)) prediction by CellPPD were performed.…”
Section: Results (mentioning)
confidence: 99%
“…Previous studies have suggested that uptake efficiency of CPPs are correlated with sequence length and basic residue (arginine or lysine) positions (Futaki et al, 2007; Liu et al, 2016; Yadahalli & Verma, 2020). To further evaluate whether Truncation analysis in Figure 2(A) suggested that 15-mer and 10-mer truncated peptide P1 fragments have significantly decreased penetration property, although the first 10-mer of N-terminal peptide P1 still had a higher score than the full-length peptide P1, which may be due to the core motif that determines the penetration property of peptide P1.…”
Section: Penetration Properties and Immunogenicity Prediction Of Peptide P1 (mentioning)
confidence: 99%
“…csv Listing 1: Example TRILL commands for CPP workflow Cell penetrability is an example of a protein function that BLAST/HMMs routinely fail in identifying due to convergent properties without sharing common ancestry. Utilizing Dataset E from Yadahalli 2020 [8], we first trained an XGBoost classifier on the protein embeddings from ESM2-150M and then achieved an F1 of 0.876 on a held-out 25% of the CPPs (Figure 5). We then finetuned ProtGPT2 on the 955 CPPs for 10 epochs with a learning rate of 1e-5.…”
Section: Workflow 2: Family Based Protein Generation (mentioning)
confidence: 99%
“…While these methods rely on evolutionary relationships to link related sequences through homology, machine learning based methods have shown success for functional comparisons without needing shared ancestry. For example, researchers have been able to predict whether a given protein is a cell-penetrating peptide, regardless of actual homology TRILL [8]. These predictions were enabled by extracting amino acid frequencies and biochemical properties for each protein and using this data to train random-forest classifiers.…”
Section: Introduction (mentioning)
confidence: 99%
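
The feature-extraction step described in the statement above (amino acid frequencies plus biochemical properties per sequence) can be sketched roughly as follows. This is a minimal illustration only, not the feature set used by Yadahalli & Verma: the function name `composition_features`, the feature keys, and the crude net-charge proxy (K + R minus D + E counts) are all assumptions made for this example. The example peptide is the well-known HIV-1 Tat-derived CPP sequence.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return a dict of simple per-sequence features (hypothetical feature set).

    - frac_X: fraction of residues equal to amino acid X
    - length: number of residues
    - net_charge_proxy: (K + R) - (D + E) counts, a rough charge estimate
      assumed here for illustration; it ignores histidine and termini.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    length = len(seq)
    features = {f"frac_{aa}": counts[aa] / length for aa in AMINO_ACIDS}
    features["length"] = length
    features["net_charge_proxy"] = (
        counts["K"] + counts["R"] - counts["D"] - counts["E"]
    )
    return features


# Usage: featurize the Tat-derived peptide.
tat = composition_features("YGRKKRRQRRR")
print(tat["length"], tat["net_charge_proxy"], round(tat["frac_R"], 3))
```

Feature dictionaries like this one, computed for labeled CPP and non-CPP sequences, could then be arranged into a table and passed to a random-forest implementation such as scikit-learn's `RandomForestClassifier`; that training step is left out here to keep the sketch dependency-free.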
“…These predictors generally help with the design of a first generation CPP, but may also help to further modify already known CPPs to suit specific cargo and application. There are still some limitations on current prediction models as they are dependent on the quality of input data and the data used for training (Yadahalli and Verma, 2020).…”
Section: Prediction Of Cpps (mentioning)
confidence: 99%