2021
DOI: 10.1101/2021.11.02.466801
Preprint

Estimated limits of organism-specific training for epitope prediction

Abstract: Background: The identification of linear B-cell epitopes remains an important task in the development of vaccines, therapeutic antibodies and several diagnostic tests. Machine learning predictors are trained to flag potential epitope candidates for experimental validation. Currently, most predictors are trained as generalist models using large, heterogeneous data sets. Recently, organism-specific training has been shown to improve prediction performance for data-rich organisms. Unfortunately, for most organis…

Cited by 3 publications (4 citation statements)
References 47 publications
“…Beyond the detection of potential Monkeypox LBCEs, the phylogeny-aware approach used in this work has been shown to provide better results than the current standard tool for LBCE prediction (Bepipred 2.0), reinforcing earlier results that already suggested this to be the case [19]. More importantly, the successful development of a model tailored specifically for an emerging global pathogen with almost no LBCE data available shows that this approach is useful even beyond the lower bounds investigated in an earlier work [31], and that utilising information from phylogenetically-related pathogens can be a successful strategy for the development of predictive models tailored for specific groups of pathogens that share a common ancestor. With the current wide availability of considerable computing power even on low-cost and mobile devices, there is clearly scope for a wider use of pathogen-specific models, trained and tuned for the prediction task of interest.…”
Section: Discussion (supporting)
confidence: 77%
“…Although this underperformance may seem superficially at odds with the notion of organism-specific model training, we suggest that the extremely low number of OPXV examples may be insufficient to fit models with good predictive abilities. This finding reinforces results obtained in our preliminary exploration of the limits of organism-specific training: at the very low end of data availability the resulting models may not achieve sufficient generalisation ability [31]. It is also interesting to note that the bespoke models (with the exception of the one trained on all viral epitopes) present lower specificity than Bepipred 2.0, but non-inferior PPV and considerably higher sensitivity and NPV.…”
Section: Phylogeny-aware Modelling Yields Superior Predictive Perform... (supporting)
confidence: 85%
“…Supporting evidence for this assumption was presented in [2], where epitopes from a number of phylogenetically distant pathogens were found to exhibit clearly distinct patterns in terms of location on a feature space, including the superposition of LBCEs from one pathogen with known non-immunogenic peptides from others. This would compromise the performance of models trained without the use of taxonomic information, and motivated the development of organism-specific models, which were shown to significantly improve performance over generalist models [2, 9].…”
Section: Taxon-specific Epitope Prediction (mentioning)
confidence: 99%
“…In summary, geothermal energy, with its consistent supply, offers an appealing renewable option. However, careful resource management and advances in drilling technology are necessary to fully realize its potential (Ashford & Campelo, 2021).…”
Section: Characteristics and Potential (mentioning)
confidence: 99%