2022
DOI: 10.3389/fgene.2022.858252

Genomic Surveillance of COVID-19 Variants With Language Models and Machine Learning

Abstract: The global efforts to control COVID-19 are threatened by the rapid emergence of novel SARS-CoV-2 variants that may display undesirable characteristics such as immune escape, increased transmissibility or pathogenicity. Early prediction for emergence of new strains with these features is critical for pandemic preparedness. We present Strainflow, a supervised and causally predictive model using unsupervised latent space features of SARS-CoV-2 genome sequences. Strainflow was trained and validated on 0.9 million …

Cited by 10 publications (8 citation statements)
References 36 publications
“…These models have incorporated various data sources, including comorbid diseases, 57,58 clinical factors, 59,60 genetic factors 39,42,61 and SARS-COV-2 viral clades. [62][63][64] Given the promising results obtained from therapeutic approaches, including complement inhibition, 65 in the treatment of COVID-19, the development of reliable prediction tools based on complement-related variants is of utmost importance.…”
Section: Ranking | Citation type: mentioning
confidence: 99%
“…AI and machine learning can be used to analyze large datasets, such as genomic data, to identify patterns and trends relevant to the understanding and treatment of infectious diseases [9, 10, 11, 12]. For example, machine learning algorithms have been utilized to identify potential drug targets for SARS-CoV-2, which causes COVID-19 [13, 14].…”
Section: An Opinion | Citation type: mentioning
confidence: 99%
“…For example, a recent study [164] highlights that through genomic surveillance it is possible to trace co-infections by distinct SARS-CoV-2 genotypes, which are expected to have a different impact on factors modulating COVID-19. Genomic surveillance of SARS-CoV-2 is able to reveal tremendous genomic diversity [165], and coupled with language models and machine learning approaches, contributes to predicting the impact of mutations (such as those occurring in the spike protein), and thus can better address challenging aspects, like an estimation of the efficacy of therapeutic treatments [166].…”
Section: Factors Modulating Covid-19: a Mechanistic Understanding Via... | Citation type: mentioning
confidence: 99%