2022
DOI: 10.1007/s41870-022-00949-2

Word2vec neural model-based technique to generate protein vectors for combating COVID-19: a machine learning approach

Abstract: The world was ambushed in 2019 by the COVID-19 virus which affected the health, economy, and lifestyle of individuals worldwide. One way of combating such a public health concern is by using appropriate, rapid, and unbiased diagnostic tools for quick detection of infected people. However, a current dearth of bioinformatics tools necessitates modeling studies to help diagnose COVID-19 cases. Molecular-based methods such as the real-time reverse transcription polymerase chain reaction (rRT-PCR) for detecting COV…

Cited by 12 publications
(5 citation statements)
References 50 publications
“…Using NLP techniques, particularly the Word2Vec model for viral embedding, has been recognized in prior research. But while earlier studies were often unimodal and focused on tasks like viral classification or evolution tracking, our method differs by integrating this with other data modalities, offering a more comprehensive view [48][49][50][51]. The merit of this method is evident in its ability to encapsulate the cumulative effects of multiple viral mutations and their relationships, a task that single-modal approaches might find challenging.…”
Section: Discussion (mentioning)
confidence: 99%
“…Another limitation in word embedding is separating opposite word pairs such as “black” and “white”. The word pairs like these are usually semantically very close in vector space hence reducing the performance of word vectors in tasks such as sentiment analysis [ 31 33 ].…”
Section: Methods (mentioning)
confidence: 99%
“…In addition to employing one-hot encoding, the utilization of complex feature representations for the drug SMILES and target sequences could potentially exert a substantial impact on the performance of DTA prediction. Within the scope of this investigation, a variety of complex feature representations were explored for the sequences of targets, encompassing the physical-chemical properties of amino acids, position-specific scoring matrix (PSSM) [29], Hidden Markov Models matrix (HMM) [30], Protbert [31], CNN, and Word2Vec [32]. For drugs, only the physical-chemical properties of atoms were considered.…”
Section: Performance Comparison Of Different Graph-based Models (mentioning)
confidence: 99%