2020
DOI: 10.37044/osf.io/xt7gw
Preprint

Determining a novel feature-space for SARS-CoV-2 sequence data

Abstract: The pandemic spread of SARS-CoV-2 and its ability to reinfect a recovered patient, among its other damaging characteristics, took everybody by surprise. A global collaborative scientific effort was urgently needed to bring together experts from different areas of medicine and data science. The COVID19 Virtual BioHackathon, organized from the 5th to the 11th of April, 2020, provided such a platform for considering the pressing issues involved, which ranged from text mining to genom…

Cited by 2 publications (1 citation statement)
References 15 publications
“…Due to the low mutation rate and a high degree of similarity among SARS-COV-2 genome, very few studies have been performed using alignment-free methods. Correlation and partial information correlation (PIC) [57], optimal word (k-mer) to construct continuous distributed representations for protein sequences, to predict MHC class I and II binding affinity [58], combined k-mer and n-gram techniques [59], chemical properties (charge, hydropathy, side chain) of region-specific amino acid, mutations are few important features used during analysis [60,61]. These features are utilized in several studies attempting to identify the cluster of SARS-COV-2 in various ways, and its results demonstrated that this virus has multiple origins with different degrees.…”
Section: Phylogeny and Mutant Variation Analysis
Citation type: mentioning
confidence: 99%