2021
DOI: 10.3390/biology10090853

An Alignment-Independent Approach for the Study of Viral Sequence Diversity at Any Given Rank of Taxonomy Lineage

Abstract: The study of viral diversity is imperative in understanding sequence change and its implications for intervention strategies. The widely used alignment-dependent approaches to study viral diversity are limited in their utility as sequence dissimilarity increases, particularly when expanded to the genus or higher ranks of viral species lineage. Herein, we present an alignment-independent algorithm, implemented as a tool, UNIQmin, to determine the effective viral sequence diversity at any rank of the viral taxon…

Cited by 3 publications (21 citation statements)
References 42 publications
“…As per the definition of the minimal set, the k-mer size needs to be defined. Various k-mer sizes may be explored; see Chong et al 2021 for the various considerations. Herein, as an example, we will utilise the k-mer size of nine (9-mer) for immunological applications, such as studying the viral diversity in the context of the cellular immune response (antigenic diversity).…”
Section: Basic Protocol 2: Generating a Minimal Set For Any Given Dat...
Citation type: mentioning
confidence: 99%
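The idea of overlapping 9-mers and a "minimal set" in the statement above can be illustrated with a small sketch. This is a simple greedy heuristic written for illustration only; it is not taken from UNIQmin, and the function names `kmers` and `greedy_minimal_set` are hypothetical. It assumes the minimal set means a reduced set of sequences that still contains every distinct k-mer found in the input.

```python
def kmers(seq, k=9):
    """Return the set of overlapping k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_minimal_set(sequences, k=9):
    """Greedy sketch (an assumption, not UNIQmin's algorithm): repeatedly pick
    the sequence that contributes the most k-mers not yet covered."""
    # Drop exact duplicate sequences while keeping first-seen order.
    remaining = {s: kmers(s, k) for s in dict.fromkeys(sequences)}
    covered = set()
    chosen = []
    while True:
        best, best_gain = None, 0
        for seq, seq_kmers in remaining.items():
            gain = len(seq_kmers - covered)
            if gain > best_gain:
                best, best_gain = seq, gain
        if best is None:
            # No remaining sequence adds a new k-mer.
            break
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


if __name__ == "__main__":
    toy = ["ABCDEFGHIJKL", "ABCDEFGHIJ", "BCDEFGHIJKL", "MNOPQRSTUV"]
    print(greedy_minimal_set(toy, k=9))
```

In the toy input, the second and third sequences contain only 9-mers already present in the first, so the sketch keeps the first and fourth sequences. A greedy pass like this does not guarantee the smallest possible set; it only guarantees that no k-mer is lost.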
“…The 27 protein deduplicated datasets were then further compressed, through the removal of unique sequences by use of UNIQmin without incurring any loss of information in terms of the total peptidome repertoire (relevant to the k-mer of choice). The overlapping k-mer size of nine (9-mer or nonamer) was selected for immunological applications (see Chong et al, 2021 for various considerations on k-mer size). The resulting 27 protein minimal datasets totalled 273,851 sequences (~0.5% of the retrieved dataset) (Figure 2(B); minimal datasets are available at https://github.com/ChongLC/UNIQmin_PublicationData/tree/main/ApplicationPaper_SARS-CoV-2).…”
Section: (B))
Citation type: mentioning
confidence: 99%
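The claim of compression "without incurring any loss of information in terms of the total peptidome repertoire" can be checked mechanically for any reduced dataset. The sketch below is illustrative only: the helper names `kmer_repertoire`, `is_lossless`, and `retained_percent` are hypothetical, and it assumes "peptidome repertoire" means the set of distinct overlapping k-mers across all sequences.

```python
def kmer_repertoire(sequences, k=9):
    """Collect every distinct overlapping k-mer across all sequences."""
    repertoire = set()
    for seq in sequences:
        repertoire.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    return repertoire


def is_lossless(original, reduced, k=9):
    """True if the reduced set carries exactly the same k-mers as the original."""
    return kmer_repertoire(original, k) == kmer_repertoire(reduced, k)


def retained_percent(original, reduced):
    """Size of the reduced dataset as a percentage of the original."""
    if not original:
        raise ValueError("original dataset is empty")
    return 100.0 * len(reduced) / len(original)


if __name__ == "__main__":
    original = ["ABCDEFGHIJKL", "ABCDEFGHIJ", "MNOPQRSTUV", "MNOPQRSTUV"]
    reduced = ["ABCDEFGHIJKL", "MNOPQRSTUV"]
    print(is_lossless(original, reduced), retained_percent(original, reduced))
```

The same percentage calculation is what a figure such as "~0.5% of the retrieved dataset" expresses: the count of sequences in the minimal datasets divided by the count in the retrieved dataset.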
“…Sequence diversity is a major obstacle in the design of effective surveillance and intervention (vaccines, drugs, and diagnostics) strategies against viruses. Primary sequences are treasure troves for studies of sequence diversity, which can be alignment-dependent or alignment-free [1]. Big sequence data poses a major challenge to alignment-dependent approach, which is root to many sequence-based comparison studies.…”
Section: Introduction
Citation type: mentioning
confidence: 99%