Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has emerged as a global pandemic. In this study, we used ARTIC primer-based amplicon sequencing to profile 225 SARS-CoV-2 genomes from India. Phylogenetic analysis of 202 high-quality assemblies identified the presence of all five reported clades (19A, 19B, 20A, 20B, and 20C) in the population. The analyses revealed Europe and Southeast Asia as two major routes for introduction of the disease into India, followed by local transmission. Interestingly, the 19B clade was found to be more prevalent in our sequenced genomes (17%) than in other genomes reported so far from India. Haplotype network analysis showed that the 19A and 19B clades evolved in parallel, predominantly from Gujarat state in India, suggesting it to be one of the major routes of disease transmission in India during March and April, whereas 20B and 20C appeared to evolve from 20A. At the same time, the 20A and 20B clades showed a prevalence of four common mutations: 241 C > T in the 5′ UTR, P4715L, F942F, and D614G in the Spike protein. The D614G mutation has been reported to increase virus shedding and infectivity. Our molecular modeling and docking analysis indicated that the D614G mutation enhanced the affinity of the Spike S1–S2 hinge region for TMPRSS2 protease, possibly the reason for increased shedding of the S1 domain in G614 as compared to D614. Moreover, we observed an increased concordance of the G614 mutation with viral load, as evident from decreased Ct values of the Spike and ORF1ab genes.
COVID-19, which has emerged as a global pandemic, is caused by the SARS-CoV-2 virus. Analysis of the virus genome during disease spread reveals its evolution and transmission. We performed whole-genome sequencing of 225 clinical strains from the state of Odisha in eastern India using ARTIC protocol-based amplicon sequencing. Phylogenetic analysis identified the presence of all five reported clades (19A, 19B, 20A, 20B, and 20C) in the population. The analyses revealed two major routes for the introduction of the disease into India, i.e. Europe and South-east Asia, followed by local transmission. Interestingly, the 19B clade was found to be much more prevalent in our sequenced genomes (17%) than in other genomes reported so far from India. The haplogroup analysis showed that 19A and 19B evolved in parallel, whereas 20B and 20C appeared to evolve from 20A. The majority of 19A and 19B clade viruses were present in cases that had migrated from Gujarat state in India, suggesting it to be one of the major initial points of disease transmission in India during the months of March and April. We found that the 20A and 20B clades, which originated from central Europe, evolved drastically over time. At the same time, the 20A and 20B clades showed selection of four common mutations, i.e. 241 C>T (5′ UTR), P323L in RdRP, F942F in NSP3, and D614G in the spike protein. We found an increased concordance of the G614 mutation with viral load in clinical samples, as evident from decreased Ct values of the spike and ORF1ab genes in qPCR. Molecular modelling and docking analysis indicated that the D614G mutation enhanced the interaction of spike with TMPRSS2 protease, which could affect shedding of the S1 domain and infectivity of the virus in host cells.
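The link between a decreased Ct value and a higher viral load rests on the standard qPCR relationship: each cycle roughly doubles the template, so a sample that crosses the threshold earlier started with more target. The sketch below illustrates only that arithmetic. The function name, the assumption of ideal amplification efficiency, and the Ct values in the usage lines are hypothetical and are not taken from the study.

```python
def relative_template_fold_change(ct_reference: float,
                                  ct_sample: float,
                                  efficiency: float = 1.0) -> float:
    """Approximate fold difference in starting template between two samples.

    Assumption (not from the study): the amplification factor per cycle is
    (1 + efficiency), so efficiency = 1.0 means perfect doubling per cycle.
    A result above 1 means the sample has more template than the reference.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    # A lower Ct in the sample gives a positive exponent, i.e. more template.
    delta_ct = ct_reference - ct_sample
    return (1.0 + efficiency) ** delta_ct


# Hypothetical Ct values, for illustration only.
fold = relative_template_fold_change(ct_reference=30.0, ct_sample=27.0)
print(f"Estimated fold change in starting template: {fold:.1f}")
```

Under these assumptions, a sample whose Ct is three cycles lower than the reference corresponds to about an eight-fold higher starting template. Real assays rarely reach ideal efficiency, so absolute viral load would need a standard curve.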