Highlights
No outgroup was used because the source of SARS-CoV-2 is unknown;
16373 SARS-CoV-2 genomes were included in the evolutionary analysis;
Nine key specific sites in high linkage and four major haplotypes were found;
Epidemic trends and possible earlier origins of SARS-CoV-2 were indicated.
Abstract
Objectives
To reveal the epidemic trend and possible origins of SARS-CoV-2, which has infected millions of people and spread quickly all over the world, by exploring its evolution and molecular characteristics based on a large number of genomes.
Methods
Various evolutionary analysis methods were employed.
Results
The estimated Ka/Ks ratio of SARS-CoV-2 is 1.008 or 1.094 based on 622 or 3624 SARS-CoV-2 genomes, respectively, and the time to the most recent common ancestor (tMRCA) was inferred to be late September 2019. Furthermore, nine key specific sites in high linkage and four major haplotypes (H1, H2, H3 and H4) were found. The Ka/Ks ratio, detected population size and development trend of each major haplotype showed that the H3 and H4 subgroups were undergoing purifying evolution and almost disappeared after detection, indicating that H3 and H4 might have existed for a long time, whereas the H1 and H2 subgroups were undergoing near-neutral or neutral evolution and increased globally over time. Notably, the frequency of H1 was generally high in Europe and correlated with the death rate (r > 0.37).
Conclusions
In this study, the evolution and molecular characteristics of more than 16000 genomic sequences provided a new perspective for revealing the epidemiology of SARS-CoV-2.
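The Ka/Ks values above are read against 1: a ratio well below 1 is usually taken as purifying selection, a ratio near 1 as neutral or near-neutral evolution, and a ratio well above 1 as positive selection. The following is a minimal sketch of that reading only, not the estimation method used in the study. The function names and the tolerance of 0.1 around 1 are assumptions chosen for illustration.

```python
def kaks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def selection_regime(ratio: float, tolerance: float = 0.1) -> str:
    """Classify a Ka/Ks ratio; `tolerance` is an illustrative cut-off, not a standard."""
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "near neutral"


# The two ratios reported in the abstract (622 and 3624 genomes).
for reported in (1.008, 1.094):
    print(reported, selection_regime(reported))
```

With this cut-off both reported ratios fall in the near-neutral band; a stricter tolerance would change that classification, so the threshold should be chosen with the estimator's uncertainty in mind.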
Background
To obtain antibodies that recognize natural proteins, the antigenic determinants of those proteins can be predicted and then produced as polypeptides. The polypeptides can be coupled with corresponding vectors to stimulate the immune system to produce the corresponding antibodies, which is also a simple and effective approach to vaccine development. The discovery of epitopes therefore aids the development of a SARS-CoV-2 vaccine.
Methods
The analyses covered epitopes on three proteins, the spike (S), envelope (E) and membrane (M) proteins, which are located on the lipid envelope of SARS-CoV-2. Based on the NCBI Reference Sequence NC_045512.2, the conformational and linear B cell epitopes of the surface proteins were predicted separately by various prediction methods. Furthermore, the conservation of the epitopes, their adaptability and other evolutionary characteristics were analyzed using whole-genome sequences of SARS-CoV-2 obtained from GISAID.
Results
Seven epitopes were predicted, including six linear epitopes and one conformational epitope. One of the linear epitopes and the conformational epitope consist of an identical sequence but represent different forms of epitope. Notably, all 6 identified epitopes were conserved in nearly 3500 SARS-CoV-2 genomes, suggesting that stable and long-acting epitopes can be obtained despite a high frequency of amino acid mutation; this deserves further study at the experimental level.
Conclusion
The findings would facilitate vaccine development and have the potential to be applied directly to the prevention of this disease, as well as to the prevention of possible threats from other types of coronavirus.
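The conservation check described in the Methods and Results can be illustrated as the fraction of sequences that contain an epitope unchanged. This is a minimal sketch under stated assumptions, not the procedure used in the study: it assumes already translated protein sequences and exact substring matching, and the function name and the toy sequences are hypothetical placeholders, not real SARS-CoV-2 data or predicted epitopes.

```python
def epitope_conservation(epitope: str, proteins: list[str]) -> float:
    """Fraction of protein sequences containing the epitope as an exact substring."""
    if not proteins:
        raise ValueError("at least one protein sequence is required")
    hits = sum(1 for protein in proteins if epitope in protein)
    return hits / len(proteins)


# Hypothetical toy sequences; the third carries one substitution inside the epitope.
toy_proteins = ["MKTAYIAK", "MKTAYIAK", "MKTGYIAK"]
toy_epitope = "TAYI"

fraction = epitope_conservation(toy_epitope, toy_proteins)
print(f"{toy_epitope} conserved in {fraction:.0%} of sequences")
```

Exact matching is the strictest definition of conservation; a real analysis would more likely work from an alignment and might tolerate conservative substitutions.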