ABSTRACT
The mutation pattern of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) changes constantly with the places of transmission, but the reason remains to be revealed. Here, we present a study that comprehensively analyzes the potential selective pressure of immune system restriction, which can drive mutations in circulating SARS-CoV-2 isolates. The results showed that the most common mutation sites of SARS-CoV-2 proteins were located on the non-structural protein ORF1ab and the structural protein Spike. Further analysis revealed that mutations in cross-reactive epitopes between SARS-CoV-2 and seasonal coronaviruses may help SARS-CoV-2 escape cellular immunity under long-term and large-scale community transmission. Meanwhile, mutations on the Spike protein may enhance the ability of SARS-CoV-2 to enter host cells and escape recognition by B-cell immunity. This study improves the understanding of the evolutionary direction of SARS-CoV-2 and warns about its potential immune escape ability, which may provide important guidance for vaccine design.
Results
Workflow of mutation pattern analysis for SARS-CoV-2. To date, the worldwide community spread of SARS-CoV-2 could have elicited monitoring and response by the human immune system, including HLA-mediated cellular immunity and B-cell receptor (BCR)-mediated humoral immunity. Accordingly, the virus would mutate under this selective pressure to escape the acquired immune system. For cellular immunity, both HLA-I and HLA-II molecules present multiple alleles across ethnicities, and this allelic diversity leads to the presentation of different peptides. Thus, the HLA-mediated selective pressure is closely related to the spread of the virus among different ethnic groups. The worldwide spread of SARS-CoV-2 therefore offers a valuable opportunity to reveal the immune system-mediated selective pressure and the potential evolutionary direction of SARS-CoV-2.
Here, we provide a comprehensive analysis at four levels: 1) mapping the mutations on the whole genome sequence of SARS-CoV-2 and deriving all potential T-cell epitopes (PTEs) involving mutation sites (Figure 1a), 2) analyzing the potential peptides based on the circulating regions of viruses worldwide and the local dominant alleles (Figure 1b), 3) revealing the selective pressure of HLA through cross-reactive peptides (CRPs) between seasonal HCoVs and SARS-CoV-2 (Figure 1c), and 4) evaluating the binding affinity of S protein mutants for human ACE2 and binding antibodies (Figure 1d). Results indicated that: 1) mutations occurred across the whole genome of SARS-CoV-2, including all structural and non-structural proteins, and the most frequent mutation sites were in the structural protein Spike (S) and the non-structural protein ORF1ab, 2) the frequent mutation sites and strains were discovered in countries such as the United States and the United Kingdom, which have suffered from long-term and large-scale community transmission, 3) the CRPs bet...