This is a PDF file of a peer-reviewed paper that has been accepted for publication. Although unedited, the content has been subjected to preliminary formatting. Nature is providing this early version of the typeset paper as a service to our authors and readers. The text and figures will undergo copyediting and a proof review before the paper is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.
SARS-CoV-2 is a coronavirus responsible for the COVID-19 pandemic. To understand its pathogenicity and antigenic potential, and to develop diagnostic and therapeutic tools, it is essential to portray the full repertoire of its expressed proteins. The SARS-CoV-2 coding capacity map is currently based on computational predictions and relies on homology to other coronaviruses. Since coronaviruses differ in their protein array, especially in the variety of accessory proteins, it is crucial to characterize the specific collection of SARS-CoV-2 translated open reading frames (ORFs) in an unbiased and open-ended manner. Utilizing a suite of ribosome profiling techniques, we present a high-resolution map of the SARS-CoV-2 coding regions, allowing us to accurately quantify the expression of canonical viral ORFs and to identify 23 novel unannotated viral ORFs. These ORFs include several in-frame internal ORFs lying within existing ORFs, resulting in N-terminally truncated products, as well as internal out-of-frame ORFs, which generate novel polypeptides. Finally, we detected a prominent initiation at a CUG codon located in the 5'UTR. Although this codon is shared by all SARS-CoV-2 transcripts, the initiation was specific to the genomic RNA, indicating that the genomic RNA harbors unique features that may affect ribosome engagement. Overall, our work reveals the full coding capacity of the SARS-CoV-2 genome, providing a rich resource that will form the basis of future functional studies and diagnostic efforts.
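The distinction drawn above between in-frame and out-of-frame internal ORFs comes down to the offset between the two start positions, taken modulo 3. The following is a minimal sketch of that check only, not the authors' ribosome profiling pipeline. The function name `classify_internal_orf` and the coordinates in the usage lines are hypothetical and chosen purely for illustration; they are not SARS-CoV-2 genome positions.

```python
def classify_internal_orf(host_start: int, host_end: int,
                          orf_start: int, orf_end: int) -> str:
    """Classify an ORF relative to a host ORF on the same strand.

    Coordinates are assumed to be 1-based, inclusive nucleotide positions
    (an assumption of this sketch, not a convention taken from the paper).
    """
    # An internal ORF must lie entirely within the host ORF.
    if not (host_start <= orf_start and orf_end <= host_end):
        return "not internal"
    # Same reading frame if the start offset is a multiple of 3.
    offset = (orf_start - host_start) % 3
    return "in-frame" if offset == 0 else "out-of-frame"


# Hypothetical toy coordinates, for illustration only.
print(classify_internal_orf(100, 399, 130, 399))  # start offset 30
print(classify_internal_orf(100, 399, 131, 250))  # start offset 31
```

An in-frame internal ORF shares the host's stop codon and therefore yields an N-terminally truncated product, whereas an out-of-frame one is read in a different frame and encodes a different polypeptide.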
T cell-mediated immunity plays an important role in controlling SARS-CoV-2 infection, but the repertoire of naturally processed and presented viral epitopes on class I human leukocyte antigen (HLA-I) remains uncharacterized. Here, we report the first HLA-I immunopeptidome of SARS-CoV-2 in two cell lines at different times post infection using mass spectrometry. We found HLA-I peptides derived not only from canonical open reading frames (ORFs) but also from internal out-of-frame ORFs in spike and nucleocapsid not captured by current vaccines. Some peptides from out-of-frame ORFs elicited T cell responses in a humanized mouse model and individuals with COVID-19 that exceeded responses to canonical peptides, including some of the strongest epitopes reported to date. Whole-proteome analysis of infected cells revealed that early expressed viral proteins contribute more to HLA-I presentation and immunogenicity. These biological insights, as well as the discovery of out-of-frame ORF epitopes, will facilitate selection of peptides for immune monitoring and vaccine development.
At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the ICTV. We recommend calling the 39 codon Spike-overlapping ORF ORF2b; the 41, 57, and 22 codon ORF3a-overlapping ORFs ORF3c, ORF3d, and ORF3b; the 33 codon ORF3d isoform ORF3d-2; and the 97 and 73 codon Nucleocapsid-overlapping ORFs ORF9b and ORF9c. Finally, we document conflicting usage of the name ORF3b in 32 studies, and consequent erroneous inferences, stressing the importance of reserving identical names for homologs. We recommend that authors referring to these ORFs provide lengths and coordinates to minimize ambiguity due to prior usage of alternative names.
Human herpesvirus-6 (HHV-6) A and B are ubiquitous betaherpesviruses, infecting the majority of the human population. They have large genomes, and our understanding of their protein coding potential is far from complete. Here, we employ ribosome profiling and systematic transcript analysis to experimentally define HHV-6 translation products. We identify hundreds of new open reading frames (ORFs), including upstream ORFs (uORFs) and internal ORFs (iORFs), generating a complete unbiased atlas of the HHV-6 proteome. By integrating systematic data from the prototypic betaherpesvirus, human cytomegalovirus, we uncover numerous uORFs and iORFs conserved across betaherpesviruses and we show uORFs are enriched in late viral genes. We identified three highly abundant HHV-6 encoded long non-coding RNAs, one of which generates a non-polyadenylated stable intron appearing to be a conserved feature of betaherpesviruses. Overall, our work reveals the complexity of HHV-6 genomes and highlights novel features conserved between betaherpesviruses, providing a rich resource for future functional studies.
T cell-mediated immunity may play a critical role in controlling and establishing protective immunity against SARS-CoV-2 infection; yet the repertoire of viral epitopes responsible for T cell response activation remains mostly unknown. Identification of viral peptides presented on class I human leukocyte antigen (HLA-I) can reveal epitopes for recognition by cytotoxic T cells and potential incorporation into vaccines. Here, we report the first HLA-I immunopeptidome of SARS-CoV-2 in two human cell lines at different times post-infection using mass spectrometry. We found HLA-I peptides derived not only from canonical ORFs, but also from internal out-of-frame ORFs in Spike and Nucleoprotein not captured by current vaccines. Proteomics analyses of infected cells revealed that SARS-CoV-2 may interfere with antigen processing and immune signaling pathways. Based on the endogenously processed and presented viral peptides that we identified, we estimate that a pool of 24 peptides would provide one or more peptides for presentation by at least one HLA allele in 99% of the human population. These biological insights and the list of naturally presented SARS-CoV-2 peptides will facilitate data-driven selection of peptides for immune monitoring and vaccine development.