The major histocompatibility complex (MHC) class I pathway supports the detection of cancer and viruses by the immune system. It presents parts of proteins (peptides) from inside a cell on its membrane surface, enabling visiting immune cells that detect non-self peptides to terminate the cell. The ability to predict whether a peptide will be presented on MHC class I molecules helps in designing vaccines that activate the immune system to destroy the invading disease protein. We designed a prediction model using a BERT-based architecture (ImmunoBERT) that takes as input a peptide and its surrounding regions (N- and C-terminals) along with a set of MHC class I (MHC-I) molecules. We present a novel application of well-known interpretability techniques, SHAP and LIME, to this domain, and we use these results along with 3D structure visualizations and amino acid frequencies to understand and identify the most influential parts of the input amino acid sequences contributing to the output. In particular, we find that amino acids close to the peptides' N- and C-terminals are highly relevant. Additionally, some positions within the MHC proteins (in particular in the A, B and F pockets) are often assigned a high importance ranking, which confirms biological studies and the distances in the structure visualizations. The source code can be found at https://github.com/hcgasser/ImmunoBERT. * jointly supervised. 1st Workshop on eXplainable AI approaches for debugging and diagnosis (XAI4Debugging@NeurIPS2021).
Tumor antigens can emerge through multiple mechanisms, including translation of non-coding genomic regions. This non-canonical category of antigens has recently gained attention; however, our understanding of how they recur within and between cancer types is still in its infancy. Therefore, we developed a proteogenomic pipeline based on deep learning de novo mass spectrometry to enable the discovery of non-canonical MHC-associated peptides (ncMAPs) from non-coding regions. Considering that the emergence of tumor antigens can also involve post-translational modifications, we included an open search component in our pipeline. Leveraging the wealth of mass spectrometry-based immunopeptidomics, we analyzed 26 MHC class I immunopeptidomic studies of 9 different cancer types. We validated the de novo identified ncMAPs, along with the most abundant post-translational modifications, using spectral matching and controlled their false discovery rate (FDR) at 1%. Interestingly, the non-canonical presentation appeared to be 5-fold enriched for the A03 HLA supertype, with a projected population coverage of 54.85%. Here, we reveal an atlas of 8,601 ncMAPs with varying levels of cancer selectivity and suggest 17 cancer-selective ncMAPs as attractive targets according to a stringent cutoff. In summary, the combination of the open-source pipeline and the atlas of ncMAPs reported herein could facilitate the identification and screening of ncMAPs as targeting agents for T-cell therapies or vaccine development.
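The abstract does not say how the 1% FDR was estimated. One common approach in mass spectrometry is target-decoy competition, and the sketch below assumes that approach purely for illustration. The function name `accept_at_fdr` and the `(score, is_decoy)` input layout are hypothetical and are not taken from the pipeline.

```python
from typing import List, Tuple


def accept_at_fdr(psms: List[Tuple[float, bool]], fdr: float = 0.01) -> List[float]:
    """Return the scores of target matches accepted at the given FDR.

    psms: (score, is_decoy) pairs; a higher score means a better match.
    Assumption: FDR at a score threshold is estimated as decoys / targets.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR if the threshold were placed just below each match.
    targets = decoys = 0
    fdrs = []
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at which a match would still be accepted,
    # computed as a running minimum from the worst score upwards.
    qvals = fdrs[:]
    for i in range(len(qvals) - 2, -1, -1):
        qvals[i] = min(qvals[i], qvals[i + 1])

    return [score for (score, is_decoy), q in zip(ranked, qvals)
            if not is_decoy and q <= fdr]
```

Taking a running minimum keeps the accepted set monotone in score: a good match is not rejected only because a decoy happens to sit directly above it in the ranking.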
At the end of 2013 the real yields that the UK government had to pay on its debt were negative over the whole curve. Several possible explanations are available for this phenomenon: central bank action, regulatory changes, demographic developments and economic conditions. The first two can result from deliberate intervention by the state in the financial markets and can be labelled as financial repression. We explain the historic precedents for governments to use financial repression to manage their debt, look into the influence of regulation on asset allocation for insurers and pension funds, and introduce the concept of a balance-sheet recession.
<p>Comparison of COD-dipp ncMAPs with other studies. Because the COD-dipp ncMAPs are restricted to the 3-frame translation (3FT) of protein-coding genes, sequences from the literature were aligned to the same 3FT database for comparison purposes. The intersection is based on genomic coordinates to deal with sequences that partially match (i.e., longer, shorter, or partially overlapping). Because the Venn diagram is generated by overlapping genomic coordinates of the ncMAPs, the original counts for each study are listed from left to right (i.e., on the right-hand side of panel C, the notation 29/41 refers to 29 instances for Chong and colleagues 2020 and 41 for COD-dipp). <b>A,</b> Comparison with peptide-PRISM published ncMAPs at a 10% FDR. COD-dipp ncMAPs were restricted to 3 studies in common with Erhard and colleagues 2020. <b>B,</b> Comparison with peptide-PRISM published ncMAPs at a 1% FDR. COD-dipp ncMAPs were restricted to 3 studies in common with Erhard and colleagues 2020. <b>C,</b> Comparison of the atlas of ncMAPs revealed by COD-dipp to 3 previous studies.</p>
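The coordinate-based intersection described in the caption can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Region` type, the 0-based half-open coordinate convention, and the strand check are all assumptions. It shows why one intersection can carry two counts (such as 29/41): a region from one study may overlap several regions from the other.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """Genomic span of one peptide (hypothetical layout)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str


def overlaps(a: Region, b: Region) -> bool:
    """True if the two spans share at least one base on the same strand."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def count_overlapping(study_a: List[Region],
                      study_b: List[Region]) -> Tuple[int, int]:
    """Count the regions in each study that overlap any region in the other."""
    n_a = sum(any(overlaps(a, b) for b in study_b) for a in study_a)
    n_b = sum(any(overlaps(a, b) for a in study_a) for b in study_b)
    return n_a, n_b


# One longer peptide in study A overlaps two shorter ones in study B.
study_a = [Region("chr1", 100, 127, "+")]
study_b = [Region("chr1", 90, 117, "+"),
           Region("chr1", 120, 150, "+"),
           Region("chr2", 100, 127, "+")]
print(count_overlapping(study_a, study_b))
```

The pairwise scan is quadratic in the number of regions. For atlas-sized inputs, sorting the regions or using an interval tree would be the usual replacement.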