Current sequencing methods allow for detailed samples of T cell receptor (TCR) repertoires. To determine from a repertoire whether its host has been exposed to a target, computational tools that predict TCR-epitope binding are required. Current tools are based on conserved motifs and are applied to peptides with many known binding TCRs. We employ new Natural Language Processing (NLP) based methods to predict whether any given TCR and peptide bind. We combined large-scale TCR-peptide dictionaries with deep learning methods to produce ERGO (pEptide tcR matchinG predictiOn), a highly specific and generic TCR-peptide binding predictor. A set of standard tests is defined for evaluating peptide-TCR binding prediction: detecting the TCRs that bind a given peptide/antigen, choosing among a set of candidate peptides for a given TCR, and determining whether any given TCR-peptide pair binds. ERGO reaches results similar to state-of-the-art methods in these tests even when not trained specifically for each test. The software implementation and data sets are available at https://github.com/louzounlab/ERGO. ERGO is also available through a webserver at http://tcr.cs.biu.ac.il/.
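The three tests named in the abstract above can be sketched as a small evaluation harness. This is a minimal illustration, not the ERGO code: the function names, the `ScoreFn` signature, and `placeholder_score` are all hypothetical, and the placeholder scorer is a deliberately naive stand-in for a trained model.

```python
from typing import Callable, Sequence, Tuple

# Assumed interface: any predictor mapping (TCR CDR3, peptide) to a score.
ScoreFn = Callable[[str, str], float]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in sample size, fine for a sketch.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def single_peptide_auc(score_fn: ScoreFn, peptide: str,
                       tcrs: Sequence[str], labels: Sequence[int]) -> float:
    """Test 1: detect which TCRs bind one fixed peptide."""
    return auc([score_fn(t, peptide) for t in tcrs], labels)


def select_peptide(score_fn: ScoreFn, tcr: str,
                   candidates: Sequence[str]) -> str:
    """Test 2: choose the highest-scoring candidate peptide for one TCR."""
    return max(candidates, key=lambda p: score_fn(tcr, p))


def pairwise_auc(score_fn: ScoreFn, pairs: Sequence[Tuple[str, str]],
                 labels: Sequence[int]) -> float:
    """Test 3: decide whether arbitrary (TCR, peptide) pairs bind."""
    return auc([score_fn(t, p) for t, p in pairs], labels)


def placeholder_score(tcr: str, peptide: str) -> float:
    """NOT a real predictor: share of peptide residue types also in the TCR."""
    return len(set(tcr) & set(peptide)) / len(set(peptide))
```

Because all three tests take the same `score_fn`, a single trained model can be evaluated on each of them without retraining, which matches the abstract's point about not training specifically for each test.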
Background: Immune-mediated protection is mediated by T cells expressing pathogen-specific T cell antigen receptors (TCR) that are maintained at diverse sites of infection as tissue-resident memory T cells (TRM) or that disseminate as circulating effector-memory (TEM), central memory (TCM), or terminal effector (TEMRA) subsets in blood and tissues. The relationship between circulating and tissue-resident T cell subsets in humans remains elusive but is important for promoting site-specific protective immunity.
Methods: We analyzed the TCR repertoire of the major memory CD4+ and CD8+ T cell subsets (TEM, TCM, TEMRA, and TRM) isolated from blood and/or lymphoid organs (spleen, lymph nodes, bone marrow) and lungs of nine organ donors, and blood of three living individuals spanning five decades of life. High-throughput sequencing data from the variable (V) portion of individual TCR genes for each subset, tissue, and individual were analyzed for clonal diversity, expansion, and overlap between lineages, T cell subsets, and anatomic sites. TCR repertoires were further analyzed for TRBV gene usage and CDR3 edit distance.
Results: Across blood, lymphoid organs, and lungs, human memory and effector CD8+ T cells exhibit greater clonal expansion and distinct TRBV usage compared to CD4+ T cell subsets. Extensive sharing of clones between tissues was observed for CD8+ T cells; large clones specific to TEMRA cells were present in all sites, while TEM cells contained clones shared between sites and with TRM. For CD4+ T cells, TEM clones exhibited the most sharing between sites, followed by TRM, while TCM clones were diverse with minimal sharing between sites and subsets. Within sites, TRM clones exhibited tissue-specific expansions and maintained clonal diversity with age, in contrast to the age-associated clonal expansions in circulating memory subsets. Edit distance analysis revealed tissue-specific biases in clonal similarity.
Conclusions: Our results show that the human memory T cell repertoire comprises clones that persist across sites and subsets, along with clones that are more restricted to certain subsets and/or tissue sites. We also provide evidence that the tissue plays a key role in maintaining memory T cells with age, bolstering the rationale for site-specific targeting of memory reservoirs in vaccines and immunotherapies.
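The clonal overlap and CDR3 edit distance analyses mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which overlap statistic or distance variant was used, so the Jaccard index and plain Levenshtein distance below are assumptions, and the CDR3 sequences are invented for illustration only.

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two CDR3 amino-acid strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def jaccard_overlap(clones_a: set, clones_b: set) -> float:
    """Share of distinct clones found in both samples (assumed statistic)."""
    union = clones_a | clones_b
    return len(clones_a & clones_b) / len(union) if union else 0.0


def mean_cross_distance(clones_a, clones_b) -> float:
    """Mean pairwise edit distance between the clones of two samples."""
    pairs = list(product(clones_a, clones_b))
    return sum(edit_distance(x, y) for x, y in pairs) / len(pairs)


# Hypothetical CDR3 sequences for two subset/site combinations.
lung_trm = {"CASSLGQAYEQYF", "CASSPGTGGYEQYF", "CASSLAPGATNEKLFF"}
blood_tem = {"CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASRPGLAGGRPEQYF"}

overlap = jaccard_overlap(lung_trm, blood_tem)
similarity_gap = mean_cross_distance(lung_trm, blood_tem)
```

Exact-match overlap captures shared clones, while the edit distance also registers near-identical sequences that differ by a residue or two, which is why the two measures can disagree between sites.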
Recent advances in T cell receptor (TCR) repertoire sequencing allow for the characterization of repertoire properties, as well as the frequency and sharing of specific TCRs. However, there is no efficient measure for the local density of a given TCR. TCRs are often described through their complementarity-determining region 3 (CDR3) sequences, their V/J usage, or their clone size. We show here that the local repertoire density can be estimated using a combined representation of these components through distance-conserving autoencoders and Kernel Density Estimates (KDE). We present ELATE (Encoder-based LocAl Tcr dEnsity) and show that the resulting density of a sample can be used as a novel measure to study repertoire properties. The cross-density between two samples can be used as a similarity matrix to fully characterize samples from the same host. Finally, the same projection, in combination with machine learning algorithms, can be used to predict TCR-peptide binding through the local density of known TCRs binding a specific target.
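The density and cross-density ideas in this abstract can be sketched with a plain Gaussian KDE. This is not the ELATE implementation: the autoencoder is replaced by random placeholder vectors, and the function name, bandwidth, and embedding dimension are all assumptions made for illustration.

```python
import numpy as np


def local_density(points: np.ndarray, queries: np.ndarray,
                  bandwidth: float = 1.0) -> np.ndarray:
    """Gaussian KDE built from `points`, evaluated at each row of `queries`."""
    n, d = points.shape
    # Pairwise squared Euclidean distances, shape (len(queries), n).
    diffs = queries[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    # Normalization of an isotropic d-dimensional Gaussian, averaged over n points.
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return kernel.sum(axis=1) / norm


rng = np.random.default_rng(0)
# Placeholder embeddings standing in for autoencoder outputs (assumed 2-D).
sample_a = rng.normal(0.0, 1.0, size=(200, 2))
sample_b = rng.normal(3.0, 1.0, size=(200, 2))

# Density of each TCR of sample A within its own repertoire
# (each point contributes to its own estimate).
self_density = local_density(sample_a, sample_a)
# Cross-density: how dense sample B is around each TCR of sample A.
cross_density = local_density(sample_b, sample_a)
```

Averaging `cross_density` over all pairs of samples would give one entry per pair, which is one way a sample-by-sample similarity matrix could be assembled.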
One Sentence Summary: The combination of advanced tools from natural language processing and large-scale dictionaries of T cell receptors and their target peptides precisely predicts whether a T cell would bind a specific target.
Abstract: The T cell repertoire is composed of T cell receptors (TCR) selected by their cognate MHC-peptides and naive TCRs that do not bind known peptides. While the task of distinguishing a peptide-binding TCR from a naive TCR unlikely to bind any peptide can be performed using sequence motifs, distinguishing between TCRs binding different peptides requires more advanced methods. Such a prediction is key to using TCR repertoires as disease-specific biomarkers. We used large-scale TCR-peptide dictionaries with state-of-the-art natural language processing (NLP) methods to produce ERGO (pEptide tcR matchinG predictiOn), a highly specific classifier to predict which TCR binds to which peptide. We successfully employed ERGO for two related tasks: discrimination between peptide-binding and naive TCRs, and the more complicated task of distinguishing between TCRs that bind different peptides. We show that ERGO significantly outperforms all current methods for classification of TCRs that bind peptides and, more importantly, can distinguish the specific target of a TCR among a large set of peptides. The software implementation and data sets are available at: https://github.com/IdoSpringer/ERGO
Restoration of T-cell repertoire diversity after allogeneic bone marrow transplantation (allo-BMT) is crucial for immune recovery. T-cell diversity is produced by rearrangements of germline gene segments (V, (D), and J) of the T-cell receptor (TCR) α and β chains. During segment joining, nucleotides inserted and deleted at the junctions between pairs of rearranging genes form the complementarity determining region 3 (CDR3). We used a new means of comparing multiple T-cell repertoires to follow T-cell repertoire changes post allo-BMT in HLA-matched related donor and recipient pairs. Our analyses of the differences between donor and recipient CDR3 beta composition and V-gene profile show that a duality exists between the reconstitution of these two following transplantation: while CDR3 sequences were consistently influenced by recipients' parameters, V genes followed a time-dependent pattern, first adopting the donor profile following transplant and then shifting back to the recipients' profile. The final long-term repertoire was more similar to the recipient's original repertoire than to the donor's; some recipients converged within months while others took multiple years. Based on the results of our analyses, we propose that donor-recipient V-gene distribution differences may serve as clinical biomarkers for monitoring immune recovery.
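One way to quantify a donor-recipient V-gene distribution difference is sketched below. The abstract does not name its comparison measure, so the Jensen-Shannon divergence is a swapped-in assumption rather than the authors' method, and the gene calls are toy data, not results.

```python
import math
from collections import Counter


def usage_profile(v_genes, gene_order):
    """Relative frequency of each V gene, in a fixed gene order."""
    counts = Counter(v_genes)
    total = sum(counts[g] for g in gene_order)
    return [counts[g] / total for g in gene_order]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; zero-probability terms are skipped."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


# Toy V-gene calls, one entry per sequenced clone (hypothetical).
donor = ["TRBV5-1"] * 6 + ["TRBV7-9"] * 3 + ["TRBV20-1"] * 1
recipient_pre = ["TRBV5-1"] * 2 + ["TRBV7-9"] * 2 + ["TRBV20-1"] * 6
recipient_post = ["TRBV5-1"] * 5 + ["TRBV7-9"] * 3 + ["TRBV20-1"] * 2

genes = sorted(set(donor) | set(recipient_pre) | set(recipient_post))
post = usage_profile(recipient_post, genes)

# Distance of the post-transplant profile from each reference profile.
to_donor = js_divergence(post, usage_profile(donor, genes))
to_recipient = js_divergence(post, usage_profile(recipient_pre, genes))
```

Tracking `to_donor` and `to_recipient` across sampling time points would show whether a profile is moving toward the donor or back toward the recipient, which is the kind of trajectory the abstract describes.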