Fitness landscapes [1,2], depictions of how genotypes manifest at the phenotypic level, form the basis for our understanding of many areas of biology [2–7], yet their properties remain elusive. Studies addressing this issue often consider specific genes and their function as a proxy for fitness [2,4], experimentally assessing the impact on function of single mutations and their combinations in a specific sequence [2,5,8–15] or in different sequences [2,3,5,16–18]. However, systematic high-throughput studies of the local fitness landscape of an entire protein have not yet been reported. Here, we chart an extensive region of the local fitness landscape of the green fluorescent protein from Aequorea victoria (avGFP) by measuring the native function, fluorescence, of tens of thousands of derivative genotypes of avGFP. We find that its fitness landscape is narrow, with half of genotypes with two mutations showing reduced fluorescence and half of genotypes with five mutations being completely non-fluorescent. The narrowness is enhanced by epistasis, which was detected in up to 30% of genotypes with multiple mutations, arising mostly through the cumulative impact of slightly deleterious mutations causing a threshold-like decrease of protein stability and concomitant loss of fluorescence. A model of orthologous sequence divergence spanning hundreds of millions of years predicted the extent of epistasis in our data, indicating congruence between the fitness landscape properties at the local and global scales. The characterization of the local fitness landscape of avGFP has important implications for a number of fields including molecular evolution, population genetics and protein design.
Deep profiling of antibody and T cell-receptor repertoires by means of high-throughput sequencing has become an attractive approach for adaptive immunity studies, but its power is substantially compromised by the accumulation of PCR and sequencing errors. Here we report MIGEC (molecular identifier groups-based error correction), a strategy for high-throughput sequencing data analysis. MIGEC allows for nearly absolute error correction while fully preserving the natural diversity of complex immune repertoires.
The decrease of TCR diversity with aging has never been studied by direct methods. In this study, we combined high-throughput Illumina sequencing with unique cDNA molecular identifier technology to achieve deep and precisely normalized profiling of TCR β repertoires in 39 healthy donors aged 6–90 y. We demonstrate that TCR β diversity per 10^6 T cells decreases roughly linearly with age, with significant reduction already apparent by age 40. The percentage of naive T cells showed a strong correlation with measured TCR diversity and decreased linearly up to age 70. Remarkably, the oldest group (average age 82 y) was characterized by a higher percentage of naive CD4+ T cells, lower abundance of expanded clones, and increased TCR diversity compared with the previous age group (average age 62 y), suggesting the influence of age selection and association of these three related parameters with longevity. Interestingly, cross-analysis of individual TCR β repertoires revealed a set of >10,000 of the most representative public TCR β clonotypes, whose abundance among the top 100,000 clones correlated with TCR diversity and decreased with aging.
The diversity, architecture, and dynamics of the TCR repertoire largely determine our ability to effectively withstand infections and malignancies with minimal mistargeting of immune responses. In this study, we have employed deep TCRβ repertoire sequencing with normalization based on unique molecular identifiers to explore the long-term dynamics of T cell immunity. We demonstrate remarkable stability of the repertoire, where approximately half of all T cells in peripheral blood are represented by clones that persist and generally preserve their frequencies for 3 y. We further characterize the extremes of lifelong TCR repertoire evolution, analyzing samples ranging from umbilical cord blood to centenarian peripheral blood. We show that the fetal TCR repertoire, albeit structurally maintained within regulated borders due to the lower numbers of randomly added nucleotides, is not limited with respect to observed functional diversity. We reveal decreased efficiency of nonsense-mediated mRNA decay in umbilical cord blood, which may reflect specific regulatory mechanisms in development. Furthermore, we demonstrate that human TCR repertoires are functionally more similar at birth but diverge during life, and we track the lifelong behavior of CMV- and EBV-specific T cell clonotypes. Finally, we reveal gender differences in the dynamics of TCR diversity constriction, which disappear in the oldest age group. Based on our data, we propose a more general explanation for the previous observations on the relationships between longevity and immunity.
High-throughput sequencing analysis of hypermutating immunoglobulin (IG) repertoires remains a challenging task. Here we present a robust protocol for the full-length profiling of human and mouse IG repertoires. This protocol uses unique molecular identifiers (UMIs) introduced in the course of cDNA synthesis to control bottlenecks and to eliminate PCR and sequencing errors. Using asymmetric 400+100-nt paired-end Illumina sequencing and UMI-based assembly with the new version of the MIGEC software, the protocol allows up to 750-nt lengths to be sequenced in an almost error-free manner. This sequencing approach should also be applicable to various tasks beyond immune repertoire studies. In IG profiling, the achieved length of high-quality sequence covers the variable region of even the longest chains, along with the fragment of a constant region carrying information on the antibody isotype. The whole protocol, including preparation of cells and libraries, sequencing and data analysis, takes 5 to 6 d.
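The idea of using UMIs to remove PCR and sequencing errors can be illustrated with a minimal Python sketch: reads that share a UMI are assumed to derive from the same cDNA molecule, so a per-position majority vote over them recovers the original sequence. This is an illustration of the general principle only, not the MIGEC implementation; the function name `umi_consensus`, the `min_group_size` parameter, and the toy reads are hypothetical, and real pipelines also handle quality scores, UMI errors, and read alignment.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_group_size=2):
    """Build one consensus sequence per UMI by per-position majority vote.

    reads: iterable of (umi, sequence) pairs (hypothetical input format).
    min_group_size: UMI groups with fewer reads are discarded, since a
    single read cannot be error-corrected by voting.
    """
    # Group reads by their molecular identifier.
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_group_size:
            continue
        # Simplification: keep only reads of the most common length so that
        # positions line up without an alignment step.
        common_len = Counter(len(s) for s in seqs).most_common(1)[0][0]
        seqs = [s for s in seqs if len(s) == common_len]
        # Majority vote at each position.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy example: three reads share UMI "AAC"; one carries an error at position 3.
toy_reads = [
    ("AAC", "ACGT"),
    ("AAC", "ACGT"),
    ("AAC", "ACTT"),
    ("GGT", "TTTT"),  # singleton group, dropped by min_group_size=2
]
result = umi_consensus(toy_reads)
```

In the toy input, the erroneous base in the third read is outvoted by the other two reads with the same UMI, and the singleton UMI group is dropped because it falls below the size threshold.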
Background: The immunoglobulins (IG) and the T cell receptors (TR) play a key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for deep T cell receptor repertoire profiling. However, specialised software is required for the rational analysis of the massive data generated by next-generation sequencing. Results: Here we introduce tcR, a new R package, representing a platform for the advanced analysis of T cell receptor repertoires, which includes diversity measures, identification of shared T cell receptor sequences, computation of gene usage statistics and other widely used methods. The tool has proven its utility in recent research studies. Conclusions: tcR is an R package for the advanced analysis of T cell receptor repertoires after extraction of primary TR sequences from raw sequencing reads. The stable version can be directly installed from The Comprehensive R Archive Network (http://cran.r-project.org/mirrors.html). The source code and development version are available at tcR GitHub (http://imminfo.github.io/tcr/) along with the full documentation and typical usage examples.
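Two of the analyses named above, diversity measures and identification of shared sequences, can be sketched in a few lines of Python. This sketch does not use or reproduce the tcR package or its API; the function names, the choice of Shannon entropy as the diversity measure, and the toy repertoires are assumptions made for illustration.

```python
import math


def shannon_entropy(clone_counts):
    """Shannon entropy (natural log) of a repertoire given clonotype counts."""
    total = sum(clone_counts)
    return -sum(
        (c / total) * math.log(c / total) for c in clone_counts if c > 0
    )


def shared_clonotypes(repertoire_a, repertoire_b):
    """Clonotype sequences present in both repertoires.

    Each repertoire is a dict mapping a CDR3 sequence to its read count
    (hypothetical input format).
    """
    return set(repertoire_a) & set(repertoire_b)


# Toy repertoires with made-up CDR3 amino-acid sequences.
donor_1 = {"CASSLGQYF": 10, "CASSPDRGF": 10, "CASRTGEAF": 10, "CASSQETQYF": 10}
donor_2 = {"CASSLGQYF": 35, "CASSYSNEQF": 5}

entropy_1 = shannon_entropy(donor_1.values())
shared = shared_clonotypes(donor_1, donor_2)
```

A perfectly even repertoire of four clonotypes, as in `donor_1`, has entropy ln 4, the maximum for four clones; skewed repertoires dominated by expanded clones score lower.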
Significance: T cells play a key role in the adaptive immune system. The broad repertoire of unique receptors expressed by T cells is in principle able to recognize a huge diversity of pathogens, but how to extract that information from blood samples remains unclear. By sequencing and analyzing the statistics of T cell receptors of subjects vaccinated against yellow fever, we identified vaccine-specific receptors that expanded following vaccination. We show that each individual has a unique response, which is nevertheless similar across subjects in its sequence composition, with a slightly higher similarity between twins. Our method can be used in the clinic to track disease-specific T cell clones expanding or contracting after infection, vaccination, or therapy.