Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in EXCEL spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies.
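The three-step pipeline described above (demultiplexing, clustering, filtering) can be illustrated with a minimal sketch. This is not the AMPLISAS implementation: the function names, the barcode-prefix matching, the one-mismatch clustering rule and the 10% frequency threshold are all assumptions chosen only to make the steps concrete.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Step (i): assign each read to a sample by its leading barcode.

    `barcodes` maps barcode -> sample name (hypothetical layout).
    Reads with no matching barcode are dropped.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


def cluster(seqs, max_mismatches=1):
    """Step (ii): collapse reads into unique sequences, then merge each
    lower-depth variant into an already accepted, more abundant sequence
    of the same length that differs by at most `max_mismatches` bases
    (an assumed, simplified clustering rule)."""
    clusters = {}
    for seq, depth in Counter(seqs).most_common():
        for rep in clusters:
            same_length = len(rep) == len(seq)
            if same_length and sum(x != y for x, y in zip(rep, seq)) <= max_mismatches:
                clusters[rep] += depth
                break
        else:
            clusters[seq] = depth
    return clusters


def filter_clusters(clusters, min_freq=0.1):
    """Step (iii): keep clusters whose within-sample frequency reaches
    `min_freq` (an assumed threshold) and return their frequencies."""
    total = sum(clusters.values())
    return {
        seq: depth / total
        for seq, depth in clusters.items()
        if depth / total >= min_freq
    }


def genotype(reads, barcodes):
    """Run the three steps and return sample -> {allele: frequency}."""
    return {
        sample: filter_clusters(cluster(seqs))
        for sample, seqs in demultiplex(reads, barcodes).items()
    }


if __name__ == "__main__":
    barcodes = {"AA": "s1", "CC": "s2"}
    reads = (
        ["AAACGT"] * 5      # true allele ACGT in sample s1
        + ["AAACGA"]        # single-base error, merged into ACGT
        + ["AATTTT"] * 4    # second allele TTTT in sample s1
        + ["CCGGGG"] * 3    # allele GGGG in sample s2
        + ["GGACGT"]        # no known barcode, discarded
    )
    print(genotype(reads, barcodes))
```

Processing sequences in order of decreasing depth means that rare variants are compared only against more abundant ones, which reflects the usual assumption that sequencing errors are less frequent than the true allele they derive from.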
Characterization of highly duplicated genes, such as genes of the major histocompatibility complex (MHC), where multiple loci often co-amplify, has until recently been hindered by insufficient read depths per amplicon. Here, we used ultra-deep Illumina sequencing to resolve genotypes at exon 3 of MHC class I genes in the sedge warbler (Acrocephalus schoenobaenus). We sequenced 24 individuals in two replicates and used this data, as well as a simulated data set, to test the effect of amplicon coverage (range: 500-20 000 reads per amplicon) on the repeatability of genotyping using four different genotyping approaches. A third replicate employed unique barcoding to assess the extent of tag jumping, that is swapping of individual tag identifiers, which may confound genotyping. The reliability of MHC genotyping increased with coverage and approached or exceeded 90% within-method repeatability of allele calling at coverages of >5000 reads per amplicon. We found generally high agreement between genotyping methods, especially at high coverages. High reliability of the tested genotyping approaches was further supported by our analysis of the simulated data set, although the genotyping approach relying primarily on replication of variants in independent amplicons proved sensitive to repeatable errors. According to the most repeatable genotyping method, the number of co-amplifying variants per individual ranged from 19 to 42. Tag jumping was detectable, but at such low frequencies that it did not affect the reliability of genotyping. We thus demonstrate that gene families with many co-amplifying genes can be reliably genotyped using HTS, provided that there is sufficient per amplicon coverage.
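As a rough illustration of how repeatability between two replicates might be scored, the sketch below computes the fraction of alleles called in both replicates out of all alleles called in either. The abstract does not state the exact formula used, so this definition, the function names and the allele labels are assumptions.

```python
def repeatability(calls_a, calls_b):
    """Shared alleles divided by all alleles called in either replicate
    (an assumed definition; the study's own measure may differ)."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        # No alleles called in either replicate: treat as fully consistent.
        return 1.0
    return len(a & b) / len(union)


def mean_repeatability(replicate_1, replicate_2):
    """Average the per-individual score over individuals present in both
    replicates. Each argument maps individual -> iterable of allele names."""
    shared = sorted(set(replicate_1) & set(replicate_2))
    scores = [repeatability(replicate_1[i], replicate_2[i]) for i in shared]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    rep1 = {"bird1": {"a1", "a2", "a3"}, "bird2": {"a5", "a6"}}
    rep2 = {"bird1": {"a2", "a3", "a4"}, "bird2": {"a5", "a6"}}
    print(mean_repeatability(rep1, rep2))
```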
The control of growth and development of all living organisms is a complex and dynamic process that requires the harmonious expression of numerous genes. Gene expression is mainly controlled by the activity of sequence-specific DNA binding proteins called transcription factors (TFs). Amongst the various classes of eukaryotic TFs, the MYB superfamily is one of the largest and most diverse, and it has considerably expanded in the plant kingdom. R2R3-MYBs have been extensively studied over the last 15 years. However, DNA-binding specificity has been characterized for only a small subset of these proteins. Therefore, one of the remaining challenges is the exhaustive characterization of the DNA-binding specificity of all R2R3-MYB proteins. In this study, we have developed a library of Arabidopsis thaliana R2R3-MYB open reading frames, whose DNA-binding activities were assayed in vivo (yeast one-hybrid experiments) with a pool of selected cis-regulatory elements. Altogether, 1904 interactions were assayed, leading to the discovery of specific patterns of interactions between the various R2R3-MYB subgroups and their DNA target sequences and to the identification of key features that govern these interactions. The present work provides a comprehensive in vivo analysis of R2R3-MYB binding activities that should help in predicting new DNA motifs and identifying new putative target genes for each member of this very large family of TFs. In a broader perspective, the generated data will help to better understand how TFs interact with their target DNA sequences.
Background: Recent work suggests that gene duplications may play an important role in the evolution of immunity genes. Passerine birds, and in particular Sylvioidea warblers, have highly duplicated major histocompatibility complex (MHC) genes, which are key in immunity, compared to other vertebrates. However, reasons for this high MHC gene copy number are yet unclear. High-throughput sequencing (HTS) allows MHC genotyping even in individuals with extremely duplicated genes. This HTS data can reveal evidence of selection, which may help to unravel the putative functions of different gene copies, i.e. neofunctionalization. We performed exhaustive genotyping of MHC class I in a Sylvioidea warbler, the sedge warbler, Acrocephalus schoenobaenus, using the Illumina MiSeq technique on individuals from a wild study population. Results: The MHC diversity in 863 genotyped individuals by far exceeds that of any other bird species described to date. A single individual could carry up to 65 different alleles, a large proportion of which are expressed (transcribed). The MHC alleles were of three different lengths differing in evidence of selection, diversity and divergence within our study population. Alleles without any deletions and alleles containing a 6 bp deletion showed characteristics of classical MHC genes, with evidence of multiple sites subject to positive selection and high sequence divergence. In contrast, alleles containing a 3 bp deletion had no sites subject to positive selection and had low divergence. Conclusions: Our results suggest that sedge warbler MHC alleles that either have no deletion, or contain a 6 bp deletion, encode classical antigen presenting MHC molecules. In contrast, MHC alleles containing a 3 bp deletion may encode molecules with a different function. This study demonstrates that highly duplicated MHC genes can be characterised with HTS and that selection patterns can be useful for revealing neofunctionalization. Importantly, our results highlight the need to consider the putative function of different MHC genes in future studies of MHC in relation to disease resistance and fitness. Electronic supplementary material: The online version of this article (doi:10.1186/s12862-017-0997-9) contains supplementary material, which is available to authorized users.
Major histocompatibility complex (MHC) genes encode proteins that initiate adaptive immune responses through the presentation of foreign antigens to T cells. The high polymorphism found at these genes, thought to be promoted and maintained by pathogen-mediated selection, contrasts with the limited number of MHC loci found in most vertebrates. Although expressing many diverse MHC genes should broaden the range of detectable pathogens, it has been hypothesized to also cause deletion of larger fractions of self-reactive T cells, leading to a detrimental reduction of the T cell receptor (TCR) repertoire. However, a key prediction of this TCR depletion hypothesis, that the TCR repertoire should be inversely related to the individual MHC diversity, has never been tested. Here, using high-throughput sequencing and advanced sequencing error correction, we provide evidence of such an association in a rodent species with high interindividual variation in the number of expressed MHC molecules, the bank vole (Myodes glareolus). Higher individual diversity of MHC class I, but not class II, was associated with smaller TCR repertoires. Our results thus provide partial support for the TCR depletion model, while also highlighting the complex, potentially MHC class-specific mechanisms by which autoreactivity may trade off against evolutionary expansion of the MHC gene family.
Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
High salinity causes remarkable losses in rice productivity worldwide mainly because it inhibits growth and reduces grain yield. To cope with environmental changes, plants evolved several adaptive mechanisms, which involve the regulation of many stress-responsive genes. Among these, we have chosen OsRMC to study its transcriptional regulation in rice seedlings subjected to high salinity. Its transcription was highly induced by salt treatment and showed a stress-dose-dependent pattern. OsRMC encodes a receptor-like kinase described as a negative regulator of salt stress responses in rice. To investigate how OsRMC is regulated in response to high salinity, a salt-induced rice cDNA expression library was constructed and subsequently screened using the yeast one-hybrid system and the OsRMC promoter as bait. Thereby, two transcription factors (TFs), OsEREBP1 and OsEREBP2, belonging to the AP2/ERF family were identified. Both TFs were shown to bind to the same GCC-like DNA motif in the OsRMC promoter and to negatively regulate its gene expression. The identified TFs were characterized regarding their gene expression under different abiotic stress conditions. This study revealed that the OsEREBP1 transcript level is not significantly affected by salt, ABA or severe cold (5 °C) and is only slightly regulated by drought and moderate cold. On the other hand, the OsEREBP2 transcript level increased after cold, ABA, drought and high salinity treatments, indicating that OsEREBP2 may play a central role mediating the response to different abiotic stresses. Gene expression analysis in rice varieties with contrasting salt tolerance further suggests that OsEREBP2 is involved in salt stress response in rice.
Pathogens are one of the main forces driving the evolution and maintenance of the highly polymorphic genes of the vertebrate major histocompatibility complex (MHC). Although MHC proteins are crucial in pathogen recognition, it is still poorly understood how pathogen-mediated selection promotes and maintains MHC diversity, and especially so in host species with highly duplicated MHC genes. Sedge warblers (Acrocephalus schoenobaenus) have highly duplicated MHC genes, and using data from high-throughput MHC genotyping, we were able to investigate to what extent avian malaria parasites explain temporal MHC class I supertype fluctuations in a long-term study population. We investigated infection status and infection intensities of two different strains of Haemoproteus, that is avian malaria parasites that are known to have significant fitness consequences in sedge warblers. We found that prevalence of avian malaria in carriers of specific MHC class I supertypes was a significant predictor of their frequency changes between years. This finding suggests that avian malaria infections partly drive the temporal fluctuations of the MHC class I supertypes. Furthermore, we found that individuals with a large number of different supertypes had higher resistance to avian malaria, but there was no evidence for an optimal MHC class I diversity. Thus, the two studied malaria parasite strains appear to select for a high MHC class I supertype diversity. Such selection may explain the maintenance of the extremely high number of MHC class I gene copies in sedge warblers and possibly also in other passerines where avian malaria is a common disease.