We resequenced 876 short fragments in a sample of 96 individuals of Arabidopsis thaliana that included stock center accessions as well as a hierarchical sample from natural populations. Although A. thaliana is a selfing weed, the pattern of polymorphism in general agrees with what is expected for a widely distributed, sexually reproducing species. Linkage disequilibrium decays rapidly, within 50 kb. Variation is shared worldwide, although population structure and isolation by distance are evident. The data fail to fit standard neutral models in several ways. There is a genome-wide excess of rare alleles, at least partially due to selection. There is too much variation between genomic regions in the level of polymorphism. The local level of polymorphism is negatively correlated with gene density and positively correlated with segmental duplications. Because the data do not fit theoretical null distributions, attempts to infer natural selection from polymorphism data will require genome-wide surveys of polymorphism in order to identify anomalous regions. Despite this, our data support the utility of A. thaliana as a model for evolutionary functional genomics.
Akin to a 'Trojan horse,' APOBEC3G DNA deaminase is encapsulated by the HIV virion. APOBEC3G facilitates restriction of HIV-1 infection in T cells by deaminating cytosines in nascent minus-strand complementary DNA. Here, we investigate the biochemical basis for C → U targeting. We observe that APOBEC3G binds randomly to single-stranded DNA, then jumps and slides processively to deaminate target motifs. When confronting partially double-stranded DNA, to which APOBEC3G cannot bind, sliding is lost but jumping is retained. APOBEC3G shows catalytic orientational specificity such that deamination occurs predominantly 3' → 5' without requiring hydrolysis of a nucleotide cofactor. Our data suggest that the G → A mutational gradient generated in viral genomic DNA in vivo could result from an intrinsic processive directional attack by APOBEC3G on single-stranded cDNA.
The frequency of the most common sporadic Apert syndrome mutation (C755G) in the human fibroblast growth factor receptor 2 gene (FGFR2) is 100–1,000 times higher than expected from average nucleotide substitution rates based on evolutionary studies and the incidence of human genetic diseases. To determine whether this increased frequency was due to the nucleotide site having the properties of a mutation hot spot, or to some other explanation, we developed a new experimental approach. We examined the spatial distribution of the frequency of the C755G mutation in the germline by dividing four testes from two normal individuals each into several hundred pieces and measuring the mutation frequency of each piece with a highly sensitive PCR assay. We discovered that each testis was characterized by rare foci with mutation frequencies 10³ to >10⁴ times higher than the rest of the testis regions. A model based on what is known about human germline development forced us to reject (p < 10⁻⁶) the idea that the C755G mutation arises more frequently because this nucleotide simply has a higher than average mutation rate (hot spot model). This is true regardless of whether mutation is dependent on or independent of cell division. An alternate model was examined in which positive selection acts on adult self-renewing Ap spermatogonial cells (SrAp) carrying this mutation such that, instead of only replacing themselves, they occasionally produce two SrAp cells. This model could not be rejected given our observed data. Unlike the disease site, similar analysis of C-to-G mutations at a control nucleotide site in one testis pair failed to find any foci with high mutation frequencies.
The rejection of the hot spot model and lack of rejection of a selection model for the C755G mutation, along with other data, provide strong support for the proposal that positive selection in the testis can act to increase the frequency of premeiotic germ cells carrying a mutation deleterious to an offspring, thereby unfavorably altering the mutational load in humans. Studying the anatomical distribution of germline mutations can provide new insights into genetic disease and evolutionary change.
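The difference between the two models can be made concrete with a toy branching calculation. This is only a sketch and not the authors' model, which builds in details of human germline development. It assumes that at each division a mutant SrAp cell is replaced by two SrAp cells with a hypothetical probability `p_symmetric` and by one otherwise; the parameter names and values are illustrative assumptions, not figures from the study.

```python
import random


def expected_clone_size(p_symmetric: float, divisions: int) -> float:
    """Expected number of mutant SrAp cells descended from one mutant cell.

    Assumption: at each division every cell independently yields two SrAp
    cells with probability p_symmetric and one SrAp cell otherwise, so the
    mean clone size grows as (1 + p_symmetric) ** divisions.
    """
    return (1.0 + p_symmetric) ** divisions


def simulate_clone(p_symmetric: float, divisions: int, rng: random.Random) -> int:
    """Simulate one clone under the same assumption and return its final size."""
    cells = 1
    for _ in range(divisions):
        # Count the cells that divide symmetrically in this round.
        extra = sum(1 for _ in range(cells) if rng.random() < p_symmetric)
        cells += extra
    return cells


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    # Hypothetical values: no advantage versus a 1% chance of symmetric division.
    for p in (0.0, 0.01):
        print(p, expected_clone_size(p, 500), simulate_clone(p, 500, rng))
```

With `p_symmetric = 0` a mutant adult lineage never grows beyond one cell, whereas even a small positive value gives geometric growth, which is the kind of local expansion that could produce rare, highly enriched foci.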
The synthesis of high affinity antibodies requires activation-induced cytidine deaminase (AID) to initiate somatic hypermutation and class-switch recombination. Here we investigate AID-catalyzed deamination of C → U on single-stranded DNA and on actively transcribed closed circular double-stranded DNA. Mutations are initially favored at canonical WRC (W = A or T, R = A or G) somatic hypermutation hot spot motifs, but over time mutations at neighboring non-hot spot sites increase, creating random clusters of mutated regions in a seemingly processive manner. N-terminal AID mutants R35E and R35E/R36D appear less processive and have altered mutational specificity compared with wild type AID. In contrast, a C-terminal deletion mutant defective in CSR in vivo closely resembles wild type AID. A mutational spectrum generated during transcription of closed circular double-stranded DNA indicates that wild type AID retains its specificity for WRC hot spot motifs within the confines of a moving transcription bubble while introducing clusters of multiple deaminations predominantly on the nontranscribed strand.
AID is required for the secondary Ig gene diversification processes, somatic hypermutation (SHM) and class-switch recombination (CSR) (1). SHM and CSR are abolished in mice and humans (HIGM-2 patients) deficient for AID (1, 2). SHM entails the generation of point mutations in the V region of Ig genes at a rate about a million-fold higher than in the rest of the genome, whereas CSR performs recombinational events with downstream sequences that delete the intervening constant regions (3). The two processes help to generate high affinity antibodies with an optimized fit to an antigen (V gene SHM) along with specialized Ig isotypes (CSR), enabling a more efficient clearing of systemic infections, and both require active transcription (4-6).
Although AID is synthesized in B cells selectively under tight regulation, the ectopic expression of AID is sufficient to induce SHM and CSR in non-B cell lines and can induce C·G → T·A mutations in uracil glycosylase-deficient Escherichia coli and Saccharomyces cerevisiae (7-11). Based on its similarity in sequence to the apoB mRNA processing enzyme Apobec-1, AID was initially thought to use RNA as its substrate (3, 12). However, biochemical assays using partially purified AID (13-16) and in vivo studies with overexpressed AID (10, 17) offer convincing documentation that ssDNA is a substrate for AID. AID appears to be inactive on RNA, dsDNA, and RNA-DNA hybrid molecules in vitro (13), but it cannot be ruled out that AID might be active on RNA in vivo (18). AID simulates hallmark properties of SHM when acting alone on naked DNA in vitro (14, 19), especially by targeting canonical WRC (20) V gene mutational hot spots (W = A or T; R = A or G) preferentially, while avoiding SYC cold spots (S = G or C; Y = C or T) (14). AID-catalyzed C deaminations in vitro exhibit broad clonal mutagenic heterogeneity (14) reminiscent of Ig V gene mutational distributions (21, 22). Transcription could enable AID t...
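The WRC hot spot and SYC cold spot definitions given above translate directly into a sequence scan. The following is a minimal sketch, assuming a plain DNA string for one strand; the function name and the example sequence are illustrative and do not come from the paper.

```python
import re

# Motif definitions from the text: W = A or T, R = A or G, S = G or C, Y = C or T.
# A lookahead is used so that overlapping motifs are all reported.
HOT_SPOT = re.compile(r"(?=([AT][AG]C))")   # WRC
COLD_SPOT = re.compile(r"(?=([GC][CT]C))")  # SYC


def target_cytosines(seq: str, pattern: re.Pattern) -> list[int]:
    """Return 0-based positions of the motif's final C on the given strand.

    Only the strand passed in is scanned; motifs on the complementary
    strand would need a separate call on the reverse complement.
    """
    seq = seq.upper()
    return [m.start() + 2 for m in pattern.finditer(seq)]


if __name__ == "__main__":
    example = "AGCTACGCC"  # made-up sequence for illustration
    print("WRC cytosines:", target_cytosines(example, HOT_SPOT))
    print("SYC cytosines:", target_cytosines(example, COLD_SPOT))
```

For the made-up sequence `AGCTACGCC`, the scan reports the cytosines of AGC and TAC as hot spot targets and the last cytosine of GCC as a cold spot.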
Recent studies have shown that the human genome has a haplotype block structure, such that it can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of single-nucleotide polymorphisms (SNPs), referred to as "tag SNPs," can be used to distinguish a large fraction of the haplotypes. These tag SNPs can potentially be extremely useful for association studies, in that it may not be necessary to genotype all SNPs; however, this depends on how much power is lost. Here we develop a simulation study to quantitatively assess the power loss for a variety of study designs, including case-control designs and case-parental control designs. First, a number of data sets containing case-parental or case-control samples are generated on the basis of a disease model. Second, a small fraction of case and control individuals in each data set are genotyped at all the loci, and a dynamic programming algorithm is used to determine the haplotype blocks and the tag SNPs based on the genotypes of the sampled individuals. Third, the statistical power of tests is evaluated on the basis of three kinds of data: (1) all of the SNPs and the corresponding haplotypes, (2) the tag SNPs and the corresponding haplotypes, and (3) the same number of randomly chosen SNPs as the number of tag SNPs and the corresponding haplotypes. We study the power of different association tests with a variety of disease models and block-partitioning criteria. Our study indicates that the genotyping efforts can be significantly reduced by the tag SNPs, without much loss of power. Depending on the specific haplotype block-partitioning algorithm and the disease model, when the identified tag SNPs are only 25% of all the SNPs, the power is reduced by only 4%, on average, compared with a power loss of approximately 12% when the same number of randomly chosen SNPs is used in a two-locus haplotype analysis.
When the identified tag SNPs are approximately 14% of all the SNPs, the power is reduced by approximately 9%, compared with a power loss of approximately 21% when the same number of randomly chosen SNPs is used in a two-locus haplotype analysis. Our study also indicates that haplotype-based analysis can be much more powerful than marker-by-marker analysis.
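The idea of tag SNPs within a single block can be illustrated with a small exhaustive search. This sketch is not the dynamic programming algorithm used in the study, which also partitions the genome into blocks and works from genotypes; it assumes phased haplotypes given as equal-length strings of '0' and '1', and the function name and example block are hypothetical. The search is exponential in the number of SNPs, so it suits only small blocks.

```python
from itertools import combinations


def tag_snps(haplotypes: list[str]) -> list[int]:
    """Return a smallest set of SNP indices that distinguishes all distinct haplotypes.

    Subsets are tried in order of increasing size, so the first subset
    that keeps every distinct haplotype distinguishable is a minimum one.
    """
    if not haplotypes:
        return []
    distinct = sorted(set(haplotypes))
    n_sites = len(distinct[0])
    for k in range(n_sites + 1):
        for cols in combinations(range(n_sites), k):
            # Project each haplotype onto the candidate SNP subset.
            projected = {tuple(h[c] for c in cols) for h in distinct}
            if len(projected) == len(distinct):
                return list(cols)
    return list(range(n_sites))  # not reached for equal-length haplotypes; kept as a fallback


if __name__ == "__main__":
    # Made-up block: five SNPs, four distinct haplotypes.
    block = ["00000", "00111", "11011", "11100", "00000"]
    print(tag_snps(block))
```

In the made-up block, two of the five SNPs are enough to tell the four haplotypes apart, which is the sense in which genotyping only the tag SNPs can save effort.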