Arabidopsis (Arabidopsis thaliana) accessions provide an excellent resource to dissect the molecular basis of adaptation. We have selected 192 Arabidopsis accessions collected to represent worldwide and local variation and analyzed two adaptively important traits, flowering time and vernalization response. There was huge variation in the flowering habit of the different accessions, with no simple relationship to latitude of collection site and considerable diversity occurring within local regions. We explored the contribution to this variation from the two genes FRIGIDA (FRI) and FLOWERING LOCUS C (FLC), previously shown to be important determinants in natural variation of flowering time. A correlation of FLC expression with flowering time and vernalization was observed, but it was not as strong as anticipated due to many late-flowering/vernalization-requiring accessions being associated with low FLC expression and early-flowering accessions with high FLC expression. Sequence analysis of FRI revealed which accessions were likely to carry functional alleles, and, from comparison of flowering time with allelic type, we estimate that approximately 70% of flowering time variation can be accounted for by allelic variation of FRI. The maintenance and propagation of 20 independent nonfunctional FRI haplotypes suggest that the loss-of-function mutations can confer a strong selective advantage. Accessions with a common FRI haplotype were, in some cases, associated with very different FLC levels and wide variation in flowering time, suggesting additional variation at FLC itself or other genes regulating FLC. These data reveal how useful these Arabidopsis accessions will be in dissecting the complex molecular variation that has led to the adaptive phenotypic variation in flowering time.
A potentially serious disadvantage of association mapping is the fact that marker-trait associations may arise from confounding population structure as well as from linkage to causative polymorphisms. Using genome-wide marker data, we have previously demonstrated that the problem can be severe in a global sample of 95 Arabidopsis thaliana accessions, and that established methods for controlling for population structure are generally insufficient. Here, we use the same sample together with a number of flowering-related phenotypes and data-perturbation simulations to evaluate a wider range of methods for controlling for population structure. We find that, in terms of reducing the false-positive rate while maintaining statistical power, a recently introduced mixed-model approach that takes genome-wide differences in relatedness into account via estimated pairwise kinship coefficients generally performs best. By combining the association results with results from linkage mapping in F2 crosses, we identify one previously known true positive and several promising new associations, but also demonstrate the existence of both false positives and false negatives. Our results illustrate the potential of genome-wide association scans as a tool for dissecting the genetics of natural variation, while at the same time highlighting the pitfalls. The importance of study design is clear; our study is severely under-powered both in terms of sample size and marker density. Our results also provide a striking demonstration of confounding by population structure. While statistical methods can be used to ameliorate this problem, they cannot always be effective and are certainly not a substitute for independent evidence, such as that obtained via crosses or transgenic experiments. Ultimately, association mapping is a powerful tool for identifying a list of candidates that is short enough to permit further genetic study.
There is currently tremendous interest in the possibility of using genome-wide association mapping to identify genes responsible for natural variation, particularly for human disease susceptibility. The model plant Arabidopsis thaliana is in many ways an ideal candidate for such studies, because it is a highly selfing hermaphrodite. As a result, the species largely exists as a collection of naturally occurring inbred lines, or accessions, which can be genotyped once and phenotyped repeatedly. Furthermore, linkage disequilibrium in such a species will be much more extensive than in a comparable outcrossing species. We tested the feasibility of genome-wide association mapping in A. thaliana by searching for associations with flowering time and pathogen resistance in a sample of 95 accessions for which genome-wide polymorphism data were available. In spite of an extremely high rate of false positives due to population structure, we were able to identify known major genes for all phenotypes tested, thus demonstrating the potential of genome-wide association mapping in A. thaliana and other species with similar patterns of variation. The rate of false positives differed strongly between traits, with more clinal traits showing the highest rate. However, the false positive rates were always substantial regardless of the trait, highlighting the necessity of an appropriate genomic control in association studies.
We report the sequence of 41 primer pairs of microsatellites from a CT-enriched genomic library of the peach cultivar 'Merrill O'Henry'. Ten microsatellite-containing clones had sequences similar to plant coding sequences in databases and could be used as markers for known functions. For microsatellites segregating in at least one of the two Prunus F(2) progenies analyzed, it was possible to demonstrate Mendelian inheritance. Microsatellite polymorphism was evaluated in 27 peach and 21 sweet cherry cultivars. All primer pairs gave PCR-amplification products in peach and 33 in cherry (80.5%). Six PCR-amplifications revealed several loci (14.6%) in peach and eight (19.5%) in sweet cherry. Among the 33 single-locus microsatellites amplified in peach and sweet cherry, 13 revealed polymorphism in both peach and cherry, 19 were polymorphic only in peach and one was polymorphic only in cherry. The number of alleles per locus ranged from 1 to 9 in peach and from 1 to 6 in sweet cherry, with an average of 4.2 and 2.8 in peach and sweet cherry, respectively. Cross-species amplification was tested within the Prunus species: Prunus avium L. (sweet cherry and mazzard), Prunus cerasus L. (sour cherry), Prunus domestica L. (European plum), Prunus amygdalus Batsch. (almond), Prunus armeniaca L. (apricot), Prunus cerasifera Ehrh. (Myrobalan plum). Plants from other genera of the Rosaceae were also tested: Malus (apple) and Fragaria (strawberry), as well as species not belonging to the Rosaceae: Castanea (chestnut tree), Juglans (walnut tree) and Vitis (grapevine). Six microsatellites gave amplification on all the tested species. Among them, one had an amplified region homologous to sequences encoding a MADS-box protein in Malus x domestica. Twelve microsatellites (29.3%) were amplified in all the Rosaceae species tested and 31 (75.6%) were amplified in all the six Prunus species tested. Thirty-three (80.5%), 18 (43.9%) and 13 (31.7%) gave amplification on chestnut tree, grapevine and walnut tree, respectively.
Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species.
A set of 109 microsatellite primer pairs recently developed for peach and cherry have been studied in the almond x peach F(2) progeny previously used to construct a saturated Prunus map containing mainly restriction fragment length polymorphism markers. All but one gave amplification products, and 87 (80%) segregated in the progeny and detected 96 loci. The resulting Prunus map contains a total of 342 markers covering a total distance of 522 cM. The approximate position of nine additional simple sequence repeats (SSRs) was established by comparison with other almond and peach maps. SSRs were placed in all the eight linkage groups of this map, and their distribution was relatively even, providing a genome-wide coverage with an average density of 5.4 cM/SSR. Twenty-four single-locus SSRs, highly polymorphic in peach, and each falling within 24 evenly spaced approximately 25-cM regions covering the whole Prunus genome, are proposed as a 'genotyping set' useful as a reference for fingerprinting, pedigree and genetic analysis of this species.
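The density figure quoted in the abstract above can be checked with a short calculation. This is only a sketch: it assumes the reported density is the total map length (522 cM) divided by the number of SSR loci detected (96), which the abstract does not state explicitly, and the function name `average_spacing_cm` is hypothetical.

```python
def average_spacing_cm(map_length_cm: float, n_loci: int) -> float:
    """Average map distance per locus, in cM (assumed definition of density)."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return map_length_cm / n_loci


# Figures taken from the abstract: 522 cM total map length, 96 SSR loci.
spacing = average_spacing_cm(522, 96)
print(f"{spacing:.1f} cM/SSR")
```

Under that assumption the result rounds to the 5.4 cM/SSR reported.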
The detection of footprints of natural selection in genetic polymorphism data is fundamental to understanding the genetic basis of adaptation, and has important implications for human health. The standard approach has been to reject neutrality in favor of selection if the pattern of variation at a candidate locus was significantly different from the predictions of the standard neutral model. The problem is that the standard neutral model assumes more than just neutrality, and it is almost always possible to explain the data using an alternative neutral model with more complex demography. Today's wealth of genomic polymorphism data, however, makes it possible to dispense with models altogether by simply comparing the pattern observed at a candidate locus to the genomic pattern, and rejecting neutrality if the pattern is extreme. Here, we utilize this approach on a truly genomic scale, comparing a candidate locus to thousands of alleles throughout the Arabidopsis thaliana genome. We demonstrate that selection has acted to increase the frequency of early-flowering alleles at the vernalization requirement locus FRIGIDA. Selection seems to have occurred during the last several thousand years, possibly in response to the spread of agriculture. We introduce a novel test statistic based on haplotype sharing that embraces the problem of population structure, and so should be widely applicable.
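The genome-wide comparison described in the abstract above can be read as an empirical percentile test: a statistic computed at the candidate locus is ranked against the same statistic computed at many other loci. The sketch below illustrates only that general idea. It assumes a one-sided test on a single scalar statistic, the function name `empirical_p_value` is hypothetical, and it is not the haplotype-sharing statistic introduced by the authors.

```python
from typing import Iterable


def empirical_p_value(candidate_stat: float, genomic_stats: Iterable[float]) -> float:
    """Share of genome-wide values at least as large as the candidate's value.

    An add-one correction counts the candidate itself, so the estimate
    is never exactly zero.
    """
    values = list(genomic_stats)
    if not values:
        raise ValueError("genomic_stats must not be empty")
    n_extreme = sum(1 for v in values if v >= candidate_stat)
    return (n_extreme + 1) / (len(values) + 1)


# Toy background values (illustrative only, not real data).
background = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.2, 0.6, 0.1]
p = empirical_p_value(0.8, background)
print(f"empirical p = {p:.2f}")
```

Because the reference distribution comes from the genome itself, no explicit demographic model is needed, but the smallest attainable value is limited by the number of background loci.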