The soybean is agro-economically the most important among all cultivated legume crops, and its seed color is considered one of the most attractive factors in the selection-by-breeders. Thus, genome-wide identification of genes and loci associated with seed colors is critical for the precision breeding of crop soybeans. To dissect seed pigmentation-associated genomic loci and genes, we employed dual approaches by combining reference-based genome-wide association study (rbGWAS) and k-mer-based reference-free GWAS (rfGWAS) with 438 Glycine accessions. The dual analytical strategy allowed us to identify four major genomic loci (designated as SP1-SP4 in this study) associated with the seed colors of soybeans. The k-mer analysis enabled us to find an important recombination event that occurred between subtilisin and I-cluster B in the soybean genome, which could describe a special structural feature of ii allele within the I locus (SP3). Importantly, mapping analyses of both mRNAs and small RNAs allowed us to reveal that the subtilisin-CHS1/CHS3 chimeric transcripts generate and act as an initiator towards ‘mirtron (i.e., intron-harboring miRNA precursor)’-triggered silencing of chalcone synthase (CHS) genes. Consequently, the results led us to propose a working model of ‘mirtron-triggered gene silencing (MTGS)’ to elucidate a long-standing puzzle in the genome-wide CHS gene silencing mechanism. In summary, our study reports four major genomic loci, lists of key genes and genome-wide variations that are associated with seed pigmentation in soybeans. In addition, we propose that the MTGS mechanism plays a crucial role in the genome-wide silencing of CHS genes, thereby suggesting a clue to currently predominant soybean cultivars with the yellow seed coat. Finally, this study will provide a broad insight into the interactions and correlations among seed color-associated genes and loci within the context of anthocyanin biosynthetic pathways.
BackgroundGenetic markers are tools that can facilitate molecular breeding, even in species lacking genomic resources. An important class of genetic markers is those based on orthologous genes, because they can guide hypotheses about conserved gene function, a situation that is well documented for a number of agronomic traits. For under-studied species a key bottleneck in gene-based marker development is the need to develop molecular tools (e.g., oligonucleotide primers) that reliably access genes with orthology to the genomes of well-characterized reference species.ResultsHere we report an efficient platform for the design of cross-species gene-derived markers in legumes. The automated platform, named CSGM Designer (URL: http://tgil.donga.ac.kr/CSGMdesigner), facilitates rapid and systematic design of cross-species genic markers. The underlying database is composed of genome data from five legume species whose genomes are substantially characterized. Use of CSGM is enhanced by graphical displays of query results, which we describe as “circular viewer” and “search-within-results” functions. CSGM provides a virtual PCR representation (eHT-PCR) that predicts the specificity of each primer pair simultaneously in multiple genomes. CSGM Designer output was experimentally validated for the amplification of orthologous genes using 16 genotypes representing 12 crop and model legume species, distributed among the galegoid and phaseoloid clades. Successful cross-species amplification was obtained for 85.3% of PCR primer combinations.ConclusionCSGM Designer spans the divide between well-characterized crop and model legume species and their less well-characterized relatives. The outcome is PCR primers that target highly conserved genes for polymorphism discovery, enabling functional inferences and ultimately facilitating trait-associated molecular breeding.Electronic supplementary materialThe online version of this article (doi:10.1186/s13007-015-0074-6) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.