Whole-genome duplications have occurred in the recent ancestors of many plants, fish, and amphibians, resulting in a pervasiveness of paralogous loci and the potential for both disomic and tetrasomic inheritance in the same genome. Paralogs can be difficult to reliably genotype and are often excluded from genotyping-by-sequencing (GBS) analyses; however, removal requires paralogs to be identified, which is difficult without a reference genome. We present a method for identifying paralogs in natural populations by combining two properties of duplicated loci: (i) the expected frequency of heterozygotes exceeds that for singleton loci, and (ii) within heterozygotes, observed read ratios for each allele in GBS data will deviate from the 1:1 expected for singleton (diploid) loci. These deviations are often not apparent within individuals, particularly when sequence coverage is low, but we postulated that summing allele reads for each locus over all heterozygous individuals in a population would provide sufficient power to detect deviations at those loci. We identified paralogous loci in three species: Chinook salmon (Oncorhynchus tshawytscha), which retains regions with ongoing residual tetrasomy on eight chromosome arms following a recent whole-genome duplication; mountain barberry (Berberis alpina), which has a large proportion of paralogs that arose through an unknown mechanism; and dusky parrotfish (Scarus niger), which has largely rediploidized following an ancient whole-genome duplication. Importantly, this approach requires only the genotype and allele-specific read counts for each individual, information that is readily obtained from most GBS analysis pipelines.
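The two properties described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Call` record, the `locus_summary` function, and the use of a normal approximation to the binomial for the read-ratio deviation are all assumptions made here for illustration. For one locus, it computes the proportion of heterozygotes among genotyped individuals, then sums allele-specific reads over all heterozygotes and measures how far the pooled ratio departs from the 1:1 expectation.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterable, Optional, Tuple


@dataclass
class Call:
    """One individual's call at one locus (hypothetical record layout)."""
    genotype: str   # "hom_a", "het", or "hom_b"
    reads_a: int    # reads supporting allele A
    reads_b: int    # reads supporting allele B


def locus_summary(calls: Iterable[Call]) -> Optional[Tuple[float, float, float]]:
    """Return (heterozygote proportion, pooled allele-A read ratio, z-score).

    The z-score is the deviation of the pooled allele-A read count from the
    1:1 expectation, using a normal approximation to the binomial (an
    assumption of this sketch). Returns None when there are no genotyped
    individuals, no heterozygotes, or no reads in heterozygotes.
    """
    calls = list(calls)
    hets = [c for c in calls if c.genotype == "het"]
    if not calls or not hets:
        return None

    het_proportion = len(hets) / len(calls)

    # Sum allele-specific reads over all heterozygous individuals.
    reads_a = sum(c.reads_a for c in hets)
    total = reads_a + sum(c.reads_b for c in hets)
    if total == 0:
        return None

    ratio = reads_a / total
    # Under a 1:1 expectation: mean = total / 2, variance = total * 0.5 * 0.5.
    z = (reads_a - 0.5 * total) / sqrt(total * 0.25)
    return het_proportion, ratio, z


# A singleton-like locus: half the individuals are heterozygous, reads balanced.
singleton = [
    Call("het", 10, 10), Call("het", 5, 5),
    Call("hom_a", 20, 0), Call("hom_b", 0, 18),
]
# A paralog-like locus: every individual is called heterozygous, reads skewed.
paralog_like = [Call("het", 15, 5) for _ in range(4)]

print(locus_summary(singleton))
print(locus_summary(paralog_like))
```

In this toy input the skew in any single individual (15 versus 5 reads) is modest, but pooling reads across heterozygotes gives a clear departure from 1:1, which is the rationale the abstract gives for summing over individuals. Any thresholds for flagging a locus would have to be chosen from the data and are not specified here.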
Comparisons between the genomes of salmon species reveal that they underwent extensive chromosomal rearrangements following whole genome duplication that occurred in their lineage 58−63 million years ago. Extant salmonids are diploid, but occasional pairing between homeologous chromosomes exists in males. The consequences of re-diploidization can be characterized by mapping the position of duplicated loci in such species. Linkage maps are also a valuable tool for genome-wide applications such as genome-wide association studies, quantitative trait loci mapping or genome scans. Here, we investigated chromosomal evolution in Chinook salmon (Oncorhynchus tshawytscha) after genome duplication by mapping 7146 restriction-site associated DNA loci in gynogenetic haploid, gynogenetic diploid, and diploid crosses. In the process, we developed a reference database of restriction-site associated DNA loci for Chinook salmon comprising 48528 non-duplicated loci and 6409 known duplicated loci, which will facilitate locus identification and data sharing. We created a very dense linkage map anchored to all 34 chromosomes for the species, and all arms were identified through centromere mapping. The map positions of 799 duplicated loci revealed that homeologous pairs have diverged at different rates following whole genome duplication, and that degree of differentiation along arms was variable. Many of the homeologous pairs with high numbers of duplicated markers appear conserved with other salmon species, suggesting that retention of conserved homeologous pairing in some arms preceded species divergence. As chromosome arms are highly conserved across species, the major resources developed for Chinook salmon in this study are also relevant for other related species.
Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure.
Because of their high variability, microsatellites are still considered the marker of choice for studies on parentage and kinship in wild populations. Nevertheless, single nucleotide polymorphisms (SNPs) are becoming increasingly popular in many areas of molecular ecology, owing to their high throughput, easy transferability between laboratories and low genotyping error. An ongoing discussion concerns the relative power of SNPs compared to microsatellites: that is, how many SNP loci are needed to replace a panel of microsatellites? Here, we evaluate the assignment power of 80 SNPs (H(E) = 0.30, 80 independent alleles) and 11 microsatellites (H(E) = 0.85, 192 independent alleles) in a wild population of about 400 sockeye salmon with two commonly used software packages (Cervus3, Colony2) and, for SNPs only, a newly developed software (SNPPIT). Assignment success was higher for SNPs than for microsatellites, especially for parent pairs, irrespective of the method used. Colony2 assigned a larger proportion of offspring to at least one parent than the other methods, although Cervus and SNPPIT detected more parent pairs. Identification of full-sib groups without parental information from relatedness measures was possible using both marker systems, although explicit reconstruction of such groups in Colony2 was impossible for SNPs because of computation time. Our results confirm the applicability of SNPs for parentage analyses and refute the predictability of assignment success from the number of independent alleles.
Single nucleotide polymorphisms (SNPs) are a class of genetic markers that are well suited to a broad range of research and management applications. Although advances in genotyping chemistries and analysis methods continue to increase the potential advantages of using SNPs to address molecular ecological questions, the scarcity of available DNA sequence data for most species has limited marker development. As the number and diversity of species being targeted for large-scale sequencing has increased, so has the potential for using sequence from sister taxa for marker development in species of interest. We evaluated the use of Oncorhynchus mykiss and Salmo salar sequence data to identify SNPs in three other species (Oncorhynchus tshawytscha, Oncorhynchus nerka and Oncorhynchus keta). Primers designed based on O. mykiss and S. salar alignments were more successful than primers designed based on Oncorhynchus-only alignments for sequencing target species, presumably due to the much larger number of potential targets available from the former alignments and possibly greater sequence conservation in those targets. In sequencing approximately 89 kb we observed a frequency of 4.30 × 10⁻³ SNPs per base pair. Approximately half (53/101) of the subsequently designed validation assays resulted in high-throughput SNP genotyping markers. We speculate that this relatively low conversion rate may reflect the duplicated nature of the salmon genome. Our results suggest that a large number of SNPs could be developed for Pacific salmon using sequence data from other species. While the costs of DNA sequencing are still significant, these must be compared to the costs of using other marker classes for a given application.
Studies of the oceanic and near-shore distributions of Pacific salmon, whose migrations typically span thousands of kilometres, have become increasingly valuable in the presence of climate change, increasing hatchery production and potentially high rates of bycatch in offshore fisheries. Genetic data offer considerable insights into both the migratory routes and the evolutionary histories of the species. However, these types of studies require extensive data sets from spawning populations originating from across the species' range. Single nucleotide polymorphisms (SNPs) have been particularly amenable for multinational applications because they are easily shared, require little interlaboratory standardization and can be assayed through increasingly efficient technologies. Here, we discuss the development of a data set for 114 populations of chum salmon through a collaboration among North American and Asian researchers, termed PacSNP. PacSNP is focused on developing the database and applying it to problems of international interest. A data set spanning the entire range of species provides a unique opportunity to examine patterns of variability, and we review issues associated with SNP development. We found evidence of ascertainment bias within the data set, variable linkage relationships between SNPs associated with ancestral groupings and outlier loci with alleles associated with latitude.
A whole genome duplication occurred in the ancestor of all salmonid fishes some 50-100 million years ago. Early inheritance studies with allozymes indicated that loci in the salmonid genome are inherited disomically in females. However, some pairs of duplicated loci showed patterns of inheritance in males indicating pairing and recombination between homeologous chromosomes. Nearly 20% of loci in the salmonid genome are duplicated and share the same alleles (isoloci), apparently due to homeologous recombination. Half-tetrad analysis revealed that isoloci tend to be telomeric. These results suggested that residual tetrasomic inheritance of isoloci results from homeologous recombination near chromosome ends and that continued disomic inheritance resulted from homologous pairing of centromeric regions. Many current genetic maps of salmonids are based on single nucleotide polymorphisms and microsatellites that are no longer duplicated. Therefore, long sections of chromosomes on these maps are poorly represented, especially telomeric regions. In addition, preferential multivalent pairing of homeologs from the same species in F1 hybrids results in an excess of nonparental gametes (so-called pseudolinkage). We consider how not including duplicated loci has affected our understanding of population and evolutionary genetics of salmonids, and we discuss how incorporating these loci will benefit our understanding of population genomics.