Advances in next-generation sequencing and the development of new statistical and computational methods have opened up possibilities for large-scale, high-quality genotyping in most organisms. Conifer genomes are large and contain a high fraction of repetitive elements, and this complex genome structure has implications for approaches that use next-generation sequencing for genotyping. In this chapter we provide a detailed description of a workflow for variant calling using next-generation sequencing in Norway spruce (Picea abies). The workflow starts with raw sequencing reads and proceeds through read mapping to variant calling and variant filtering. We illustrate the pipeline using data from both whole-genome resequencing and reduced-representation sequencing. We highlight possible problems and pitfalls of using next-generation sequencing data for genotyping that stem from the complex genome structure of conifers, and describe how those issues can be mitigated or eliminated.
Conifer genomes are characterized by their large size and high abundance of repetitive material, making large-scale genotyping in conifers complicated and expensive. One consequence is that it has been difficult to generate data on genome-wide levels of genetic variation. To date, researchers have mainly employed various complexity reduction techniques to assess genetic variation across the genome in different conifer species. These methods tend to capture variation in a relatively small subset of a typical conifer genome, and it is currently not clear how representative such results are. Here we take advantage of data generated in the first large-scale re-sequencing effort in Norway spruce (Picea abies (L.) H. Karst.) and assess how well two commonly used complexity reduction methods, targeted capture probes and genotyping by sequencing (GBS), perform in capturing genome-wide variation. Our results suggest that both methods perform reasonably well for assessing genetic diversity and population structure in Norway spruce. Targeted capture probes were slightly more effective than GBS, likely because they target known genomic regions, whereas the GBS data contain a substantially greater fraction of repetitive regions, which can sometimes be problematic for assessing genetic diversity. In conclusion, both methods are useful for genotyping large numbers of samples, and they greatly reduce the cost of genotyping a species with a genome as complex as that of Norway spruce.