Here we present SMAP, a software package that implements a suite of computational tools to extract multi-allelic haplotypes using read-backed haplotyping. SMAP tools first perform accurate read processing and analyze read mapping distributions across sample sets. Then, two complementary modules can be invoked for haplotype calling. SMAP haplotype-sites combines known Single Nucleotide Polymorphisms (SNPs) and/or read mapping position polymorphisms (SMAPs) to reconstruct compressed, read-reference-encoded haplotype strings. In contrast, SMAP haplotype-window works independently of prior knowledge of polymorphisms: it groups reads by locus, defines a window enclosed between two custom border sequences, and retains the entire corresponding DNA sequence as the haplotype. Haplotype-window is, among many applications, especially useful for high-throughput CRISPR/Cas mutation screens. Either way, SMAP creates a single integrated haplotype call table across all loci and samples. SMAP haplotyping is extremely versatile and can be applied to highly multiplex amplicon sequencing (HiPlex), Shotgun (e.g. whole genome shotgun (WGS) sequencing, probe capture and RNA-Seq), or Genotyping-by-Sequencing (GBS) data, and to Illumina short reads as well as PacBio and MinION long reads. SMAP creates discrete genotype calls for individuals of any ploidy or quantitative haplotype frequency spectra for Pool-Seq data, and can scale from tens to thousands of loci and/or samples. SMAP, including the source code written in Python, is available at https://gitlab.com/truttink/smap, and a detailed user manual and guidelines for accurate read processing are available at https://ngs-smap.readthedocs.io/, under the GNU Affero General Public License v3.0.
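The window-based idea described above can be illustrated with a minimal sketch. This is not SMAP's implementation or API: the function names, the toy reads, and the border sequences are all hypothetical, borders are matched exactly, and the sketch assumes the retained haplotype excludes the borders themselves. It only shows the principle of keeping the full sequence enclosed between two border sequences and counting each distinct sequence as a haplotype.

```python
from collections import Counter


def extract_window(read, left_border, right_border):
    """Return the sequence between the two borders, or None if a border is missing.

    Hypothetical helper: exact string matching only, borders excluded.
    """
    start = read.find(left_border)
    if start == -1:
        return None
    start += len(left_border)
    end = read.find(right_border, start)
    if end == -1:
        return None
    return read[start:end]


def count_window_haplotypes(reads, left_border, right_border):
    """Count each distinct window sequence across the reads of one locus."""
    counts = Counter()
    for read in reads:
        haplotype = extract_window(read, left_border, right_border)
        if haplotype is not None:
            counts[haplotype] += 1
    return counts


# Toy reads for a single locus (invented for illustration).
reads = [
    "AACCGGTTACGTACGTTTGGCCAA",  # reference-like sequence
    "AACCGGTTACGTACGTTTGGCCAA",  # same haplotype again
    "AACCGGTTACGACGTTTGGCCAA",   # 1-bp deletion inside the window
    "TTTTTTTT",                  # no borders found, so the read is skipped
]
counts = count_window_haplotypes(reads, "CCGG", "GGCC")
for haplotype, n in counts.most_common():
    print(haplotype, n)
```

Because the whole window is retained, an insertion or deletion such as a CRISPR/Cas-induced edit shows up as a distinct haplotype string without any prior variant list.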
We have developed PotatoMASH (Potato Multi-Allele Scanning Haplotags), a novel low-cost, genome-scanning marker platform. We designed a panel of 339 multi-allelic regions placed at 1 Mb intervals throughout the euchromatic portion of the genome. These regions were assayed using a multiplex amplicon sequencing approach, which allows for genotyping hundreds of plants at a cost of 5 EUR/sample. We applied PotatoMASH to a population of over 700 potato lines. We obtained tetraploid dosage calls for 2012 short multi-allelic haplotypes in 334 loci, with 2 to 14 different haplotypes per locus. The system was able to diagnose the presence of targeted pest-resistance markers, to detect quantitative trait loci (QTLs) by genome-wide association studies (GWAS) in a tetraploid population, and to track variation in a diploid segregating population. PotatoMASH efficiently surveys genetic variation throughout the potato genome, and can be implemented as a single low-cost genotyping platform that will allow the routine and simultaneous application of marker-assisted selection (MAS) and other genotyping applications in commercial potato breeding programmes.
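To make the notion of a tetraploid dosage call concrete, the following is a minimal sketch of one way per-locus haplotype read counts could be turned into discrete dosages. It is an assumption for illustration only: the nearest-integer rule, the `min_reads` cutoff, the function name, and the toy counts are all hypothetical and are not the thresholds or procedure used in the study.

```python
def call_dosage(hap_counts, ploidy=4, min_reads=10):
    """Convert haplotype read counts at one locus into discrete dosage calls.

    Assumed rule (not taken from the source): dosage is the haplotype's read
    frequency multiplied by the ploidy, rounded to the nearest integer.
    Returns None when the locus has too few reads to call.
    """
    total = sum(hap_counts.values())
    if total < min_reads:
        return None
    return {hap: round(ploidy * n / total) for hap, n in hap_counts.items()}


# Toy read counts for three haplotypes at one locus (invented values).
locus_counts = {"hap_A": 52, "hap_B": 24, "hap_C": 24}
dosages = call_dosage(locus_counts)
print(dosages)
```

With simple rounding the dosages at a locus are not guaranteed to sum to the ploidy, so a real pipeline would need an explicit rule for such loci, for example setting them to missing.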
We have developed PotatoMASH (Potato Multi-Allele Scanning Haplotags), a novel low-cost, genome-scanning marker platform. We designed a panel of 339 multi-allelic regions placed at 1 Mb intervals throughout the euchromatic portion of the genome. These regions were assayed using a multiplex amplicon sequencing approach, which allows genotyping of hundreds of plants at a cost of €5/sample. This protocol describes the library construction part of the method. PotatoMASH libraries are designed to be sequenced on Illumina sequencing platforms.