For the adoption of massively parallel sequencing (MPS) systems by forensic laboratories, validation studies on specific workflows are needed to support the feasibility of implementation and the reliability of the data they produce. As such, the whole mitochondrial genome sequencing methodology—Precision ID mtDNA Whole Genome Panel, Ion Chef, Ion S5, and Converge—has been subjected to a variety of developmental validation studies. These validation studies were completed in accordance with the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines and assessed reproducibility, repeatability, accuracy, sensitivity, specificity to human DNA, and ability to analyze challenging (e.g., mixed, degraded, or low quantity) samples. Intra- and inter-run replicates produced an average maximum pairwise difference in variant frequency of 1.2%. Concordance with data generated with traditional Sanger sequencing and an orthogonal MPS platform methodology was used to assess accuracy, and generation of complete and concordant haplotypes at DNA input levels as low as 37.5 pg of nuclear DNA or 187.5 mitochondrial genome copies illustrated the sensitivity of the system. Overall, data presented herein demonstrate that highly accurate and reproducible results were generated for a variety of sample qualities and quantities, supporting the reliability of this specific whole genome mitochondrial DNA MPS system for analysis of forensic biological evidence.
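The reproducibility figure above (an average maximum pairwise difference in variant frequency) can be illustrated with a minimal sketch. The abstract does not say how the metric was calculated, so this is one plausible reading: for each variant, take the largest absolute difference in frequency between any two replicates, then average across variants. The function names and the replicate data are hypothetical.

```python
from itertools import combinations


def max_pairwise_difference(freqs):
    """Largest absolute difference between any two replicate frequencies."""
    return max(abs(a - b) for a, b in combinations(freqs, 2))


def average_max_pairwise_difference(variant_freqs):
    """Average the per-variant maximum pairwise difference.

    variant_freqs maps a variant label to its observed frequency
    in each replicate (at least two replicates per variant).
    """
    diffs = [max_pairwise_difference(f) for f in variant_freqs.values()]
    return sum(diffs) / len(diffs)


# Hypothetical replicate data, not taken from the study.
replicates = {
    "73G": [0.99, 1.00, 0.98],
    "263G": [1.00, 1.00, 1.00],
    "16519C": [0.50, 0.52, 0.49],
}

result = average_max_pairwise_difference(replicates)
print(f"{result:.2%}")
```

The maximum pairwise difference for a variant equals the gap between its highest and lowest replicate frequency; the pairwise form is kept here only to mirror the wording of the abstract.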
Motivation
Assays in mitochondrial genomics rely on accurate read mapping and variant calling. However, there are known and unknown nuclear paralogs of mitochondrial sequence (Numts) whose genetic properties differ fundamentally from those of the mitochondrial genome. Such paralogs complicate the interpretation of mitochondrial genome data and confound variant calling.
Results
RtN! was developed to categorize reads from massively parallel sequencing data. Rather than relying on the expected properties and sequence identities of paralogous Numts, it uses sequence similarity to a large database of publicly available mitochondrial genomes. RtN! removes low-level sequencing noise and mitochondrial paralogs without impacting variant calling, whereas competing methods were shown to remove true variants from mitochondrial mixtures.
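The idea of categorizing reads by similarity to known mitochondrial genomes, rather than by a catalogue of Numts, can be sketched with a toy k-mer comparison. This is not RtN!'s algorithm, which the abstract does not describe in detail. The function names, the k-mer approach, the `min_fraction` threshold and the sequences are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(genomes, k):
    """Pool the k-mers of every genome in a (toy) mitochondrial database."""
    index = set()
    for genome in genomes:
        index |= kmers(genome, k)
    return index


def classify_read(read, index, k, min_fraction=0.9):
    """Label a read by the fraction of its k-mers found in the database.

    min_fraction is an arbitrary illustrative threshold.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return "unclassified"  # read is shorter than k
    shared = len(read_kmers & index) / len(read_kmers)
    return "mito-like" if shared >= min_fraction else "candidate-numt-or-noise"


# Hypothetical sequences standing in for a database of mitochondrial genomes.
database = ["ACGTACGTTAGC", "ACGTACGATAGC"]
index = build_index(database, k=4)

print(classify_read("ACGTACGT", index, k=4))
print(classify_read("ACGTTTTT", index, k=4))
```

Because the database defines what counts as mitochondrial, a read is kept when it resembles any observed mitochondrial haplotype, and no prior knowledge of specific Numt sequences is needed. A real implementation would also have to handle sequencing errors, the circular genome and much longer references.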
Availability
https://github.com/Ahhgust/RtN
Supplementary information
Supplementary data are available at Bioinformatics online.