Tests for incongruence as an indicator of among-data partition conflict have played an important role in conditional data combination. When such tests reveal significant incongruence, this has been interpreted as a rationale for not combining data into a single phylogenetic analysis. In this study of lorisiform phylogeny, we use the incongruence length difference (ILD) test to assess conflict among three independent data sets. A large morphological data set and two unlinked molecular data sets--the mitochondrial cytochrome b gene and the nuclear interphotoreceptor retinoid binding protein (exon 1)--are analyzed with various optimality criteria and weighting mechanisms to determine the phylogenetic relationships among slow lorises (Primates, Loridae). When analyzed separately, the morphological data show impressive statistical support for a monophyletic Loridae. Both molecular data sets resolve the Loridae as paraphyletic, though with different branching orders depending on the optimality criterion or character weighting used. When the three data partitions are analyzed in various combinations, an inverse relationship between congruence and phylogenetic accuracy is observed. Nearly all combined analyses that recover monophyly indicate strong data partition incongruence (P = 0.00005 in the most extreme case), whereas all analyses that recover paraphyly indicate lack of significant incongruence. Numerous lines of evidence verify that monophyly is the accurate phylogenetic result. Therefore, this study contributes to a growing body of information affirming that measures of incongruence should not be used as indicators of data set combinability.
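The ILD statistic discussed in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (length of the most parsimonious tree for the combined data minus the summed lengths for each partition analyzed alone) and a permutation null built from random repartitions of the same sizes. To stay self-contained it is restricted to four taxa, where the three possible unrooted trees can be scored exhaustively; the function names, the toy characters, and the P-value convention (hits + 1) / (replicates + 1) are assumptions for illustration, not the analysis performed in the study.

```python
import random

# The three unrooted topologies for four taxa, written as pairs of
# indices into a character column (e.g. "0011" holds the states of taxa 0..3).
QUARTETS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def fitch_cost(column, topology):
    """Parsimony steps for one character on one quartet topology (Fitch)."""
    cost = 0
    cherries = []
    for i, j in topology:
        a, b = {column[i]}, {column[j]}
        if a & b:
            cherries.append(a & b)
        else:
            cherries.append(a | b)
            cost += 1
    # Join the two cherries across the internal branch.
    if not (cherries[0] & cherries[1]):
        cost += 1
    return cost


def min_length(columns):
    """Length of the most parsimonious quartet tree for a list of characters."""
    return min(sum(fitch_cost(c, t) for c in columns) for t in QUARTETS)


def ild(part_a, part_b):
    """Incongruence length difference between two partitions."""
    return min_length(part_a + part_b) - min_length(part_a) - min_length(part_b)


def ild_test(part_a, part_b, replicates=999, seed=0):
    """Return (observed ILD, permutation P-value) under random repartitioning."""
    rng = random.Random(seed)
    observed = ild(part_a, part_b)
    pooled = part_a + part_b
    hits = 0
    for _ in range(replicates):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        # Keep the original partition sizes in every replicate.
        if ild(shuffled[:len(part_a)], shuffled[len(part_a):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (replicates + 1)


# Hypothetical toy partitions that support different groupings of the four taxa.
conflict_a = ["0011"] * 5
conflict_b = ["0101"] * 5
observed, p_value = ild_test(conflict_a, conflict_b)
```

Under these assumptions, two partitions that favor different trees give a positive ILD and a small P-value, while two identical partitions give an ILD of zero. Real analyses rely on heuristic tree searches over many taxa rather than the exhaustive quartet scoring used here.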
Phylogenetic analysis of mtDNA sequence data confirms the observation that species diversity in the world's smallest living primate (genus Microcebus) has been greatly underestimated. The description of three species new to science, and the resurrection of two others from synonymy, has been justified on morphological grounds and is supported by evidence of reproductive isolation in sympatry. This taxonomic revision doubles the number of recognized mouse lemur species. The molecular data and phylogenetic analyses presented here verify the revision and add a historical framework for understanding mouse lemur species diversity. Phylogenetic analysis revises established hypotheses of ecogeographic constraint for the maintenance of species boundaries in these endemic Malagasy primates. Mouse lemur clades also show conspicuous patterns of regional endemism, thereby emphasizing the threat of local deforestation to Madagascar's unique biodiversity.
We have sequenced the entire mtDNA genome (mtGenome) of 241 individuals who match 1 of 18 common European Caucasian HV1/HV2 types, to identify sites that permit additional forensic discrimination. We found that over the entire mtGenome even individuals with the same HV1/HV2 type rarely match. Restricting attention to sites that are neutral with respect to phenotypic expression, we have selected eight panels of single nucleotide polymorphism (SNP) sites that are useful for additional discrimination. These panels were selected to be suitable for multiplex SNP typing assays, with 7-11 sites per panel. The panels are specific for one or more of the common HV1/HV2 types (or closely related types), permitting a directed approach that conserves limiting case specimen extracts while providing a maximal chance for additional discrimination. Discrimination provided by the panels reduces the frequency of the most common type in the European Caucasian population from approximately 7% to approximately 2%, and the 18 common types we analyzed are resolved to 105 different types, 55 of which are seen only once.
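The discrimination figures quoted above (number of distinct types, types seen only once, frequency of the most common type) are simple tallies over a set of haplotypes. The sketch below shows one way such a summary could be computed; the function name and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def summarize_types(haplotypes):
    """Tally distinct types, singletons, and the most common type's frequency."""
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return {
        "distinct_types": len(counts),
        "singletons": sum(1 for c in counts.values() if c == 1),
        "most_common_freq": max(counts.values()) / total,
    }


# Hypothetical individuals: (HV1/HV2 type, additional SNP panel result).
samples = [
    ("H1", "a"), ("H1", "a"), ("H1", "b"), ("H1", "c"),
    ("H2", "a"), ("H2", "b"),
]

# Discrimination using the HV1/HV2 type alone.
before = summarize_types([hv for hv, _ in samples])

# Discrimination after adding the SNP panel result to each type.
after = summarize_types(samples)
```

In this toy set the added SNP information splits two shared types into five, four of which occur once, which mirrors the kind of gain the abstract reports at a much larger scale.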
The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale.
While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data.
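The idea of translating a stored sequence string into a CE-compatible designation can be sketched as follows. This is a simplified illustration, not the nomenclature the Commission proposes: it assumes a fixed tetranucleotide reading frame starting at the first base of the repeat region, uses the conventional CE rule of naming an allele by its number of complete repeats plus any leftover bases (as in "9.3"), and uses hypothetical function names and example sequences.

```python
from itertools import groupby


def bracket_repeats(sequence, motif_len=4):
    """Collapse consecutive identical motif-sized chunks into [MOTIF]n form."""
    chunks = [sequence[i:i + motif_len] for i in range(0, len(sequence), motif_len)]
    parts = []
    for unit, run in groupby(chunks):
        n = len(list(run))
        # A trailing partial motif is written out as bare bases.
        parts.append(f"[{unit}]{n}" if len(unit) == motif_len else unit)
    return " ".join(parts)


def ce_allele(sequence, motif_len=4):
    """Length-based allele name: complete repeats, then '.x' for leftover bases."""
    full, extra = divmod(len(sequence), motif_len)
    return f"{full}.{extra}" if extra else str(full)


# Hypothetical repeat regions (not taken from any reference allele).
summary = bracket_repeats("TCTATCTATCTGTCTA")   # "[TCTA]2 [TCTG]1 [TCTA]1"
allele = ce_allele("TCTA" * 9 + "TCT")          # "9.3"
```

Two sequences with different internal structure can share the same length-based allele name, which is why the sequence string is kept as the primary record and the CE-style designation is derived from it.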
Instances of point and length heteroplasmy in the mitochondrial DNA control region were compiled and analyzed from over 5,000 global human population samples. These data represent observations from a large and broad population sample, representing nearly 20 global populations. As expected, length heteroplasmy was frequently observed in the HVI, HVII and HVIII C-stretches. Length heteroplasmy was also observed in the AC dinucleotide repeat region, as well as other locations. Point heteroplasmy was detected in approximately 6% of all samples, and while the vast majority of heteroplasmic samples comprised two molecules differing at a single position, samples exhibiting two and three mixed positions were also observed in this data set. In general, the sites at which heteroplasmy was most commonly observed correlated with reported control region mutational hotspots. However, for some sites, observations of heteroplasmy did not mirror established mutation rate data, suggesting the action of other mechanisms, both selective and neutral. Interestingly, these data indicate that the frequency of heteroplasmy differs between particular populations, perhaps reflecting variable mutation rates among different mtDNA lineages and/or artifacts of particular population groups. The results presented here contribute to our general understanding of mitochondrial DNA control region heteroplasmy and provide additional empirical information on the mechanisms contributing to mtDNA control region mutation and evolution.
Though investigations into the use of massively parallel sequencing technologies for the generation of complete mitochondrial genome (mtGenome) profiles from difficult forensic specimens are well underway in multiple laboratories, the high quality population reference data necessary to support full mtGenome typing in the forensic context are lacking. To address this deficiency, we have developed 588 complete mtGenome haplotypes, spanning three U.S. population groups (African American, Caucasian and Hispanic) from anonymized, randomly-sampled specimens. Data production utilized an 8-amplicon, 135 sequencing reaction Sanger-based protocol, performed in semi-automated fashion on robotic instrumentation. Data review followed an intensive multi-step strategy that included a minimum of three independent reviews of the raw data at two laboratories; repeat screenings of all insertions, deletions, heteroplasmies, transversions and any additional private mutations; and a check for phylogenetic feasibility. For all three populations, nearly complete resolution of the haplotypes was achieved with full mtGenome sequences: 90.3-98.8% of haplotypes were unique per population, an improvement of 7.7-29.2% over control region sequencing alone, and zero haplotypes overlapped between populations. Inferred maternal biogeographic ancestry frequencies for each population and heteroplasmy rates in the control region were generally consistent with published datasets. In the coding region, nearly 90% of individuals exhibited length heteroplasmy in the 12418-12425 adenine homopolymer; and despite a relatively high rate of point heteroplasmy (23.8% of individuals across the entire molecule), coding region point heteroplasmies shared by more than one individual were notably absent, and transversion-type heteroplasmies were extremely rare. 
The ratio of nonsynonymous to synonymous changes among point heteroplasmies in the protein-coding genes (1:1.3) and average pathogenicity scores in comparison to data reported for complete substitutions in previous studies seem to provide some additional support for the role of purifying selection in the evolution of the human mtGenome. Overall, these thoroughly vetted full mtGenome population reference data can serve as a standard against which the quality and features of future mtGenome datasets (especially those developed via massively parallel sequencing) may be evaluated, and will provide a solid foundation for the generation of complete mtGenome haplotype frequency estimates for forensic applications.