Phylogenetic studies are increasingly reliant on next‐generation sequencing. Transcriptomic and hybrid enrichment sequencing remain the most prevalent methods for phylogenomic data collection because they demand less computing power and lower sequencing costs than whole‐genome sequencing (WGS). However, the transcriptome‐based method is constrained by the availability of fresh material, and hybrid enrichment is limited by the genomic resources needed for probe design, especially for non‐model organisms. We present a novel WGS‐based pipeline for extracting essential phylogenomic markers through rapid de novo genome assembly from low‐coverage genome data, employing a series of computationally efficient bioinformatic tools. We tested the pipeline on a Hexapoda dataset and a more focused Phthiraptera dataset (genome sizes 0.1–2 Gbp), and further investigated the effects of sequencing depth on target assembly success rate based on the raw data of six insect genomes (0.1–1 Gbp). Each genome assembly was completed in 2–24 hr on desktop PCs. We extracted 872–1,615 near‐universal single‐copy orthologs (Benchmarking Universal Single‐Copy Orthologs [BUSCOs]) per species. This method also enables the development of ultraconserved element (UCE) probe sets; we generated probes for Phthiraptera based on our WGS assemblies, containing 55,030 baits targeting 2,832 loci, from which we extracted 2,125–2,272 UCEs. The resulting phylogenetic trees all agreed with the currently accepted topologies, indicating that markers produced by our method were valid for phylogenomic studies. We also showed that 10–20× sequencing coverage was sufficient to produce hundreds to thousands of targeted loci from BUSCO sets, and that an even lower coverage (5×) was sufficient for UCEs. Our study demonstrates the feasibility of conducting phylogenomics from low‐coverage WGS for a wide range of organisms without reference genomes. 
This new approach has major advantages in data collection, particularly in reducing sequencing cost and computing consumption, while expanding loci choices.
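As a rough illustration of the coverage figures in this abstract, the raw data volume needed for a target depth can be approximated as genome size multiplied by coverage. The sketch below is a back-of-the-envelope calculation only; the function name and the example genome sizes and depths (picked from within the ranges quoted above) are assumptions for illustration and are not part of the authors' pipeline.

```python
def required_bases(genome_size_bp: float, coverage: float) -> float:
    """Approximate raw sequence data (in bases) needed for a target depth.

    Uses the simple relation: bases = genome size x coverage.
    It ignores read filtering, contamination and uneven coverage.
    """
    if genome_size_bp <= 0 or coverage <= 0:
        raise ValueError("genome size and coverage must be positive")
    return genome_size_bp * coverage


if __name__ == "__main__":
    # Hypothetical example values within the ranges quoted in the abstract.
    for size_bp in (0.1e9, 1e9, 2e9):
        for depth in (5, 10, 20):
            gb = required_bases(size_bp, depth) / 1e9
            print(f"{size_bp / 1e9:.1f} Gbp genome at {depth}x: ~{gb:.1f} Gb of reads")
```

Under these assumptions, a 1 Gbp genome sequenced at 10× corresponds to roughly 10 Gb of raw reads.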
Background: Invasive candidiasis (IC) is the most common invasive fungal infection. The epidemiology of IC in hospitalized patients has been widely investigated in many metropolitan cities; however, little is known about medium and small cities. Methods: A 5-year retrospective study was carried out to analyze the prevalence, species distribution, antifungal susceptibility, risk factors and mortality of inpatients with invasive Candida infection in a regional tertiary teaching hospital in Southwest China. Results: A total of 243 inpatients with invasive Candida infection were identified during the 5-year study period, with a mean annual incidence of 0.41 cases per 1000 admissions and a 30-day mortality rate of 12.3%. Candida albicans, Candida glabrata, Candida tropicalis, Candida krusei, Candida parapsilosis and other Candida species accounted for 45.3, 30.0, 15.2, 4.9, 2.1 and 2.5% of cases, respectively. The overall resistance rates to fluconazole (FCA), itraconazole (ITR) and voriconazole (VRC) were 18.6, 23.1 and 18.5%, respectively. Respiratory dysfunction, pulmonary infection, cardiovascular disease, chronic/acute renal failure, mechanical ventilation, abdominal surgery, intensive care in adults, septic shock and IC due to C. albicans were associated with 30-day mortality (P < 0.05) in the univariate analyses. Respiratory dysfunction [odds ratio (OR), 9.80; 95% confidence interval (CI), 3.24–29.63; P < 0.001] and IC due to C. albicans (OR, 3.35; 95% CI, 1.13–9.92; P = 0.029) were the independent predictors of 30-day mortality. Conclusions: This report shows that, in China, incidence and mortality rates are lower and azole resistance rates are higher in medium and small cities than in large cities, and that species distributions and risk factors also differ between the two settings. 
It is necessary to conduct epidemiological surveillance in medium and small cities to provide reference data for the surveillance of inpatients with IC.
Collembola are a basal group of Hexapoda renowned for both unique morphological characters and significant ecological roles. However, a robust and plausible phylogenetic relationship among their deeply divergent lineages has yet to be achieved. We carried out a mitophylogenomic study based on the most comprehensive mitochondrial genome dataset to date. Our data matrix contained mitogenomes of 31 species from almost all major families of all four orders, with 16 mitogenomes newly sequenced and annotated. We compared the linear arrangements of genes along mitochondrial genomes across species. We then conducted 13 analyses, each under a different combination of character coding, partitioning scheme and heterotachy models, and assessed their performance in phylogenetic inference. Several hypothetical tree topologies were also tested. Mitogenomic structure comparison revealed that most species share the putative ancestral pancrustacean gene order, while seven species from Onychiuridae, Poduridae and Symphypleona show different levels of gene rearrangement, indicating phylogenetic signal. Tomoceroidea was robustly recovered for the first time in the presence of all its families and subfamilies. Monophyly of Onychiuroidea was supported using unpartitioned models that alleviate long-branch attraction (LBA). Paronellidae was revealed to be polyphyletic, with two subfamilies inserted independently into Entomobryidae. Although Entomobryomorpha was not well supported, more than half of the analyses obtained convincing topologies by placing Tomoceroidea within or near the remaining Entomobryomorpha. The relationship between elongate-shaped and spherical-shaped collembolans remained ambiguous, but Neelipleona tended to occupy the basal position in most trees. This study showed that mitochondrial genomes can provide important information for reconstructing the relationships among Collembola when suitable analytical approaches are implemented. 
Of all the data refining and model selecting schemes used in this study, the combination of nucleotide sequences, partitioning model and
Traditional species delimitation based only on morphological diagnostics does not fully meet the needs of modern taxonomy. Cryptic diversity revealed by molecular evidence has been increasingly discovered in many groups; however, subsequent species description is often lacking because of inadequate taxonomy and a lack of operational criteria. In this study, we focus on the collembolan genus Coecobrya, which has been suspected to be a species complex living on cave guano. Our study aimed to integrate both morphological and molecular characters to explore this group across geographically separated cave populations. Among seven sampled populations, only minor chaetotaxic differences were detected, and the discriminating characters partially overlapped between populations. However, using three genes (COI, 16S and 28S), distance‐ and evolutionary model‐based molecular delimitations consistently recovered seven molecular lineages, corresponding to seven candidate morphospecies. A final seven‐species hypothesis was validated and seven new species were described: Coecobrya phanthuratensis sp. n., Coecobrya ranongica sp. n., Coecobrya donyoa sp. n., Coecobrya khaopaela sp. n., Coecobrya specusincola sp. n., Coecobrya khromwanaramica sp. n. and Coecobrya promdami sp. n. A tentative taxonomic workflow integrating multiple lines of evidence is proposed to facilitate subsequent formal species description for Collembola. A unified species concept is preferable because it accommodates most species concepts, delimitation criteria and data analysis methods. In practice, DNA‐based diagnoses are recommended as a standard component of the current taxonomy of Collembola, particularly within morphologically conserved groups.
Genomic data sets are increasingly central to ecological and evolutionary biology, but far fewer resources are available for invertebrates. Powerful new computational tools and the rapidly decreasing cost of Illumina sequencing are beginning to change this, enabling rapid genome assembly and reference marker extraction. We have developed and tested a practical workflow for developing genomic resources in nonmodel groups, using real‐world data on Collembola (springtails), one of the most dominant groups of soil animals on Earth. We designed two universal molecular marker sets, benchmarking universal single‐copy orthologues (BUSCOs) and ultraconserved elements (UCEs), using three existing and 11 newly generated genomes. Both marker types were tested in silico via marker capture success and phylogenetic performance. The new genomes were assembled with Illumina short reads, and 9,585‒14,743 protein‐coding genes were predicted with ab initio and protein homology evidence. We identified 1,997 BUSCOs across 14 genomes and created and assessed a custom BUSCO data set for extracting single‐copy genes. We also developed a new UCE probe set containing 46,087 baits targeting 1,885 loci. We successfully captured 1,437‒1,865 BUSCOs and 975‒1,186 UCEs across 14 genomes. Phylogenomic reconstructions using these markers proved robust, giving new insight into deep‐time collembolan relationships. Our study demonstrates the feasibility of generating thousands of universal markers from highly efficient whole‐genome sequencing, providing a valuable resource for genome‐scale investigations in evolutionary biology and ecology.
Sinella curviseta, among the most widespread springtails (Collembola) in the Northern Hemisphere, has often been treated as a model organism in soil ecology and environmental toxicology. However, the scarcity of genetic information severely hinders our understanding of its adaptations to the soil habitat. We present the largest genome assembly within Collembola using ∼44.86 Gb (118X) of single-molecule real-time Pacific Biosciences Sequel sequencing. The final assembly of 599 scaffolds was ∼381.46 Mb with an N50 length of 3.28 Mb, and captured 95.3% complete and 1.5% partial arthropod Benchmarking Universal Single-Copy Orthologs (n = 1,066). Transcripts and a circularized mitochondrial genome were also assembled. We predicted 23,943 protein-coding genes, of which 83.88% were supported by transcriptome-based evidence and 82.49% matched protein records in UniProt. We also identified 222,501 repeats and 881 noncoding RNAs. Phylogenetic reconstructions for Collembola support Tomoceridae as sister to the remaining Entomobryomorpha, with the position of Symphypleona not fully resolved. Gene family evolution analyses identified 9,898 gene families, of which 156 experienced significant expansions or contractions. Our high-quality reference genome of S. curviseta provides the genetic basis for future investigations in evolutionary biology, soil ecology, and ecotoxicology.
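The sequencing depth quoted in this abstract can be sanity-checked by dividing the total sequenced bases by the assembly size. The short Python sketch below does that arithmetic with the two figures from the abstract; the function name is hypothetical, and using the final assembly length as the denominator is an assumption (the authors may have used an independent genome size estimate).

```python
def sequencing_depth(total_bases: float, genome_size_bp: float) -> float:
    """Mean sequencing depth (fold coverage) = total bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


# Figures taken from the abstract: ~44.86 Gb of reads, ~381.46 Mb assembly.
# Assumption: the assembly length stands in for the true genome size.
depth = sequencing_depth(44.86e9, 381.46e6)

if __name__ == "__main__":
    print(f"Approximate depth: {depth:.1f}x")
```

The result rounds to the 118X reported, which suggests the stated coverage was computed against a genome size close to the final assembly length.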
Body scales are fundamental in the classification of Entomobryidae at all taxonomical levels. Traditionally, scales on dens were considered to be absent in Entomobryinae, but present in other scaled subfamilies; however, this opinion was strongly challenged by recent morphological advances in tergal specialised chaetae (S-chaetae). A new genus, Lepidodens, is strikingly similar to the scaled Entomobryinae genus Willowsia in having pointed scales with relatively long ribs and 2, 2|1, 2, 2, 8, 3 tergal S-chaetae, but differs from it in having dental scales and a unique position of S-microchaetae on the first abdominal segment. Multilocus phylogeny and topology tests also support this view, the new genus clustering with Entomobryinae rather than Seirinae. Three new species, L. nigrofasciatus, L. similis and L. hainanicus, are described from South China. This study clearly undermines the traditional separation of Entomobryinae and Seirinae/Lepidocyrtinae, and demonstrates that dental scales could occur in all entomobryid subfamilies containing scaled taxa. In this new phylogenetic hypothesis, Entomobryinae has the greatest diversity in scale morphology and distribution among scaled collembolan groups, indicating multiple independent origins of scales.
Mitogenomes have been widely used as markers to reconstruct phylogenies of various groups of arthropods, but in Collembola they have not resolved the relationships between some families, such as Paronellidae and Entomobryidae. Here, we present a phylogenetic study integrating previously published data and 20 new mitogenomes, totalling 54 species of Entomobryoidea and two outgroups. Eight of the nine subfamilies were included, with species from the most representative genera. The new mitogenomes were sequenced, assembled and annotated, resulting in sequences approximately 14,000 bp in length. Phylogenetic analyses were conducted based on the 13 protein‐coding and two rRNA‐encoding genes of the 56 mitogenomes. Both maximum likelihood (with six different datasets/models) and Bayesian inference analyses were performed. Orchesellidae, Seirinae and Lepidocyrtinae were reaffirmed as monophyletic groups, while the phylogenetic relationships between Paronellidae and Entomobryidae remain unresolved. A complete resolution of the Entomobryoidea phylogeny will require comprehensive genomic sampling of the most informative nuclear and mitochondrial markers to finally overcome traditional systematic problems.