The species-rich diatom family Chaetocerotaceae is common in the coastal marine phytoplankton worldwide, where it is responsible for a substantial part of the primary production. Despite its relevance for the global cycling of carbon and silica, many species are still described only morphologically, and numerous specimens do not fit any described taxa. Nowadays, studies to assess plankton biodiversity deploy high-throughput sequencing metabarcoding of the 18S rDNA V4 region, but to translate the gathered metabarcodes into biologically meaningful taxa, there is a need for reference barcodes. However, 18S reference barcodes for this important family are still relatively scarce. We provide 18S rDNA and partial 28S rDNA reference sequences of 443 morphologically characterized chaetocerotacean strains. We gathered 164 of the 216 18S sequences and 244 of the 413 28S sequences of strains from the Gulf of Naples, Atlantic France, and Chile. Inferred phylogenies showed 84 terminal taxa in seven principal clades. Two of these clades included terminal taxa whose rDNA sequences contained spliceosomal and Group IC1 introns. Regarding the commonly used metabarcode markers in planktonic diversity studies, all terminal taxa can be discriminated with the 18S V4 hypervariable region; its primers fit their targets in all but two species, and the V4-tree topology is similar to that of the 18S. Hence V4-metabarcodes of unknown Chaetocerotaceae are assignable to the family. Regarding the V9 hypervariable region, most terminal taxa can be discriminated, but several contain introns in their primer targets. Moreover, poor phylogenetic resolution of the V9 region affects placement of metabarcodes of putative but unknown chaetocerotacean taxa, and hence introduces uncertainty in taxonomic assignment, even of higher taxa.
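The statement that terminal taxa "can be discriminated" with a marker region can be illustrated with a minimal sketch: a marker fails to discriminate two taxa when their marker sequences are identical. The function name `undiscriminated_taxa`, the taxon labels, and the short sequences below are hypothetical placeholders, not data or methods from the study, which may have used a different criterion.

```python
from collections import defaultdict


def undiscriminated_taxa(barcodes):
    """Return groups of taxa that share an identical marker sequence.

    `barcodes` maps a taxon label to its marker sequence (a string).
    An empty result means every taxon has a unique sequence.
    """
    by_seq = defaultdict(list)
    for taxon, seq in barcodes.items():
        # Compare case-insensitively so 'acgt' and 'ACGT' count as identical.
        by_seq[seq.upper()].append(taxon)
    return [sorted(taxa) for taxa in by_seq.values() if len(taxa) > 1]


# Hypothetical toy fragments (assumption: made up for illustration only).
v4 = {"taxon_A": "ACGTTGCA", "taxon_B": "ACGTAGCA", "taxon_C": "ACGTCGCA"}
v9 = {"taxon_A": "GGCTA", "taxon_B": "GGCTA", "taxon_C": "GGCTT"}

v4_clashes = undiscriminated_taxa(v4)  # no shared sequences in this toy set
v9_clashes = undiscriminated_taxa(v9)  # taxon_A and taxon_B collide
```

Exact identity is the strictest possible criterion; a real analysis would more likely work from aligned sequences and a distance or tree-based threshold.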
Chaetoceros is one of the most species-rich, widespread and abundant diatom genera in marine and brackish habitats worldwide. It therefore forms an excellent model for in-depth biodiversity studies, assessing morphological and genetic differentiation among groups of strains. The global Chaetoceros lorenzianus complex presently comprises three species known to science. However, our recent studies have shown that the group includes several previously unknown species. In this article, 50 strains, mainly from high latitudes and from warm-temperate waters, were examined morphologically and genetically and the results compared with those of field studies from elsewhere. The strains clustered into five groups, two of which are formed by C. decipiens Cleve and C. mitra (Bailey) Cleve, respectively. Their species descriptions are emended based on samples collected close to the type localities. The three other groups are formed by new species, C. elegans sp. nov., C. laevisporus sp. nov. and C. mannaii sp. nov. Characters used to distinguish each species are: orientation of the setae; shape and size of the apertures; shape, size and density of the poroids on the setae; and, at least in some species, characters of the resting spores. Our aim is to cover the global species diversity in this complex, as correct species delineation is the basis for exploring biodiversity, distribution of organisms, interactions in the food web and effects of environmental changes.
Information on taxa distribution is a prerequisite for many research fields, and biological records are a major source of data contributing to biogeographic studies. The Global Biodiversity Information Facility (GBIF) and the Ocean Biogeographic Information System (OBIS) are important infrastructures facilitating free and open access to classical biological data from several sources across both temporal and spatial scales. Over the last ten years, high-throughput sequencing (HTS) metabarcoding data have become available, which constitute a great source of detailed occurrence data. Among the global sampling projects that have contributed to such data are Tara Oceans and the Ocean Sampling Day (OSD). Integration of classical and metabarcoding data may aid a more comprehensive assessment of the geographic range of species, especially of microscopic ones such as protists. Rare, small and cryptic species are often ignored in surveys or mis-assigned with the classical approaches. Here we show how integration of data from various sources can contribute to insight into biogeography and diversity at the genus and species level, using Chaetoceros, one of the most diverse and abundant genera among marine planktonic diatoms, as a study system. Chaetoceros records were extracted from GBIF and OBIS, and literature data were collected by means of a Google Scholar search. Chaetoceros reference barcodes were mapped against the metabarcode datasets of Tara Oceans (210 sites) and OSD (144 sites). We compared the resolution of different data sources in determining the global distribution of the genus and provided examples, at the species level, of detection of cryptic species, endemism and cosmopolitan or restricted distributions. At the genus level, our results showed a comparable picture from the different sources but a more complete assessment when data were integrated. Both the importance of integration and the challenges related to it were illustrated.
Chaetoceros data collected in this study are organised and available in the form of tables and maps, providing a powerful tool and a baseline for further research in, e.g., ecology, conservation and evolutionary biology.
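One way to picture the comparison of sources and their integration is to bin occurrence coordinates into grid cells and compare the cells covered by each source with the union across sources. The sketch below is only an illustration under stated assumptions: the 5-degree cell size, the function names `grid_cell` and `cells_by_source`, the source labels, and the coordinates are all hypothetical and are not taken from the study.

```python
import math


def grid_cell(lat, lon, size=5.0):
    """Map a coordinate to the index of the size-degree cell containing it."""
    return (math.floor(lat / size), math.floor(lon / size))


def cells_by_source(records_by_source, size=5.0):
    """Return, for each source, the set of grid cells with at least one record."""
    return {
        source: {grid_cell(lat, lon, size) for lat, lon in coords}
        for source, coords in records_by_source.items()
    }


# Hypothetical (lat, lon) records; assumption: invented for illustration.
records = {
    "classical": [(40.8, 14.25), (48.7, -3.9)],
    "metabarcoding": [(40.6, 14.1), (-36.5, -73.1)],
}

per_source = cells_by_source(records)
# Integrated coverage is the union of the cells seen by any source.
integrated = set().union(*per_source.values())
# Cells detected by every source indicate where the sources agree.
shared = set.intersection(*per_source.values())
```

In this toy input each source covers two cells, one cell is shared, and the integrated set covers three, which mirrors the idea that combined sources give a more complete picture than any single one.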