Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences—the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The ‘environmental packages’ apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.
The Minimum Information for Biological and Biomedical Investigations (MIBBI) project provides a resource for those exploring the range of extant minimum information checklists and fosters coordinated development of such checklists.
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open ‘data commoning’ culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared ‘Investigation-Study-Assay’ framework to support that vision.
Summary
Studying fungal biodiversity using data generated from Illumina MiSeq sequencing platforms poses a number of bioinformatic challenges, with the analysis typically involving a large number of tools for each analytical step, from quality filtering to generating identified operational taxonomic unit (OTU) abundance tables. Here, we introduce PIPITS, an open‐source stand‐alone suite of software for automated processing of Illumina MiSeq sequences for fungal community analysis. PIPITS exploits a number of state‐of‐the‐art applications to process paired‐end reads from quality filtering to producing OTU abundance tables. We provide detailed descriptions of the pipeline and show its utility in the analysis of 9 396 092 sequences generated on the Illumina MiSeq platform.
PIPITS is the first automated bioinformatics pipeline dedicated to fungal ITS sequences. It incorporates ITSx to extract subregions of ITS and exploits the latest RDP Classifier to classify sequences against the curated UNITE fungal data set.