Annotating protein sequences according to their biological functions is one of the key steps in understanding microbial diversity, metabolic potentials and evolutionary histories. However, even in the best-studied prokaryotic genomes, not all proteins can be characterized by classical in vivo, in vitro, and/or in silico methods—a challenge rapidly growing alongside the advent of Next Generation Sequencing technologies and their enormous extension of ‘omics’ data in public databases. These so-called hypothetical proteins (HPs) represent a huge knowledge gap and hidden potential for biotechnological applications. Opportunities for leveraging the available ‘Big Data’ have recently proliferated with the use of artificial intelligence (AI). Here we review the aims and methods of protein annotation and explain the different principles behind machine and deep learning algorithms including recent research examples, in order to assist both biologists wishing to apply AI tools in developing comprehensive genome annotations and computer scientists who want to contribute to this leading edge of biological research.
Natural evolution has produced an almost infinite variety of microorganisms that can colonize almost any conceivable habitat. Since the vast majority of these microbial consortia are still unknown, there is a great need to elucidate this "microbial dark matter" (MDM) to enable exploitation in biotechnology. We report the fabrication and application of a novel device that integrates a matrix of macroporous elastomeric silicone foam (MESIF) into an easily fabricated and scalable chip design that can be used for decoding MDM in environmental microbiomes. Technical validation, performed with the model organism Escherichia coli expressing a fluorescent protein, showed that this low-cost, bioinert, and widely modifiable chip is rapidly colonized by microorganisms. The biological potential of the chip was then illustrated through targeted sampling and enrichment of microbiomes in a variety of habitats ranging from wet, turbulent moving bed biofilters and wastewater treatment plants to dry air-based environments. Sequencing analyses consistently showed not only that MESIF chips are suitable for robust sampling but also that the material can be used to detect a broad cross section of the microorganisms present in a habitat within a few days. For example, results from the biofilter habitat showed efficient enrichment of microorganisms belonging to the enigmatic Candidate Phyla Radiation, which comprise ∼70% of the MDM. From dry air, the MESIF chip was able to enrich a variety of members of Actinobacteriota, which is known to produce specific secondary metabolites. Targeted sampling from a wastewater treatment plant where the herbicide glyphosate was added to the chip's reservoir resulted in enrichment of Cyanobacteria and Desulfobacteria, previously associated with glyphosate degradation.
These initial case studies suggest that this chip is very well suited for the systematic study of MDM and opens opportunities for the cultivation of previously unculturable microorganisms.