Prophages are phages in lysogeny that are integrated into, and replicated as part of, the host bacterial genome. These mobile elements can have a tremendous impact on their bacterial hosts' genomes and phenotypes, which may lead to strain emergence and diversification, increased virulence or antibiotic resistance. However, finding prophages in microbial genomes remains a problem with no definitive solution. The majority of existing tools rely on detecting genomic regions enriched in protein-coding genes with known phage homologs, which hinders the de novo discovery of phage regions. In this study, a weighted phage detection algorithm, PhiSpy, was developed based on seven distinctive characteristics of prophages: protein length, transcription strand directionality, customized AT skew and GC skew, the abundance of unique phage words, phage insertion points and the similarity of phage proteins. The first five characteristics are capable of identifying prophages without any sequence similarity to known phage genes. PhiSpy locates prophages by ranking genomic regions enriched in distinctive phage traits, which leads to the successful prediction of 94% of prophages in 50 complete bacterial genomes with a 6% false-negative rate and a 0.66% false-positive rate.
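To make the skew characteristics more concrete, the following is a minimal sketch of a sliding-window GC and AT skew calculation. It uses the conventional definitions (G - C)/(G + C) and (A - T)/(A + T), not PhiSpy's "customized" variant, which the abstract does not specify. The function names, window size and step are assumptions chosen for illustration.

```python
from typing import List, Tuple


def skew(seq: str, a: str, b: str) -> float:
    """Return (count(a) - count(b)) / (count(a) + count(b)) for seq.

    Conventional skew definition; returns 0.0 when neither base occurs.
    """
    seq = seq.upper()
    n_a, n_b = seq.count(a), seq.count(b)
    total = n_a + n_b
    return (n_a - n_b) / total if total else 0.0


def windowed_skews(seq: str, window: int, step: int) -> List[Tuple[int, float, float]]:
    """Slide a window along seq; return (start, gc_skew, at_skew) per window.

    Hypothetical helper: window and step are illustrative parameters,
    not values taken from PhiSpy.
    """
    results = []
    # Always evaluate at least one window, even for short sequences.
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        results.append((start, skew(chunk, "G", "C"), skew(chunk, "A", "T")))
    return results


if __name__ == "__main__":
    # Toy sequence only; a real analysis would read a genome from FASTA.
    for start, gc, at in windowed_skews("GGGGCCCCAATT", window=4, step=4):
        print(start, gc, at)
```

A region whose skew differs sharply from the flanking genome could then be flagged for closer inspection; how such a signal would be weighted against the other characteristics is not described in the abstract.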
Background: The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. Results: The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent, service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. Conclusions: We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large-volume downloads all at once. The API ensures timely access to the most current datasets available, including new genomes as soon as they come online.
Oxygen minimum zones (OMZs) are oceanographic features that affect ocean productivity and biodiversity, and contribute to ocean nitrogen loss and greenhouse gas emissions. Here we describe the viral communities associated with the Eastern Tropical South Pacific (ETSP) OMZ off Iquique, Chile for the first time through abundance estimates and viral metagenomic analysis. The viral-to-microbial ratio (VMR) in the ETSP OMZ fluctuated in the oxycline and declined in the anoxic core to below one on several occasions. The number of viral genotypes (unique genomes as defined by sequence assembly) ranged from 2040 at the surface to 98 in the oxycline, which is the lowest viral diversity recorded to date in the ocean. Within the ETSP OMZ viromes, only 4.95% of genotypes were shared between surface and anoxic core viromes using reciprocal BLASTn sequence comparison. ETSP virome comparison with surface marine viromes (Sargasso Sea, Gulf of Mexico, Kingman Reef, Chesapeake Bay) revealed a dissimilarity of ETSP OMZ viruses to those from other oceanic regions. From the 1.4 million non-redundant DNA sequences sampled within the altered oxygen conditions of the ETSP OMZ, more than 97.8% were novel. Of the average 3.2% of sequences that showed similarity to the SEED non-redundant database, phage sequences dominated the surface viromes, eukaryotic virus sequences dominated the oxycline viromes, and phage sequences dominated the anoxic core viromes. The viral community of the ETSP OMZ was characterized by fluctuations in abundance, taxa and diversity across the oxygen gradient. The ecological significance of these changes was difficult to predict; however, it appears that the reduction in oxygen coincides with an increased shedding of eukaryotic viruses in the oxycline, and a shift to unique viral genotypes in the anoxic core.
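The viral-to-microbial ratio mentioned above is a simple quotient of two abundance estimates. The sketch below shows the arithmetic only; the function name and the counts are hypothetical and are not data from the study.

```python
def viral_to_microbial_ratio(viral_per_ml: float, microbial_per_ml: float) -> float:
    """Return the VMR: viral abundance divided by microbial abundance.

    Both inputs are assumed to be counts per millilitre from the same sample.
    """
    if microbial_per_ml <= 0:
        raise ValueError("microbial abundance must be positive")
    return viral_per_ml / microbial_per_ml


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate a ratio below one.
    vmr = viral_to_microbial_ratio(viral_per_ml=5e5, microbial_per_ml=1e6)
    print(f"VMR = {vmr:.2f}")
```

A value below one, as reported for the anoxic core on several occasions, means that fewer viral particles than microbial cells were counted in that sample.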