The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.
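As a rough illustration of what a populated subsystem and a conflict between two of them might look like, the following minimal sketch represents each populated subsystem as a mapping from genome to functional role to genes. It then flags genes that two subsystems assign to different roles. The data layout, the function name `find_conflicts`, and every identifier in the example data are assumptions made for illustration; they are not the SEED's actual data model or tools.

```python
from collections import defaultdict

# Hypothetical populated subsystems: genome -> functional role -> gene IDs.
# All names below are invented for illustration only.
subsystem_x = {
    "genome_A": {"role_1": {"gene_A1"}, "role_2": {"gene_A2"}},
    "genome_B": {"role_1": {"gene_B1"}},
}
subsystem_y = {
    "genome_A": {"role_3": {"gene_A2"}, "role_4": {"gene_A5"}},
}


def find_conflicts(subsystems):
    """Return genes that different subsystems assign to different roles.

    `subsystems` maps a subsystem name to its populated form. The result maps
    (genome, gene) to a sorted list of (subsystem name, role) assignments.
    """
    assignments = defaultdict(set)
    for name, populated in subsystems.items():
        for genome, roles in populated.items():
            for role, genes in roles.items():
                for gene in genes:
                    assignments[(genome, gene)].add((name, role))
    # Keep only genes that carry more than one distinct role.
    return {
        key: sorted(found)
        for key, found in assignments.items()
        if len({role for _, role in found}) > 1
    }


conflicts = find_conflicts({"subsystem_X": subsystem_x, "subsystem_Y": subsystem_y})
print(conflicts)
```

A flagged gene is not necessarily an error, since a gene product can be multifunctional; in this sketch the output is only a list of cases for a curator to review.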
We describe the use of model-driven analysis of multiple data types relevant to transcriptional regulation of metabolism to discover novel regulatory mechanisms in Saccharomyces cerevisiae. We have reconstructed the nutrient-controlled transcriptional regulatory network controlling metabolism in S. cerevisiae consisting of 55 transcription factors regulating 750 metabolic genes, based on information in the primary literature. This reconstructed regulatory network coupled with an existing genome-scale metabolic network model allows in silico prediction of growth phenotypes of regulatory gene deletions as well as gene expression profiles. We compared model predictions of gene expression changes in response to genetic and environmental perturbations to experimental data to identify potential novel targets for transcription factors. We then identified regulatory cascades connecting transcription factors to the potential targets through a systematic model expansion strategy using published genome-wide chromatin immunoprecipitation and binding-site-motif data sets. Finally, we show the ability of an integrated metabolic and regulatory network model to predict growth phenotypes of transcription factor knockout strains. These studies illustrate the potential of model-driven data integration to systematically discover novel components and interactions in regulatory and metabolic networks in eukaryotic cells. [Supplemental material is available online at www.genome.org.]
(Giaever et al. 2002) data represent the states and outputs of these networks. Connecting large-scale component and interaction information to data on system states in order to facilitate the interpretation of both data types is a major challenge in systems biology.
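The comparison of predicted and measured expression changes described in the abstract can be pictured with a small sketch. It assumes the model output is reduced to a sign per gene (-1 down, 0 unchanged, +1 up) and the experiment to a log2 fold change. A gene is listed as a candidate novel target when it changes in the data but the model predicts no change. The function name `candidate_targets`, the threshold of 1.0, and the gene names are all hypothetical and do not reproduce the authors' actual procedure.

```python
def candidate_targets(predicted, observed, threshold=1.0):
    """List genes that change experimentally while the model predicts no change.

    predicted: gene -> -1, 0 or +1 (model-predicted direction of change)
    observed:  gene -> measured log2 fold change
    threshold: assumed cutoff on |log2 fold change| for calling a change
    """
    hits = []
    for gene, log2_fold_change in observed.items():
        changed = abs(log2_fold_change) >= threshold
        # Genes missing from the model are treated as predicted unchanged.
        if changed and predicted.get(gene, 0) == 0:
            hits.append(gene)
    return sorted(hits)


# Invented example values for one transcription factor perturbation.
predicted = {"gene_1": -1, "gene_2": 0, "gene_3": 0}
observed = {"gene_1": -2.3, "gene_2": 1.8, "gene_3": 0.2}

print(candidate_targets(predicted, observed))  # ['gene_2']
```

Here `gene_1` agrees with the model and `gene_3` does not pass the cutoff, so only `gene_2` is reported as a discrepancy worth following up.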
The data integration and interpretation task is made challenging by the incompleteness and noisiness of large-scale data sets (Grunenfelder and Winzeler 2002).
Given these issues with large-scale data sets, systematic inclusion of literature-derived information on network structures into the analysis represents an appealing alternative to purely data-driven approaches. The widespread availability of component and biochemical interaction information in the primary literature has enabled the reconstruction of chemically and biologically consistent mathematical descriptions of biochemical networks in well-studied model organisms (Price et al. 2004). These network models can then be used to predict changes in system states in response to genetic and environmental perturbations. Furthermore, model predictions can be directly compared with experimental data obtained, for example, by metabolic flux or gene expression profiling (Price et al. 2004). As a result of these comparisons, modifications to the biochemical network model that would improve its ability to predict system states can be identified to iteratively improve the model.
In the case of metabolic networks, the network reconstruction step can now be routinely done and has been accomplished for a number of key model organisms including Escherich...