KEGG (http://www.genome.jp/kegg/) is a database of biological systems that integrates genomic, chemical and systemic functional information. KEGG provides a reference knowledge base for linking genomes to life through the process of PATHWAY mapping, which is to map, for example, a genomic or transcriptomic content of genes to KEGG reference pathways to infer systemic behaviors of the cell or the organism. In addition, KEGG provides a reference knowledge base for linking genomes to the environment, such as for the analysis of drug-target relationships, through the process of BRITE mapping. KEGG BRITE is an ontology database representing functional hierarchies of various biological objects, including molecules, cells, organisms, diseases and drugs, as well as relationships among them. KEGG PATHWAY is now supplemented with a new global map of metabolic pathways, which is essentially a combined map of about 120 existing pathway maps. In addition, smaller pathway modules are defined and stored in KEGG MODULE that also contains other functional units and complexes. The KEGG resource is being expanded to suit the needs for practical applications. KEGG DRUG contains all approved drugs in the US and Japan, and KEGG DISEASE is a new database linking disease genes, pathways, drugs and diagnostic markers.
The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps.
AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. We have added a collection of protein contact potentials to the AAindex as a new section. Accordingly AAindex consists of three sections now: AAindex1 for the amino acid index of 20 numerical values, AAindex2 for the amino acid substitution matrix and AAindex3 for the statistical protein contact potentials. All data are derived from published literature. The database can be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.jp/dbget-bin/www_bfind?aaindex) or downloaded by anonymous FTP (ftp://ftp.genome.jp/pub/db/community/aaindex/).
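As a rough illustration of what an AAindex1-style entry looks like in use, the sketch below treats an amino acid index as a mapping from the 20 standard residues to one numerical value each and averages it over a sequence. The values are the commonly cited Kyte-Doolittle hydropathy scale, typed in by hand for illustration rather than parsed from the AAindex flat file; the function names `mean_index` and `sliding_profile` are hypothetical and not part of any AAindex or GenomeNet tool.

```python
from typing import Dict, List

# One amino acid index: 20 residues, one numerical value each.
# Values: Kyte-Doolittle hydropathy, entered by hand for illustration only.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_index(sequence: str, index: Dict[str, float]) -> float:
    """Average an amino acid index over a protein sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # A KeyError here signals a residue outside the 20 standard amino acids.
    values = [index[residue] for residue in sequence.upper()]
    return sum(values) / len(values)


def sliding_profile(sequence: str, index: Dict[str, float], window: int) -> List[float]:
    """Mean index value in each window of `window` consecutive residues."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        mean_index(sequence[start:start + window], index)
        for start in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    peptide = "MKTAYIAKQR"  # arbitrary example peptide, not from the source
    print(mean_index(peptide, KYTE_DOOLITTLE))
    print(sliding_profile(peptide, KYTE_DOOLITTLE, window=5))
```

Any other 20-value index could be substituted for the dictionary without changing the functions; substitution matrices (AAindex2) and contact potentials (AAindex3) would instead need a mapping keyed by residue pairs.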
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0560-6) contains supplementary material, which is available to authorized users.
Tardigrades, also known as water bears, are small aquatic animals. Some tardigrade species tolerate almost complete dehydration and exhibit extraordinary tolerance to various physical extremes in the dehydrated state. Here we determine a high-quality genome sequence of Ramazzottius varieornatus, one of the most stress-tolerant tardigrade species. Precise gene repertoire analyses reveal the presence of a small proportion (1.2% or less) of putative foreign genes, loss of gene pathways that promote stress damage, expansion of gene families related to ameliorating damage, and evolution and high expression of novel tardigrade-unique proteins. Minor changes in the gene expression profiles during dehydration and rehydration suggest constitutive expression of tolerance-related genes. Using human cultured cells, we demonstrate that a tardigrade-unique DNA-associating protein suppresses X-ray-induced DNA damage by ∼40% and improves radiotolerance. These findings indicate the relevance of tardigrade-unique proteins to tolerability, and suggest that tardigrades could be a bountiful source of new protection genes and mechanisms.
KEGG Atlas is a new graphical interface to the KEGG suite of databases, especially to the systems information in the PATHWAY and BRITE databases. It currently consists of a single global map and an associated viewer for metabolism, covering about 120 KEGG metabolic pathway maps and about 10 BRITE hierarchies. The viewer, built with Ajax technology, allows the user to navigate and zoom the global map. The main use of KEGG Atlas is the mapping of high-throughput experimental data onto the global map. In the global metabolism map, each node (circle) is a chemical compound and each edge (line) is a set of reactions linked to a set of KEGG Orthology (KO) entries for enzyme genes. Once gene identifiers in different organisms are converted to the K number identifiers in the KO system, corresponding line segments can be highlighted in the global map, allowing the user to view genome sequence data as organism-specific pathways, gene expression data as up- or down-regulated pathways, etc. Once chemical compounds are converted to the C number identifiers in KEGG, metabolomics data can also be displayed in the global map. KEGG Atlas is available at http://www.genome.jp/kegg/atlas/.
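The mapping logic described in this abstract can be sketched in a few lines, under stated assumptions. In the sketch below, an edge of the global map carries a pair of compound nodes (C numbers) and a set of KO entries (K numbers); genes are first converted to K numbers and every edge sharing at least one K number is selected for highlighting. The `Edge` type, the function names, and all identifiers in the toy data are hypothetical placeholders, not real KEGG assignments and not the KEGG Atlas implementation.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class Edge(NamedTuple):
    """One line segment of a global map (hypothetical structure)."""
    edge_id: str
    compounds: Tuple[str, str]   # the two compound nodes (C numbers)
    kos: FrozenSet[str]          # KO entries (K numbers) linked to the edge


def genes_to_kos(genes: Iterable[str], gene_to_ko: Dict[str, str]) -> Set[str]:
    """Convert organism-specific gene identifiers to K numbers.

    Genes without a KO assignment are skipped.
    """
    return {gene_to_ko[g] for g in genes if g in gene_to_ko}


def edges_to_highlight(edges: Iterable[Edge], kos: Set[str]) -> List[str]:
    """Return IDs of edges linked to at least one of the given K numbers."""
    return [e.edge_id for e in edges if e.kos & kos]


def nodes_to_highlight(edges: Iterable[Edge], compounds: Iterable[str]) -> List[str]:
    """Return measured compounds (C numbers) that appear as nodes on the map."""
    on_map = {c for e in edges for c in e.compounds}
    return sorted(on_map & set(compounds))


# Toy data: placeholder identifiers chosen for illustration only.
GENE_TO_KO = {
    "orgA:gene1": "K00001",
    "orgA:gene2": "K00002",
    "orgA:gene3": "K00003",
}
EDGES = [
    Edge("E1", ("C00001", "C00002"), frozenset({"K00001"})),
    Edge("E2", ("C00002", "C00003"), frozenset({"K00002", "K00004"})),
    Edge("E3", ("C00003", "C00004"), frozenset({"K00005"})),
]

if __name__ == "__main__":
    kos = genes_to_kos(["orgA:gene1", "orgA:gene2", "orgA:unknown"], GENE_TO_KO)
    print(edges_to_highlight(EDGES, kos))
    print(nodes_to_highlight(EDGES, ["C00003", "C99999"]))
```

For expression data, the same selection could be run twice, once with the K numbers of up-regulated genes and once with those of down-regulated genes, and the two edge lists drawn in different colors.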
Tardigrades are able to tolerate almost complete dehydration by reversibly switching to an ametabolic state. This ability is called anhydrobiosis. In the anhydrobiotic state, tardigrades can withstand various extreme environments including space, but the molecular basis of this tolerance remains largely unknown. Late embryogenesis abundant (LEA) proteins are heat-soluble proteins and can prevent protein aggregation in dehydrated conditions in other anhydrobiotic organisms, but their relevance to tardigrade anhydrobiosis has not been clarified. In this study, we focused on the heat-soluble property characteristic of LEA proteins and conducted heat-soluble proteomics using an anhydrobiotic tardigrade. Our heat-soluble proteomics identified five abundant heat-soluble proteins. None of them showed sequence similarity with LEA proteins, and they formed two novel protein families with distinct subcellular localizations. We named them Cytoplasmic Abundant Heat Soluble (CAHS) and Secretory Abundant Heat Soluble (SAHS) protein families, according to their localization. Both protein families were conserved among tardigrades, but not found in other phyla. Although CAHS protein was intrinsically unstructured and SAHS protein was rich in β-structure in the hydrated condition, proteins in both families changed their conformation to an α-helical structure in water-deficient conditions as LEA proteins do. Two conserved repeats of 19-mer motifs in CAHS proteins were capable of forming amphiphilic stripes in α-helices, suggesting a role as molecular shields in water-deficient conditions, though the charge distribution patterns in α-helices differed between CAHS and LEA proteins. Tardigrades might have evolved novel protein families with a heat-soluble property, and this study revealed a novel repertoire of major heat-soluble proteins in these anhydrobiotic animals.
BioMart Central Portal is a first-of-its-kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools makes Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org.