A database (DB) describing the relationships between species and their metabolites would be useful for metabolomics research, because metabolomics targets the systematic analysis of enormous numbers of organic compounds with known or unknown structures. We constructed an extensive species-metabolite DB for plants, the KNApSAcK Core DB, which contains 101,500 species-metabolite relationships encompassing 20,741 species and 50,048 metabolites. We also developed a search engine within the KNApSAcK Core DB for use in metabolomics research, making it possible to search for metabolites based on an accurate mass, molecular formula, metabolite name or mass spectra in several ionization modes. We have also developed databases for retrieving metabolites related to plants used for a range of purposes. In our multifaceted plant usage DB, medicinal/edible plants are related to the geographic zones (GZs) where the plants are used, their biological activities, and formulae of Japanese and Indonesian traditional medicines (Kampo and Jamu, respectively). These data are connected to the species-metabolite relationship DB within the KNApSAcK Core DB, keyed via the species names. All databases can be accessed via the website http://kanaya.naist.jp/KNApSAcK_Family/. The KNApSAcK WorldMap DB comprises 41,548 GZ-plant pair entries, including 222 GZs and 15,240 medicinal/edible plants. The KAMPO DB consists of 336 formulae encompassing 278 medicinal plants; the JAMU DB consists of 5,310 formulae encompassing 550 medicinal plants. The Biological Activity DB consists of 2,418 biological activities and 33,706 pairwise relationships between medicinal plants and their biological activities. Current statistics of the binary relationships between individual databases were characterized by degree distribution analysis, leading to a prediction of at least 1,060,000 metabolites within all plants. In the future, the study of metabolomics will need to take this huge number of metabolites into consideration.
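The accurate-mass search mentioned in this abstract can be illustrated with a minimal sketch. It is not the KNApSAcK search engine: the three-entry table `TOY_DB`, the function names, the adduct table, and the ppm tolerance are all assumptions made for illustration. The idea is to convert an observed m/z back to a neutral monoisotopic mass for the chosen ionization mode and keep every entry whose formula mass falls within the tolerance.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Mass shift of each ionization mode relative to the neutral molecule
# (proton mass = hydrogen atom minus one electron).
ADDUCT_SHIFT = {"neutral": 0.0, "[M+H]+": 1.00727645, "[M-H]-": -1.00727645}

# Hypothetical toy table (name -> molecular formula); not taken from KNApSAcK.
TOY_DB = {
    "caffeine": "C8H10N4O2",
    "glucose": "C6H12O6",
    "fructose": "C6H12O6",
}


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C8H10N4O2'."""
    parts = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return sum(MONOISOTOPIC[element] * int(count or 1) for element, count in parts)


def search_by_mass(observed_mz, mode="neutral", ppm=10.0, db=TOY_DB):
    """Return the names whose neutral mass lies within `ppm` of the query."""
    neutral = observed_mz - ADDUCT_SHIFT[mode]
    hits = []
    for name, formula in db.items():
        mass = monoisotopic_mass(formula)
        if abs(mass - neutral) / mass * 1e6 <= ppm:
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    print(search_by_mass(195.0877, mode="[M+H]+"))
    print(search_by_mass(180.0634, mode="neutral", ppm=5.0))
```

Isomers such as glucose and fructose share a formula and therefore a mass, so a mass query alone cannot separate them; that is one reason a search by name or by mass spectra is useful alongside it.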
Metabolomics, the comprehensive and global analysis of diverse metabolites produced in cells and organisms, has greatly expanded metabolite fingerprinting and profiling as well as the selection and identification of marker metabolites. The methodology typically employs multivariate analysis to statistically process the massive amount of analytical chemistry data resulting from high-throughput and simultaneous metabolite analysis. Although the technology of plant metabolomics has mainly developed with other post-genomics in systems biology and functional genomics, it is independently applied to the evaluation of the qualities of medicinal plants, based on the diversity of metabolite fingerprints resulting from multivariate analysis of non-targeted or widely targeted metabolite analysis. One advantage of applying metabolomics is that medicinal plants are evaluated based not only on the limited number of metabolites that are pharmacologically important chemicals, but also on the fingerprints of minor metabolites and bioactive chemicals. In particular, score plot and loading plot analyses, e.g., principal component analysis (PCA) and partial-least-squares discriminant analysis (PLS-DA), and discrimination map analysis, such as batch-learning self-organizing map (BL-SOM) analysis, are often employed for the reduction of a metabolite fingerprint and the classification of analyzed samples. Based on recent studies, we now understand that metabolomics can be an effective approach for comprehensive evaluation of the qualities of medicinal plants. In this review, we describe practical cases in which metabolomic study was performed on medicinal plants, and discuss the utility of metabolomics for this research field, with a focus on multivariate analysis.
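To make the score plot and loading plot idea concrete, the following is a minimal sketch of PCA restricted to the first principal component, computed by power iteration on the covariance matrix. This is a simplification chosen to keep the example dependency-free, not the procedure used in the reviewed studies; a real analysis would normally use a full decomposition from a numerical library. The function name and the toy intensity matrix are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component.

    rows: list of samples, each a list of metabolite intensities.
    Assumes the data have non-zero variance and that the start vector
    is not orthogonal to the leading eigenvector.
    """
    n, p = len(rows), len(rows[0])

    # Mean-center every metabolite (column).
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Sample covariance matrix (p x p).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]

    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Loadings: weight of each metabolite. Scores: position of each sample.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


if __name__ == "__main__":
    # Hypothetical intensities: 3 samples x 3 metabolites.
    toy = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0]]
    loadings, scores = first_principal_component(toy)
    print("loadings:", loadings)
    print("scores:", scores)
```

The scores are what a score plot displays (one point per sample), while the loadings indicate which metabolites drive the separation between samples.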
Databases (DBs) are required by various omics fields because the volume of molecular biology data is increasing rapidly. In this study, we provide instructions for users and describe the current status of our metabolite activity DB. To facilitate a comprehensive understanding of the interactions between the metabolites of organisms and the chemical-level contribution of metabolites to human health, we constructed a metabolite activity DB known as the KNApSAcK Metabolite Activity DB. It comprises 9,584 triplet relationships (metabolite-biological activity-target species), including 2,356 metabolites, 140 activity categories, 2,963 specific descriptions of biological activities and 778 target species. Approximately 46% of the activities described in the DB are related to chemical ecology, most of which are attributed to antimicrobial agents and plant growth regulators. The majority of the metabolites with antimicrobial activities are flavonoids and phenylpropanoids. The metabolites with plant growth regulatory effects include plant hormones. Over half of the DB contents are related to human health care and medicine. The five largest groups are toxins, anticancer agents, nervous system agents, cardiovascular agents and non-therapeutic agents, such as flavors and fragrances. The KNApSAcK Metabolite Activity DB is integrated within the KNApSAcK Family DBs to facilitate further systematized research in various omics fields, especially metabolomics, nutrigenomics and foodomics. The KNApSAcK Metabolite Activity DB could also be utilized for developing novel drugs and materials, as well as for identifying viable drug resources and other useful compounds.
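The triplet relationship described in this abstract (metabolite, biological activity, target species) maps naturally onto a simple record type. The sketch below is only an illustration of that data shape: the `Triplet` type, the helper functions, and every record are hypothetical placeholders, not entries or interfaces of the KNApSAcK Metabolite Activity DB.

```python
from collections import Counter, namedtuple

# One row per (metabolite, biological activity, target species) relationship.
Triplet = namedtuple("Triplet", ["metabolite", "activity", "target_species"])

# Hypothetical placeholder records; names are deliberately generic.
RECORDS = [
    Triplet("metabolite_A", "antimicrobial", "species_X"),
    Triplet("metabolite_A", "antimicrobial", "species_Y"),
    Triplet("metabolite_B", "plant growth regulator", "species_Z"),
    Triplet("metabolite_C", "antimicrobial", "species_X"),
]


def metabolites_with_activity(records, activity):
    """Distinct metabolites annotated with the given activity category."""
    return sorted({r.metabolite for r in records if r.activity == activity})


def activity_share(records):
    """Fraction of all triplets that fall in each activity category."""
    counts = Counter(r.activity for r in records)
    total = sum(counts.values())
    return {activity: count / total for activity, count in counts.items()}


if __name__ == "__main__":
    print(metabolites_with_activity(RECORDS, "antimicrobial"))
    print(activity_share(RECORDS))
```

A share computed this way over the full set of triplets is the kind of summary behind a statement such as the proportion of activities related to chemical ecology.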
Science is undergoing two rapid changes: one is the growing capacity of computers and software tools to handle data from terabytes to petabytes and beyond, and the other is the advancement in high-throughput molecular biology, which produces piles of data related to genomes, transcriptomes, proteomes, metabolomes, interactomes, and so on. Biology has become a data-intensive science and, as a consequence, biology and computer science have become complementary to each other, bridged by other branches of science such as statistics, mathematics, physics, and chemistry. The combination of versatile knowledge has caused the advent of big-data biology, network biology, and other new branches of biology. Network biology, for instance, facilitates the system-level understanding of the cell or cellular components and subprocesses. It is often also referred to as systems biology. The purpose of this field is to understand organisms or cells as a whole at various levels of functions and mechanisms. Systems biology is now facing the challenges of analyzing big molecular biological data and huge biological networks. This review gives an overview of the progress in big-data biology and data handling, and also introduces some applications of networks and multivariate analysis in systems biology.