Metabolic flux analysis requires both a reliable metabolic model and metabolic profiles to characterize metabolic reprogramming. Advances in analytical methodologies enable the production of high-quality metabolomics datasets that capture isotopic flux. However, useful metabolic models can be difficult to derive because relatively complete atom-resolved metabolic networks are lacking for a variety of organisms, including human. Here, we developed a graph coloring method that creates a unique identifier for each atom in a compound, facilitating construction of an atom-resolved metabolic network. This method is also guaranteed to generate the same identifier for symmetric atoms, enabling automatic identification of possible additional mappings caused by molecular symmetry. Furthermore, a compound coloring identifier derived from the corresponding atom coloring identifiers can be used for compound harmonization across various metabolic network databases, which is an essential first step in network integration. With the compound coloring identifiers, 8865 correspondences between KEGG and MetaCyc compounds were detected, 5451 of which were confirmed by other identifiers provided by the two databases. In addition, we found that the Enzyme Commission (EC) numbers of reactions can be used to validate possible correspondence pairs, with 1848 unconfirmed pairs validated by commonality in reaction EC numbers.
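To illustrate the general idea of atom coloring, the following is a minimal sketch of iterative neighborhood refinement on a molecular graph. It is not the method developed in this work: the function name `color_atoms`, the input layout (a list of element symbols and a list of bonds with bond orders), and the stopping rule are assumptions made for illustration only. In this simplified scheme, symmetric atoms always receive the same color, but non-symmetric atoms may occasionally share a color as well, so it should be read as a starting point rather than a complete identifier scheme.

```python
from collections import defaultdict


def color_atoms(elements, bonds):
    """Return one integer color per atom via iterative neighborhood refinement.

    elements: list of element symbols, indexed by atom (hypothetical layout).
    bonds: list of (atom_i, atom_j, bond_order) tuples (hypothetical layout).
    """
    neighbors = defaultdict(list)
    for i, j, order in bonds:
        neighbors[i].append((j, order))
        neighbors[j].append((i, order))

    # Round 0: atoms are distinguished by element only.
    ranks = {e: r for r, e in enumerate(sorted(set(elements)))}
    colors = [ranks[e] for e in elements]

    while True:
        # Signature = own color plus the sorted multiset of
        # (bond order, neighbor color) pairs.
        signatures = [
            (colors[i],
             tuple(sorted((order, colors[j]) for j, order in neighbors[i])))
            for i in range(len(elements))
        ]
        # Rank the distinct signatures so colors do not depend on atom numbering.
        ranks = {s: r for r, s in enumerate(sorted(set(signatures)))}
        refined = [ranks[s] for s in signatures]
        # Stop once a round no longer splits any color class.
        if len(set(refined)) == len(set(colors)):
            return refined
        colors = refined


# Heavy atoms of isopropanol: CH3-CH(OH)-CH3 (atoms 0-2 are carbon, atom 3 is oxygen).
isopropanol_colors = color_atoms(
    ["C", "C", "C", "O"],
    [(0, 1, 1), (1, 2, 1), (1, 3, 1)],
)
print(isopropanol_colors)  # [0, 1, 0, 2]
```

In this example the two terminal carbons, which are interchangeable by molecular symmetry, receive the same color, while the central carbon and the oxygen each receive their own.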
Moreover, we were able to detect various issues and errors with compound representation in the KEGG and MetaCyc databases using compound coloring identifiers, demonstrating the usefulness of this methodology for database curation.

Keywords: metabolomics; atom-resolved metabolic network; atom identifier; compound identifier; database harmonization; graph theory; common subgraph isomorphism

1. Introduction
Metabolic flux analysis is an essential approach to assess metabolic phenotypes [1-2] that requires both reliable metabolic profiles and metabolic models [3][4][5]. Advances in analytical technologies such as mass spectrometry (MS) and nuclear magnetic resonance (NMR) have greatly contributed to the detection of thousands of metabolites from biofluids, cells, and tissues [6].
Application of these analytical techniques to stable isotope resolved metabolomics (SIRM) experiments facilitates production of high-quality metabolomics datasets capturing isotopic flux through cellular and systemic metabolism [7][8]. The challenge now is to construct meaningful metabolic models from the corresponding metabolic profiles for downstream metabolic flux analysis. A metabolic network is usually represented by compounds connected via biotransformation routes [9]. Information at the atom level is not represented in such metabolic networks, making it impractical to derive appropriate metabolic models for SIRM datasets. However, there are currently no relatively complete atom-resolved databases of metabolic networks available for human metabolis...