The modern world generates highly connected, heterogeneous data as a result of information sharing through computer and communication technology. These data form complex relations that require drilling down and mining to uncover their actual meaning. Graph clustering is now widely used as a data analysis technique in modern computational methods. In this paper we implement a generalized graph clustering algorithm, DPClusO, as a software tool with a simple operating procedure and clear visualization techniques. DPClusO is an enhanced version of the DPClus algorithm that takes the overlapping property of clusters into consideration along with density and periphery tracking. Users can select different parameters and visualization attributes to render the cluster set, a single cluster, the hierarchical graph, etc., and save these data in image and text formats. This paper discusses the step-by-step operation of the proposed software tool using an example network of metabolites collected from the KNApSAcK database. The tool successfully generated cohesive groups of structurally similar metabolites. It can be used to analyze network data from any field of study.
A number of studies have investigated the relations between the structures and activities of metabolites. It has been proposed that structural similarity between metabolites implies similarity in their activities. On this basis we propose a method for predicting the activities of secondary metabolites based on the principle of association. First, we determined structural similarity scores between targeted metabolite pairs using the COMPLIG algorithm. To increase the likelihood of obtaining clusters rich in known metabolites, we calculated structural similarity only for metabolite pairs in which the activity of at least one metabolite is known, and then selected the pairs whose similarity score exceeds a threshold (s > 0.95). The network formed by these metabolite pairs was then clustered using the DPClusO algorithm. Statistically significant cluster-activity pairs were selected using the hypergeometric test. Finally, the biological activities of unannotated metabolites were predicted from the activities of the metabolites in the statistically overrepresented clusters.
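The thresholding and hypergeometric-test steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the choice of annotated metabolites as the background population, and the toy data are all hypothetical, and the clustering step itself (DPClusO) is not reproduced here; a cluster is simply taken as given.

```python
from math import comb


def filter_pairs(scored_pairs, threshold=0.95):
    """Keep metabolite pairs whose similarity score is strictly above the threshold."""
    return [(a, b) for a, b, score in scored_pairs if score > threshold]


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K successes, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def cluster_activity_pvalue(cluster, activity, annotations):
    """Over-representation p-value of one activity in one cluster.

    Assumption (not stated in the source): the background population is the
    set of annotated metabolites, and only annotated cluster members count
    as draws. `annotations` maps metabolite -> set of known activities.
    """
    N = len(annotations)
    K = sum(1 for acts in annotations.values() if activity in acts)
    members = [m for m in cluster if m in annotations]
    n = len(members)
    k = sum(1 for m in members if activity in annotations[m])
    return hypergeom_sf(k, N, K, n)


# Hypothetical toy data: 10 annotated metabolites, 3 of them antibacterial.
annotations = {f"m{i}": {"antibacterial"} if i <= 3 else {"other"}
               for i in range(1, 11)}
# A cluster containing the 3 antibacterial metabolites and one unannotated one.
cluster = {"m1", "m2", "m3", "u1"}

p = cluster_activity_pvalue(cluster, "antibacterial", annotations)
print(p)  # 1/120, since all 3 annotated members share the activity
```

With a small p-value such as this one, the unannotated member `u1` would be a candidate for the predicted activity "antibacterial". A real analysis would also need a multiple-testing correction across all cluster-activity pairs, which the source does not describe.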