2013
DOI: 10.1002/minf.201200110

Graph‐Based Consensus Clustering for Combining Multiple Clusterings of Chemical Structures

Abstract: Consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics. In this paper, consensus clustering is used for combining the clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster. Two graph-based consensus clustering methods were examined. The Quality Partition Index method (QPI) was used to evaluate t…
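
To make the graph-based idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual formulation of the cluster-based similarity partitioning algorithm (CSPA), which one of the citing papers below names: the similarity between two molecules is the fraction of input clusterings that place them in the same cluster, and the resulting similarity graph is then partitioned. The partitioning step here is a deliberately simplified stand-in (connected components after thresholding) rather than a real graph partitioner, and the function names and toy labelings are hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Pairwise similarity: fraction of clusterings in which two items share a cluster."""
    n = len(labelings[0])
    if any(len(lab) != n for lab in labelings):
        raise ValueError("all clusterings must label the same items")
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        agree = sum(1 for lab in labelings if lab[i] == lab[j])
        sim[i][j] = sim[j][i] = agree / len(labelings)
    return sim


def consensus_by_threshold(sim, threshold=0.5):
    """Simplified stand-in for graph partitioning (an assumption of this sketch):
    keep edges with similarity above the threshold and return connected components."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and sim[u][v] > threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Three hypothetical clusterings of six molecules (cluster ids are arbitrary per clustering).
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
sim = co_association(labelings)
consensus = consensus_by_threshold(sim)
print(consensus)
```

In this toy case the individual clusterings disagree about the third and fourth molecules, and the consensus step resolves the disagreement by majority co-membership. A real CSPA run would replace the thresholding step with a balanced graph partitioner.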

Citations: cited by 9 publications (7 citation statements)
References: 35 publications
“…The fingerprints technique was applied using Discovery Studio software to identify the molecular structures (2D) of the similar 30 antiviral compounds in a binary format against PRD_002214. The examined descriptors were HBA and HBD [106], charge [107], hybridization [108], positive and negative ionizable groups [109], halogens, aromatics, or none of the above, and the ALogP category [110] of atoms. The study ( SA: The bits number that was computed in the antiviral compounds and PRD_002214.…”
Section: Structural Fingerprint Study (mentioning)
confidence: 99%
“…This database consists of 102,516 molecules. The MDDR subset dataset was chosen from the MDDR database which has been used for many virtual screening [33][34][35] and consensus clustering [17,19] experiments. The MDDR dataset contains eleven activity classes (8294 molecules), involving both homogeneous and heterogeneous active molecules.…”
Section: Datasets (mentioning)
confidence: 99%
“…[16] However, based on the implemented methods, it was not the case if the clustering is restricted to a single consensus method. In addition, Saeed et al [17] examined the use of graph-based consensus clustering methods, cluster-based similarity partitioning algorithm (CSPA) and hypergraph partitioning algorithm (HGPA), [18] for clustering of MDDR dataset and concluded that they can improve the effectiveness of individual clusterings and provide robust and stable clustering. Moreover, Saeed et al [19] used cumulative voting-based aggregation algorithm (CVAA) to combine multiple clusterings of chemical structures and found that it can significantly improve the quality of clustering.…”
Section: Introduction (mentioning)
confidence: 99%
“…All the molecules in both databases were converted to Pipeline Pilot ECFC_4 (extended connectivity fingerprints and folded to size 1024 bits) [25]; MDDR and MUV data sets have been used recently by our research group in this research area [26-29]. Mathworks Matlab R2012b (UTM license) was used for coding our proposed algorithms; all calculations were run on 2.80 GHz Intel(R) Xeon(R) processors.…”
Section: Experimental Design (mentioning)
confidence: 99%
“…The searches were carried out using the most popular chemoinformatics databases, the MDL Drug Data Report (MDDR) [22], maximum unbiased validation (MUV) [23] and Directory of Useful Decoys (DUD) [24]. All the molecules in both databases were converted to Pipeline Pilot ECFC_4 (extended connectivity fingerprints and folded to size 1024 bits) [25]; MDDR and MUV data sets have been used recently by our research group in this research area [26-29]. Mathworks Matlab R2012b (UTM license) was used for coding our proposed algorithms; all calculations were run on 2.80 GHz Intel(R) Xeon(R) processors.…”
Section: Experimental Design (mentioning)
confidence: 99%