2017
DOI: 10.1007/s11030-017-9729-8

Practical application of the Average Information Content Maximization (AIC-MAX) algorithm: selection of the most important structural features for serotonin receptor ligands

Abstract: The Average Information Content Maximization algorithm (AIC-MAX) based on mutual information maximization was recently introduced to select the most discriminatory features. Here, this methodology was applied to select the most significant bits from the Klekota-Roth fingerprint for serotonin receptor ligands as well as to select the most important features for distinguishing ligands with activity for one receptor versus another. The interpretation of selected bits and machine-learning experiments performed us…
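The abstract describes selecting fingerprint bits by mutual information with the activity class. The sketch below is not the AIC-MAX algorithm itself, which is not specified in the text above; it only illustrates the underlying idea by scoring each bit independently by its mutual information with a binary label. The function names (`mutual_information`, `rank_bits`) and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(bits, labels):
    """Mutual information (in bits) between one binary feature and a class label."""
    n = len(bits)
    joint = Counter(zip(bits, labels))
    bit_counts = Counter(bits)
    label_counts = Counter(labels)
    mi = 0.0
    for (b, y), c in joint.items():
        # p(b, y) * log2( p(b, y) / (p(b) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (bit_counts[b] * label_counts[y]))
    return mi


def rank_bits(fingerprints, labels):
    """Return bit indices sorted from most to least informative (hypothetical helper)."""
    n_bits = len(fingerprints[0])
    scores = [
        mutual_information([fp[j] for fp in fingerprints], labels)
        for j in range(n_bits)
    ]
    return sorted(range(n_bits), key=lambda j: scores[j], reverse=True)


# Toy data (assumed): 4 molecules, 3 fingerprint bits, label 1 = active.
fingerprints = [
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
]
labels = [1, 1, 0, 0]

ranking = rank_bits(fingerprints, labels)
```

In this toy set, bit 0 matches the label exactly and bit 1 is independent of it, so the ranking puts bit 0 first and bit 1 last. Scoring bits one at a time ignores redundancy between bits, which is one reason a dedicated selection method may behave differently.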

Citations: cited by 3 publications (3 citation statements)
References: 32 publications
“…Since the time complexity of the simplest logistic regression model is linear with respect to data dimension, the use of Pharmacoprint (39 973 bits) can be around 39 times slower than typical structural fingerprints, such as ECFP4 or Ext (1024 bits). To reduce the time processing of Pharmacoprint, one can consider the application of reduction algorithms, which allows one to not only save CPU time but also improve the results. For this purpose, 100-dimensional representations generated by PCA, AE, and sAE were considered (although the number of 100 bits has been taken as an exemplary value, similar MCC values are obtained for the other cutoffs; see Figures S8–S10). An additional case which considers the number of principal components that explain 90% of the variance was also studied.…”
Section: Results and Discussion | Citation type: mentioning
confidence: 99%
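The "around 39 times slower" figure in the statement above follows from the stated assumption that cost grows linearly with the number of features. A minimal sketch of that arithmetic, together with one way to pick the number of principal components reaching a 90% explained-variance cutoff, is given below. The function names and the toy eigenvalues are assumptions for illustration and are not taken from the cited work.

```python
from itertools import accumulate


def relative_cost(n_features, baseline_features):
    """Cost ratio under the assumption that cost is linear in the feature count."""
    return n_features / baseline_features


def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches `threshold` (hypothetical helper)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


PHARMACOPRINT_BITS = 39_973  # from the citation statement
TYPICAL_BITS = 1_024         # ECFP4 or Ext, from the citation statement
REDUCED_DIMS = 100           # exemplary cutoff from the citation statement

slowdown = relative_cost(PHARMACOPRINT_BITS, TYPICAL_BITS)
saving = relative_cost(PHARMACOPRINT_BITS, REDUCED_DIMS)

# Toy eigenvalue spectrum (assumed), only to exercise the cutoff rule.
k90 = components_for_variance([6.0, 3.5, 0.3, 0.2])
```

With these numbers the ratio of 39 973 to 1024 features is about 39, and reducing to 100 dimensions shrinks the feature count by a factor of roughly 400. The cost of fitting the reduction itself is not included in this estimate.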
“…Notably, each bit of fingerprint was considered as a single feature in our study; thus, the optimal feature set comprises hybrid fingerprints and descriptors after the feature reduction process. By removing irrelevant bits from the original intact fingerprint, a hybrid fingerprint can achieve increased prediction accuracy as well as reduced computational cost (Williams, 2006; Nisius and Bajorath, 2009, 2010; Singla et al, 2013; Smieja and Warszycki, 2016; Warszycki et al, 2017).…”
Section: Algorithms and Methods | Citation type: mentioning
confidence: 99%
“…Notably, each bit of fingerprint was considered as a single feature in our study, thus the optimal feature set comprises of hybrid fingerprints and descriptors after feature reduction process. By removing irrelevant bits from original intact fingerprint, hybrid fingerprint can achieve increased prediction accuracy, as well as reduced computational cost [51][52][53][54][55][56].…”
Section: [Figure1 Insert Here] | Citation type: mentioning
confidence: 99%