2022
DOI: 10.1021/acs.jcim.2c01290

ALipSol: An Attention-Driven Mixture-of-Experts Model for Lipophilicity and Solubility Prediction

Abstract: Lipophilicity (logD) and aqueous solubility (logSw) play a central role in drug development. The accurate prediction of these properties remains to be solved due to data scarcity. Current methodologies neglect the intrinsic relationships between physicochemical properties and usually ignore the ionization effects. Here, we propose an attention-driven mixture-of-experts (MoE) model named ALipSol, which explicitly reproduces the hierarchy of task relationships. We adopt the principle of divide-and-conquer by br…

Citations: cited by 7 publications (13 citation statements)
References: 59 publications (110 reference statements)
“…For example, sometimes classical ML techniques as simple as decision trees or random forest can be quite effective at estimating kinetic parameters [28] or ranking reaction conformations [29], and we do not doubt that there are some molecular problems and/or datasets where KRR or other architectures may have some advantages over GNN-based approaches. Our argument instead is that comparing model performance to relevant baselines, and to experimental data when available [30][31][32][33][34][35][36][37][38][39][40][41][42], should never be 'beyond the scope' of work for a model-developer. Rather, it is an essential first step at convincing readers that the model is useful.…”
Section: Providing Context For Model Performance
Citation type: mentioning
confidence: 99%
“…Most tools that extract information innate to protein sequences either utilize the structural hierarchy framework or work without a hierarchical framework. Examples include predicting disease-associated mutations [6][7][8][9][10][11][12], predicting solubility [13][14][15], detecting aggregating regions [16][17][18][19][20], predicting intrinsic disorder [18][21][22][23][24][25], designing protein sequences with specific features [26][27][28], and comparing sequences [29,30]. Some of these tools use a fixed-width moving window to define local sequence context, which artificially places the considered residue at the center of its "local sequence" and ignores any natural boundaries present within proteins.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This entails identifying promising compounds from a large pool of molecules and receiving ADMET feedback before actual synthesis. To address these issues, the development of quantitative structure–property relationship (QSPR) models, using computer technology to predict ADMET properties, has emerged as a cost-effective and efficient alternative. …”
Section: Introduction
Citation type: mentioning
confidence: 99%