2021
DOI: 10.3390/molecules26226959
Don’t Overweight Weights: Evaluation of Weighting Strategies for Multi-Task Bioactivity Classification Models

Abstract: Machine learning models predicting the bioactivity of chemical compounds are nowadays among the standard tools of cheminformaticians and computational medicinal chemists. Multi-task and federated learning are promising machine learning approaches that allow privacy-preserving usage of large amounts of data from diverse sources, which is crucial for achieving good generalization and high-performance results. Using large, real-world data sets from six pharmaceutical companies, here we investigate different strat…

Cited by 7 publications (4 citation statements)
References 18 publications
“…GlaxoSmithKline released raw clinical data on new drugs in 2012, which was welcomed as the first step toward information disclosure. The movement to follow this lead is spreading worldwide, and there has been a large range of open biomedical datasets available for training new machine learning algorithms developed by governments, medical societies, and international research collaborations. The possibility of learning from a large amount of data containing similar structures is increasing. By using this method to build a prediction model using in-company data, we expect to develop a prediction model with a prediction accuracy higher than that of in vitro tests.…”
Section: Discussion
confidence: 99%
“…Task weighting is a strategy that addresses this by assigning weights to the task loss values as they are summed, effectively suppressing antagonistic noise and allowing more synergistic training signals to be backpropagated. 20,21 Task grouping is another "informed" multitask learning strategy which involves clustering a given set of tasks into explicit subsets of synergistic tasks that are used to train separate multitask neural networks in a constrained manner free from antagonistic signals. 22−25 This work aims to investigate how toxicology domain knowledge can be used to handcraft task groupings and introduce "rule hints" that better guide the training of multitask neural networks.…”
Section: Bi)
confidence: 99%
“…These effects may be most noticeable when a given set of tasks is naïvely combined into one multitask neural network in which the loss values of antagonistic tasks are jointly backpropagated with other tasks. Task weighting is a strategy that addresses this by assigning weights to the task loss values as they are summed, effectively suppressing antagonistic noise and allowing more synergistic training signals to be backpropagated. Task grouping is another “informed” multitask learning strategy which involves clustering a given set of tasks into explicit subsets of synergistic tasks that are used to train separate multitask neural networks in a constrained manner free from antagonistic signals.…”

Section: Introduction
confidence: 99%
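The task-weighting idea quoted above — summing per-task losses with per-task weights so that antagonistic tasks contribute less gradient — can be sketched in a few lines. This is a minimal illustration, not the cited works' implementation; the loss values and weights are made up for the example.

```python
def weighted_multitask_loss(task_losses, task_weights):
    """Combine per-task loss values into one scalar via a weighted sum.

    Down-weighting an antagonistic task shrinks its contribution to the
    combined loss, and hence its share of the backpropagated gradient.
    """
    if len(task_losses) != len(task_weights):
        raise ValueError("one weight per task is required")
    return sum(w * l for l, w in zip(task_losses, task_weights))

# Three tasks; the second is treated as antagonistic (weight 0.2),
# so its loss of 1.5 only contributes 0.3 to the total.
losses = [0.8, 1.5, 0.4]
weights = [1.0, 0.2, 1.0]
total = weighted_multitask_loss(losses, weights)  # ≈ 1.5
```

In a neural-network setting the same weighted sum would be applied to the per-task loss tensors before calling the backward pass, leaving the rest of the training loop unchanged.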
“…To avoid any dominance of assays with a higher number of derived tasks during training, MELLODDY-TUNER produces a task weighting file for further machine learning which assigns 1/N weights to all tasks, with N being the number of derived tasks from that assay. 39…”
Section: Training Dataset
confidence: 99%
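The 1/N scheme described above — giving each task a weight of 1/N, where N is the number of tasks derived from the same assay, so that assays with many derived tasks do not dominate training — can be sketched as follows. The assay and task names, and the dict-based representation, are illustrative assumptions, not MELLODDY-TUNER's actual file format.

```python
from collections import Counter

def task_weights(task_to_assay):
    """Assign each task the weight 1/N, where N is the number of tasks
    derived from that task's assay."""
    counts = Counter(task_to_assay.values())
    return {task: 1.0 / counts[assay]
            for task, assay in task_to_assay.items()}

# Hypothetical example: two tasks derived from assayA (e.g. two activity
# thresholds), one task from assayB.
tasks = {
    "assayA_thresh1": "assayA",
    "assayA_thresh2": "assayA",
    "assayB_thresh1": "assayB",
}
weights = task_weights(tasks)
# Each assayA task gets 0.5; the single assayB task gets 1.0, so every
# assay contributes a total weight of 1.0 regardless of how many tasks
# were derived from it.
```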