Small data materials design with machine learning: When the average model knows best

Vanpoucke, Danny E. P.; Knippenberg, Onno S. J. van; Hermans, Ko; Bernaerts, Katrien V.; Mehrkanoon, Siamak

doi:10.1063/5.0012285

Cited by 19 publications

(19 citation statements)

References 62 publications

(95 reference statements)


“…The optimized structures of the 21 selected drugs lascufloxacin (1), pretomanid (2), relebactam (3), triclabendazole (4), esaxerenone (5), voxelotor (6), cenobamate (7), lasmiditan (8), mirogabalin (9), remimazolam (10), solriamfetol (11), ubrogepant (12), benvitimod (13), trifarotene (14), upadacitinib (15), sotagliflozin (16), alpelisib (17), erdafitinib (18), zanubrutinib (19), relugolix (20), and elexacaftor (21) are given in Figure 2, and the SMILES notations and chemical formulas are given in Table 1.…”

Section: Results and Discussion (mentioning)

confidence: 99%

“…The significant contribution of computational and theoretical studies of quantum chemistry has allowed medicinal chemists to obtain more precise molecular properties and bioactivity of drugs in a shorter time. Due to the evolution of computing data storage and higher processor performance, molecular modeling has been very efficient to solve drug-related issues without compromising the accuracy of the predicted data. Investigation of the mechanism of action of drugs on therapeutic targets can be carried out using structure–activity relationship (SAR) and quantitative structure–activity relationship (QSAR) models. The structures of drugs and their activities or properties are studied using molecular modeling methods and statistical methods.…”

Section: Introduction (mentioning)

confidence: 99%


Theoretical Studies on the Molecular Properties, Toxicity, and Biological Efficacy of 21 New Chemical Entities

Srivastava

2021

ACS Omega


New chemical entities (NCEs) such as small molecules and antibody–drug conjugates have strong binding affinity for biological targets, which provide deep insights into structure-specific interactions for the design of future drugs. As structures of drugs increase in complexity, the importance of computational predictions comes into sharp focus. Knowledge of various computational tools enables us to predict the molecular properties, toxicity, and biological efficacy of the drugs and help the medicinal chemists to discover new drugs more efficiently. Newly approved drugs have higher affinities for proteins and nucleic acids and are applied for the treatment of human diseases. We have carried out the computational studies of 21 such NCEs, specifically small molecules and antibody–drug conjugates, and studied the biological efficacy of these drugs. Their bioactivity score and molecular and pharmacokinetic properties were evaluated using online computer software programs, viz., Molinspiration and Osiris Property Explorer. The SwissTargetPrediction tool was used for the efficient prediction of protein targets for the NCEs. The results indicated higher stability for the drug complexes due to a larger HOMO–LUMO gap. A high electrophilicity index reflects good electrophilic behavior and high reactivity of the drugs. Lipinski’s ‘‘rule of five’’ indicated that most of the drug complexes are likely to be orally active. These drugs also showed non-mutagenic, non-tumorigenic, non-irritant, and non-effective reproductive behavior. We hope that these studies will provide an insight into molecular recognition and definitely help the medicinal chemists to design new drugs in future.


“…Vanpoucke et al. has recently highlighted the powers of machine learning for materials design, suggesting a novel “ensemble-average” model for use with smaller data sets specifically …”

Section: High Content Data Collection and Analysis: Seeing More Than ... (mentioning)

confidence: 99%

“…Vanpoucke et al has recently highlighted the powers of machine learning for materials design, suggesting a novel "ensemble-average" model for use with smaller data sets specifically. 525 In the case of linear regression, two important assumptions are made: (1) the outcome is a continuous variable and (2) that it is normally distributed. However, in reality, this is not always the case.…”

Section: Regression Analysis (mentioning)

confidence: 99%
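The citation statement above names an "ensemble-average" model for small data sets but does not say how it is built. The following is a minimal sketch of one plausible reading, not the cited authors' implementation: fit the same low-complexity polynomial on many random subsamples of a small data set, then average the fitted coefficients. The function name `ensemble_average_polyfit` and the parameters `n_models` and `train_fraction` are hypothetical, and the toy data are synthetic.

```python
import numpy as np


def ensemble_average_polyfit(x, y, degree=1, n_models=200,
                             train_fraction=0.8, seed=0):
    """Fit one polynomial per random subsample and average the coefficients.

    Illustrative assumption only: the source page does not specify how the
    "ensemble-average" model is constructed.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Each subsample needs at least degree + 1 points for a determined fit,
    # and cannot be larger than the data set itself.
    n_train = max(degree + 1, int(round(train_fraction * len(x))))
    n_train = min(n_train, len(x))
    coefs = np.empty((n_models, degree + 1))
    for i in range(n_models):
        # Draw a random training subset without replacement.
        idx = rng.choice(len(x), size=n_train, replace=False)
        coefs[i] = np.polyfit(x[idx], y[idx], degree)
    # Mean coefficients define the averaged model; the standard deviation
    # shows how strongly single fits depend on the chosen subsample.
    return coefs.mean(axis=0), coefs.std(axis=0)


# Synthetic toy data: 10 noisy points on a straight line.
rng = np.random.default_rng(1)
x_demo = np.linspace(0.0, 1.0, 10)
y_demo = 2.0 * x_demo + 1.0 + rng.normal(scale=0.1, size=x_demo.size)
mean_coef, std_coef = ensemble_average_polyfit(x_demo, y_demo, degree=1)
y_pred = np.polyval(mean_coef, x_demo)  # predictions of the averaged model
```

For models that are linear in their parameters, such as polynomials, averaging the coefficients gives the same predictions as averaging the outputs of the individual fits, so the ensemble collapses into a single model of the same complexity.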

High-Throughput Routes to Biomaterials Discovery

2021


Many existing clinical treatments are limited in their ability to completely restore decreased or lost tissue and organ function, an unenviable situation only further exacerbated by a globally aging population. As a result, the demand for new medical interventions has increased substantially over the past 20 years, with the burgeoning fields of gene therapy, tissue engineering, and regenerative medicine showing promise to offer solutions for full repair or replacement of damaged or aging tissues. Success in these fields, however, inherently relies on biomaterials that are engendered with the ability to provide the necessary biological cues mimicking native extracellular matrixes that support cell fate. Accelerating the development of such “directive” biomaterials requires a shift in current design practices toward those that enable rapid synthesis and characterization of polymeric materials and the coupling of these processes with techniques that enable similarly rapid quantification and optimization of the interactions between these new material systems and target cells and tissues. This manuscript reviews recent advances in combinatorial and high-throughput (HT) technologies applied to polymeric biomaterial synthesis, fabrication, and chemical, physical, and biological screening with targeted end-point applications in the fields of gene therapy, tissue engineering, and regenerative medicine. Limitations of, and future opportunities for, the further application of these research tools and methodologies are also discussed.


“…The datasets generated during human-driven experiments vary between 5 and 20 data points resulting from a limited set of experimental conditions decided by the researcher or based on trials and errors. As a general rule, small datasets must be treated using low complexity models to avoid overfitting and are often well handled using polynomial fitting techniques [29]. Still, experimentally-generated small datasets might carry a high correlation and complexity level, requiring analysis using ML methods.…”

Section: Handling Small Dataset With Machine Learning (mentioning)

confidence: 99%

Accelerating the Design of Photocatalytic Surfaces for Antimicrobial Application: Machine Learning Based on a Sparse Dataset

et al. 2021


Nowadays, most experiments to synthesize and test photocatalytic antimicrobial materials are based on trial and error. More often than not, the mechanism of action of the antimicrobial activity is unknown for a large spectrum of microorganisms. Here, we propose a scheme to speed up the design and optimization of photocatalytic antimicrobial surfaces tailored to give a balanced production of reactive oxygen species (ROS) upon illumination. Using an experiment-to-machine-learning scheme applied to a limited experimental dataset, we built a model that can predict the photocatalytic activity of materials for antimicrobial applications over a wide range of material compositions. This machine-learning-assisted strategy offers the opportunity to reduce the cost, labor, time, and precursors consumed during experiments that are based on trial and error. Our strategy may significantly accelerate the large-scale deployment of photocatalysts as a promising route to mitigate fomite transmission of pathogens (bacteria, viruses, fungi) in hospital settings and public places.

