Acute toxicity is one of the most challenging properties to predict purely with computational methods because of its direct relationship to biological interactions. Moreover, toxicity can be represented by different endpoints: it can be measured for different species, using different types of administration, and so on, and it is unclear whether knowledge transfer between endpoints is possible. We performed a comparative study of multi-task toxicity prediction for a broad chemical space using different descriptors and modeling algorithms, and applied multi-task learning to a large toxicity dataset extracted from the Registry of Toxic Effects of Chemical Substances (RTECS). We demonstrated that multi-task modeling provides a significant improvement over single-output models and other machine learning methods. Our research shows that multi-task learning can be very useful for improving the quality of acute toxicity modeling, and it raises a discussion about the use of multi-task approaches for regulatory purposes.
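A practical detail of multi-task toxicity modeling is that most compounds are measured for only a few endpoints, so the label matrix is sparse. One common way to train on such data is to compute the loss over measured values only. The sketch below illustrates that idea; the abstract does not say how missing labels were handled, so the masking scheme, the function name `masked_mse`, and the toy numbers are all assumptions made for illustration.

```python
def masked_mse(predictions, targets):
    """Mean squared error over measured (non-None) endpoint values only.

    Each row is one compound, each column one toxicity endpoint.
    A target of None means the endpoint was not measured for that compound.
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(predictions, targets):
        for pred, target in zip(pred_row, target_row):
            if target is None:  # skip unmeasured endpoints
                continue
            total += (pred - target) ** 2
            count += 1
    if count == 0:
        raise ValueError("no measured endpoint values")
    return total / count


# Hypothetical toy data: 3 compounds x 2 endpoints.
targets = [[2.0, None], [None, 3.0], [1.0, 1.5]]
predictions = [[2.5, 9.9], [0.0, 3.0], [1.0, 2.5]]
loss = masked_mse(predictions, targets)
```

Predictions for unmeasured endpoints (9.9 and 0.0 here) do not contribute to the loss, so a shared model can learn from every compound regardless of which endpoints were recorded for it.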
In this work, we present graph-convolutional neural networks for the prediction of binding constants of protein-ligand complexes. We derived the model using multi-task learning, where the target variables are the dissociation constant (Kd), inhibition constant (Ki), and half maximal inhibitory concentration (IC50). Rigorously trained on the PDBbind dataset, the model achieves a Pearson correlation coefficient of 0.87 and an RMSE of 1.05 in pK units, outperforming the recently developed 3D convolutional neural network model KDEEP.
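The figures above are reported in pK units, that is, the negative base-10 logarithm of the constant expressed in mol/L. The following minimal sketch shows that conversion together with the two reported metrics. The helper names (`to_pk`, `rmse`, `pearson`) are hypothetical and this is not the authors' evaluation code.

```python
import math


def to_pk(value_molar):
    """Convert Kd, Ki, or IC50 in mol/L to pK = -log10(value)."""
    return -math.log10(value_molar)


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# A 1 nM binder corresponds to pK = 9.
pk_nanomolar = to_pk(1e-9)
```

On this scale an RMSE of about 1 pK unit corresponds to roughly a tenfold error in the underlying constant.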
Despite the increasing volume of available data, the proportion of experimentally measured data remains small compared to the virtual chemical space of possible chemical structures. Therefore, there is a strong interest in simultaneously predicting different ADMET and biological properties of molecules, which are frequently strongly correlated with one another. Such joint data analyses can increase the accuracy of models by exploiting their common representation and identifying common features between individual properties. In this work, we review recent developments in multi‐learning approaches and cover the freely available tools and packages that can be used to perform such studies.
We present a novel approach for increasing the reliability of compound identification in LC-MS and MALDI imaging lipidomics. Our approach is based on the characterization of compounds not only by elution time, accurate mass, and fragmentation spectra but also by the number of labile hydrogens, which can be measured using the hydrogen/deuterium (H/D) exchange approach. The number of labile hydrogens (those from −OH and −NH groups) serves as an additional structural descriptor when performing a database search. For the LC-MS experiments, the H/D exchange was performed in the heating capillary of a modified electrospray ionization (ESI) source, while for MALDI imaging, the exchange was performed in the ion funnel at 10 Torr pressure. This approach achieved a considerable degree of deuteration, enough to unambiguously distinguish between different classes of lipids. The proposed analytical approach may also be used to identify peptides and metabolites, not only lipids. Special software for the automatic filtering of molecules based on the number of functional groups was also developed.
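The filtering idea can be sketched as follows: each H-to-D exchange shifts the mass by the difference between the deuterium and protium atomic masses (about 1.00628 Da), so the observed shift gives an estimate of the number of labile hydrogens, which is then used to prune database candidates. This is a simplified illustration and not the software mentioned above. It assumes complete exchange and ignores exchange on the charge carrier, and the function names and candidate entries are hypothetical.

```python
# Mass difference between deuterium (2.014101778 Da) and protium (1.007825032 Da).
DEUTERIUM_SHIFT = 2.014101778 - 1.007825032


def count_exchanges(mz_before, mz_after, charge=1):
    """Estimate the number of exchanged hydrogens from the m/z shift.

    Assumes complete exchange and does not correct for the charge carrier.
    """
    return round((mz_after - mz_before) * charge / DEUTERIUM_SHIFT)


def filter_candidates(candidates, n_exchanged):
    """Keep candidates whose labile-hydrogen count matches the observation.

    candidates: iterable of (name, number_of_labile_hydrogens) pairs.
    """
    return [name for name, n_labile in candidates if n_labile == n_exchanged]


# Hypothetical database hits sharing the same accurate mass.
candidates = [("candidate_A", 1), ("candidate_B", 3), ("candidate_C", 3)]
n_exchanged = count_exchanges(500.0, 503.0188)
matches = filter_candidates(candidates, n_exchanged)
```

In practice the exchange may be incomplete, so a tolerance on the count, or matching against the highest observed deuteration state, would likely be needed.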
We developed a Transformer-based artificial neural approach to translate between SMILES and IUPAC chemical notations: Struct2IUPAC and IUPAC2Struct. The overall performance of our model is comparable to that of rule-based solutions. We showed that the accuracy and speed of computation, as well as the robustness of the model, allow it to be used in production. Our showcase demonstrates that a neural-based solution can facilitate rapid development while keeping the required level of accuracy. We believe that our findings will inspire other developers to reduce development costs by replacing complex rule-based solutions with neural-based ones.