MassBank is the first public repository of mass spectra of small chemical compounds (<3000 Da) for the life sciences. As of January 2010, the database contains 605 electron-ionization mass spectrometry (EI-MS), 137 fast atom bombardment MS, and 9276 electrospray ionization (ESI)-MS(n) data of 2337 authentic metabolite compounds; 11,545 EI-MS and 834 other-MS data of 10,286 volatile natural and synthetic compounds; and 3045 ESI-MS(2) data of 679 synthetic drugs, contributed by 16 research groups. The ESI-MS(2) data were analyzed under nonstandardized, independent experimental conditions. MassBank is a distributed database: each research group provides data from its own MassBank data server on the Internet. Users can access either all of the MassBank data or a subset selected by specifying one or more experimental conditions. In a spectral search, which retrieves mass spectra similar to a query mass spectrum, the similarity score is calculated by a weighted cosine correlation in which the weighting exponents on peak intensity and mass-to-charge ratio are optimized to the ESI-MS(2) data. MassBank also provides a merged spectrum for each compound, prepared by merging the ESI-MS(2) data analyzed for the same compound under different collision-induced dissociation conditions. Data merging has significantly improved the precision of compound identification, by 21-23% at a similarity score of 0.6. Thus, MassBank is useful both for the identification of chemical compounds and for the publication of experimental data.
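The weighted cosine correlation mentioned above can be illustrated with a minimal sketch. It is not MassBank's implementation: the function name, the default exponents (0.5 on intensity, 2 on m/z), the m/z tolerance, and the greedy peak matching are all assumptions chosen for illustration, since the abstract does not state the optimized values.

```python
import math

def weighted_cosine(query, reference, intensity_exp=0.5, mz_exp=2.0, tol=0.3):
    """Weighted cosine similarity between two centroided spectra.

    Each spectrum is a list of (mz, intensity) pairs. The exponents and the
    m/z tolerance are illustrative assumptions, not values from the abstract.
    """
    def weight(mz, intensity):
        # Peak weight = intensity^a * mz^b
        return (intensity ** intensity_exp) * (mz ** mz_exp)

    q = [(mz, weight(mz, i)) for mz, i in query]
    r = [(mz, weight(mz, i)) for mz, i in reference]

    used = set()  # reference peaks already paired with a query peak
    dot = 0.0
    for mz_q, w_q in q:
        best = None
        for j, (mz_r, w_r) in enumerate(r):
            if j in used:
                continue
            delta = abs(mz_q - mz_r)
            # Greedy one-to-one matching: closest unused peak within tolerance
            if delta <= tol and (best is None or delta < best[0]):
                best = (delta, j, w_r)
        if best is not None:
            used.add(best[1])
            dot += w_q * best[2]

    norm_q = math.sqrt(sum(w * w for _, w in q))
    norm_r = math.sqrt(sum(w * w for _, w in r))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)

if __name__ == "__main__":
    spectrum = [(100.0, 50.0), (150.0, 100.0)]
    print(weighted_cosine(spectrum, spectrum))
```

Raising m/z to a positive power gives more influence to high-mass fragment peaks, which tend to be more compound-specific, while an intensity exponent below 1 dampens the dominance of the base peak.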
Artificial intelligence (AI) tools are increasingly being applied in drug discovery. Whilst some protagonists point to vast opportunities potentially offered by such tools, others remain skeptical, waiting for a clear impact to be shown in drug discovery projects. The truth is probably somewhere between these extremes, but it is clear that AI is providing new challenges not only for the scientists involved but also for the biopharma industry and its established processes for discovering and developing new medicines. This article presents the views of a diverse group of international experts on the 'grand challenges' for small-molecule drug discovery with AI and approaches to address them.
A reaction prediction system called SOPHIA (System for organic reaction prediction by heuristic approach) has been developed to predict possible products and the product ratio from arbitrary reactants under arbitrary reaction conditions. As a first step in developing SOPHIA, we used the reaction knowledge base of the organic synthesis design system AIPHOS, which was derived from a reaction database, as a general knowledge base for reaction prediction. It became possible to automatically perceive a reaction site and to predict possible reaction paths without the user's designation of a specific reaction type or category. This paper describes the philosophy of SOPHIA and the current level of development together with an overview and first results.
The GAPLS (GA-based PLS) program has been developed for variable selection in QSAR studies. A modified genetic algorithm (GA) was employed to obtain a partial least squares (PLS) model with high internal predictivity using a small number of variables. To demonstrate the performance of GAPLS for variable selection, the program was applied to the inhibitory activity of calcium channel antagonists. As a result, the variables contributing most to the inhibitory activity could be selected, and the structural requirements for the inhibitory activity could be estimated effectively.
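The general idea of GA-driven variable selection can be sketched as follows. This is a generic genetic algorithm over bit masks, not the modified GA of GAPLS, whose details the abstract does not give. The function `ga_select`, its parameters, and the toy fitness function are all hypothetical; in a real QSAR setting the fitness would be a measure of internal predictivity (for example, a cross-validated statistic from a PLS model fitted on the selected variables), which is assumed here rather than implemented.

```python
import random

def ga_select(n_vars, fitness, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Search for a boolean mask over n_vars variables that maximizes fitness.

    Requires n_vars >= 2 and pop_size >= 4. `fitness` takes a list of bools
    and returns a number (higher is better).
    """
    rng = random.Random(seed)
    # Random initial population of variable masks
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # keep the fitter half as parents
        children = [ranked[0][:]]           # elitism: carry over the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best

def toy_fitness(mask, informative=(1, 4), penalty=0.1):
    """Stand-in for model predictivity: reward informative variables and
    penalize the total number of selected variables."""
    return sum(mask[i] for i in informative) - penalty * sum(mask)

if __name__ == "__main__":
    mask = ga_select(8, toy_fitness)
    print([i for i, keep in enumerate(mask) if keep])
```

The size penalty in the toy fitness mirrors the stated goal of a predictive model built from a small number of variables; replacing `toy_fitness` with a model-based score is what would turn this sketch into an actual variable-selection run.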