Background: Predicting drug side effects is an important topic in drug discovery. Although several machine learning methods have been proposed to predict side effects, there is still room for improvement. Firstly, side effect prediction is a multi-label learning task, so multi-label learning techniques can be applied to it. Secondly, drug-related features are associated with side effects, and feature dimensions have specific biological meanings. Identifying critical dimensions and discarding irrelevant ones may help to reveal the causes of side effects. Methods: In this paper, we propose a novel method, the ‘feature selection-based multi-label k-nearest neighbor method’ (FS-MLKNN), which simultaneously determines critical feature dimensions and constructs high-accuracy multi-label prediction models. Results: Computational experiments demonstrate that FS-MLKNN achieves good performance as well as explainable results. To further improve performance, we develop an ensemble learning model by integrating individual feature-based FS-MLKNN models. When compared with other state-of-the-art methods, the ensemble method performs better on benchmark datasets. Conclusions: FS-MLKNN and the ensemble method are promising tools for side effect prediction. The source code and datasets are available in Additional file 1. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0774-y) contains supplementary material, which is available to authorized users.
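The idea of predicting several side effects at once from a selected subset of feature dimensions can be illustrated with a minimal sketch. It is not the FS-MLKNN implementation from the paper: it replaces the method's scoring with a plain neighbour vote, takes the feature mask as given instead of searching for it, and the function name `knn_label_scores`, its parameters, and the Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def knn_label_scores(X_train, Y_train, X_test, mask, k=3):
    """Score every label for every test drug with a masked k-NN vote.

    Hypothetical helper, not the paper's algorithm.
    X_train: (n_train, n_features) drug feature matrix
    Y_train: (n_train, n_labels) binary side effect matrix
    X_test:  (n_test, n_features) drugs to score
    mask:    (n_features,) booleans, True for dimensions that are kept
    """
    mask = np.asarray(mask, dtype=bool)
    A = np.asarray(X_train, dtype=float)[:, mask]
    B = np.asarray(X_test, dtype=float)[:, mask]
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((B[:, None, :] - A[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training drugs for each test drug.
    nn = np.argsort(d2, axis=1, kind="stable")[:, :k]
    # Fraction of those neighbours that carry each label.
    return np.asarray(Y_train, dtype=float)[nn].mean(axis=1)
```

In this sketch the mask is the only place where feature selection enters; a search procedure would evaluate candidate masks by the prediction quality they yield and keep the best one, which is what makes the retained dimensions interpretable.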
Long non-coding RNAs (lncRNAs) play an important role in many biological processes and have attracted widespread attention. Although the precise functions and mechanisms of most lncRNAs are still unknown, it is known that lncRNAs usually perform their functions by interacting with the corresponding RNA-binding proteins. For example, lncRNA-protein interactions play an important role in post-transcriptional gene regulation, such as splicing, translation, and signaling, and in the progression of complex diseases. However, experimental verification of lncRNA-protein interactions is time-consuming and laborious. In this work, we propose a computational method, named IRWNRLPI, to find potential associations between lncRNAs and proteins. IRWNRLPI integrates two algorithms, random walk and neighborhood regularized logistic matrix factorization, and the combination performs better than either algorithm alone. Moreover, the method is semi-supervised and does not require negative samples. Based on leave-one-out cross validation, we obtain an AUC of 0.9150 and an AUPR of 0.7138, demonstrating its reliable performance. In addition, in a case study on Mus musculus, many lncRNA-protein interactions predicted by our method are confirmed by experiments. This suggests that IRWNRLPI will be a useful bioinformatics resource in biomedical research.
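The random walk component mentioned above can be sketched generically as a random walk with restart that spreads known interactions over a lncRNA similarity network. This is only an illustration under stated assumptions: the abstract does not give the IRWNRLPI update rule, so the function name `random_walk_with_restart`, the restart probability, the column normalisation, and the stopping rule are all hypothetical, and the matrix factorization step is not shown.

```python
import numpy as np


def random_walk_with_restart(S, Y, restart=0.5, tol=1e-8, max_iter=1000):
    """Propagate known interactions over a similarity network.

    Hypothetical helper, not the IRWNRLPI update rule.
    S: (n_lnc, n_lnc) non-negative lncRNA similarity matrix
    Y: (n_lnc, n_prot) known interactions (1 = known, 0 = unknown)
    Returns a (n_lnc, n_prot) matrix of propagated scores.
    """
    S = np.asarray(S, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Column-normalise S so each non-empty column sums to 1.
    col_sums = S.sum(axis=0, keepdims=True)
    W = S / np.where(col_sums == 0, 1.0, col_sums)
    P = Y.copy()
    for _ in range(max_iter):
        # Walk one step, then jump back to the known interactions.
        P_next = (1 - restart) * W @ P + restart * Y
        if np.abs(P_next - P).max() < tol:
            return P_next
        P = P_next
    return P
```

Because unknown pairs are only treated as starting with zero score and never as confirmed negatives, a propagation step of this kind fits the abstract's remark that no negative samples are required.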
A heavy dust fall occurred in Beijing on April 16-17, 2006. The dust storm formed in middle and western Inner Mongolia, China, under a strong Mongolian Cyclone. During the dust fall, the near-surface wind speed in Beijing was negligible. The minimal wind speed at ground level indicates that the dust must have been transported by upper-level northwesterly winds and that local dust contributed little to the dust fall. This lack of a local contribution distinguishes the event from previous dust falls in Beijing, so it provides good information on the dust source. Dust samples collected during this event and Pleistocene loess samples from the Beijing area were analyzed for magnetic susceptibility, bulk particle and quartz grain size distributions, bulk particle and quartz micro-textures, mineralogy, carbonate content, major element concentrations, trace element and rare earth element (REE) concentrations, and the oxygen isotope composition of quartz. The results indicate the following. (1)