Support vector machines (SVMs) were used to develop QSAR models that correlate molecular structures with their toxicities and bioactivities. The performance and predictive ability of SVM were investigated and compared with those of other methods such as multiple linear regression (MLR) and radial basis function neural networks (RBFNNs). Two different data sets were evaluated. The first involves the development of a QSAR model for predicting the toxicities of 153 phenols; the second concerns a QSAR model relating the structures of a set of 85 cyclooxygenase 2 (COX-2) inhibitors to their activities. For each application, the molecular structures were described using either physicochemical parameters or molecular descriptors. In both cases, the predictive ability of the SVM model was comparable or superior to that of MLR and RBFNN. The results indicate that SVM can be used as a powerful alternative modeling tool for QSAR studies.
The support vector machine (SVM), a novel type of learning machine, was used for the first time to develop a QSPR model that relates the structures of 35 amino acids to their isoelectric points. Molecular descriptors calculated from the structure alone were used to represent the molecular structures. Seven descriptors were selected using GA-PLS, a hybrid approach that combines a genetic algorithm (GA) as the optimization method with partial least squares (PLS) as a robust statistical method for variable selection. These descriptors were used as inputs to RBFNNs and SVM to predict the isoelectric point of an amino acid. The optimal QSPR model was based on support vector machines and gave a root-mean-square error of 0.2383 and a prediction correlation coefficient R = 0.9702 for the whole data set. These satisfactory results indicate that the GA-PLS approach is a very effective method for variable selection and that the support vector machine is a very promising tool for nonlinear approximation.
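The two figures of merit quoted in this abstract, the root-mean-square error and the correlation coefficient R between observed and predicted values, can be computed as in the minimal sketch below. It assumes R denotes the Pearson correlation coefficient, which the abstract does not state explicitly. The function names and the sample numbers are made up for illustration and are not taken from the paper.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Made-up illustrative values (NOT data from the study).
observed = [3.0, 5.5, 6.0, 7.5, 10.0]
predicted = [3.2, 5.3, 6.1, 7.9, 9.7]

print("RMSE:", rmse(observed, predicted))
print("R:   ", pearson_r(observed, predicted))
```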
A least-squares support vector machine (LSSVM) was used for the first time as a novel machine-learning technique to predict the solubility of C60 in a large number of diverse solvents, using as inputs molecular descriptors calculated from the molecular structure alone with the software CODESSA. The heuristic method of CODESSA was used to select the correlated descriptors and build the linear model. Both the linear and the nonlinear models gave very satisfactory predictions: for the whole data set, the squared correlation coefficient R² was 0.892 and 0.903, and the root-mean-square error was 0.126 and 0.116, respectively. The predictions of the LSSVM model are better than those obtained by the heuristic method and the reference, which shows that LSSVM is a useful tool for predicting the solubility of C60. In addition, this paper provides a new and effective method for predicting the solubility of C60 from molecular structure and gives some insight into the structural features related to the solubility of C60 in different solvents.
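For readers unfamiliar with LSSVM regression, the sketch below shows the standard formulation, in which training reduces to solving one linear system for the bias b and the coefficients alpha rather than a quadratic program. It is a pure-Python illustration under stated assumptions: an RBF kernel, arbitrary values for the hyperparameters gamma and sigma, and a made-up one-descriptor toy data set. It is not the authors' implementation, and none of the names or numbers come from the paper.

```python
import math

def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two descriptor vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k = rbf_kernel(X[i], X[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    solution = solve_linear(A, [0.0] + list(y))
    return solution[0], solution[1:]  # bias b, coefficients alpha

def lssvm_predict(X_train, b, alphas, x, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return b + sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alphas, X_train))

# Made-up toy data: one descriptor per sample (NOT data from the study).
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [0.0, 0.8, 0.9, 0.1, -0.8]
b, alphas = lssvm_fit(X_train, y_train, gamma=100.0, sigma=1.0)
print(lssvm_predict(X_train, b, alphas, [2.5], sigma=1.0))
```

In practice gamma (regularization) and sigma (kernel width) would be tuned, for example by cross-validation; the values used here are placeholders.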
Surfaces obtained by modifying poly(N,N'-dimethylaminoethyl methacrylate) (PDMAEMA) on rough silicon substrates are highly hydrophilic at low pH and highly hydrophobic at high pH; such surfaces effectively supplement the research on the wettability of solid surfaces based on the pH-responsive polymers.
The support vector machine (SVM) classification algorithm, recently developed by the machine learning community, was used to diagnose breast cancer. The SVM was also compared with several machine learning techniques currently used in this field. The classification task involves predicting the disease state from data obtained from the UCI machine learning repository. Overall, SVM outperformed k-means clustering and two artificial neural networks. The comparison across the machine learning techniques suggests that nine samples may be mislabeled.
The least-squares support vector machine (LSSVM), a novel machine learning algorithm, was used for the first time to develop quantitative and classification models as a potential screening mechanism for a novel series of 1,4-dihydropyridine calcium channel antagonists. Each compound was represented by calculated structural descriptors that encode constitutional, topological, geometrical, electrostatic, and quantum-chemical features. The heuristic method was then used to search the descriptor space and select the descriptors responsible for activity. Quantitative modeling resulted in a nonlinear, seven-descriptor LSSVM model with a mean-square error of 0.2593, a predicted correlation coefficient (R²) of 0.8696, and a cross-validated correlation coefficient (R²cv) of 0.8167. The best classification results were obtained with LSSVM: the percentage of correct predictions based on leave-one-out cross-validation was 91.1%. This paper provides a new and effective method for drug design and screening.
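The leave-one-out percentage reported in this abstract is obtained by holding out each compound in turn, training on the rest, and counting how often the held-out compound is classified correctly. The sketch below illustrates only that procedure. To stay short it uses a nearest-centroid classifier as a stand-in (this is NOT LSSVM), and the function names and toy data are hypothetical and unrelated to the paper's compounds.

```python
def leave_one_out_accuracy(X, y, fit, predict):
    """Percentage of samples predicted correctly when each is held out once."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)  # retrain without sample i
        if predict(model, X[i]) == y[i]:
            correct += 1
    return 100.0 * correct / len(X)

# Stand-in classifier (assumption for illustration, not LSSVM):
# assign each sample to the class whose mean descriptor vector is closest.
def fit_centroids(X, y):
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for k, value in enumerate(xi):
            acc[k] += value
        counts[yi] = counts.get(yi, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(centroids, x):
    def sq_dist(label):
        return sum((a - c) ** 2 for a, c in zip(x, centroids[label]))
    return min(centroids, key=sq_dist)

# Made-up two-descriptor toy data with two classes (NOT data from the study).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [1.2, 1.0]]
y = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(X, y, fit_centroids, predict_centroid)
print(f"Leave-one-out accuracy: {accuracy:.1f}%")
```

Because `fit` and `predict` are passed in as functions, any other classifier could be substituted without changing the cross-validation loop.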