Protein-protein interactions (PPIs) play a vital role in the biological processes underlying cell function and disease pathways. Experimental methods for identifying PPIs require tremendous effort, and their results are often compromised by a large number of false positives. Herein, we demonstrate the use of a new Genetic Programming (GP) based Symbolic Regression (SR) approach for predicting PPIs related to a disease. In a case study, a dataset of 135 PPI complexes related to cancer was used to construct a generic PPI-predicting model with good prediction accuracy and generalization ability. A high correlation coefficient (CC) of 0.893 and low root mean square error (RMSE) and mean absolute percentage error (MAPE) values of 478.221 and 0.239, respectively, were achieved for both the training and test set outputs. To validate the discriminatory nature of the model, it was applied to a dataset of diabetes complexes, where it yielded significantly low CC values. Thus, the GP model developed here serves a dual purpose: (a) a predictor of the binding energy of cancer-related PPI complexes, and (b) a classifier for discriminating PPI complexes related to cancer from those of other diseases.
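The three reported metrics (CC, RMSE, and MAPE) can be sketched as follows. This is a minimal illustration, assuming CC is the Pearson correlation coefficient and MAPE is expressed as a fraction rather than a percentage; the abstract does not state either convention. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values (assumed form of CC)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rmse(actual, predicted):
    """Root mean square error, in the same units as the target."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Made-up illustrative values, NOT data from the study.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]

cc_value = correlation_coefficient(actual, predicted)
rmse_value = rmse(actual, predicted)
mape_value = mape(actual, predicted)
print(cc_value, rmse_value, mape_value)
```

Because RMSE carries the units of the predicted quantity while CC and MAPE are dimensionless, an RMSE such as the one reported is meaningful only relative to the scale of the binding-energy values.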
Summary
The °API value is an important physicochemical characteristic of crude oils, often used in determining their properties and quality. Existing models, predominantly linear ones, predict the °API magnitude from the molecular composition of a crude oil. This approach is tedious and time-consuming because it requires quantitative determination of numerous crude-oil components. Usually, the hydrocarbons present in a crude oil are grouped according to their average molecular structures into saturates, aromatics, resins, and asphaltenes (SARA) fractions. An °API-value prediction model dependent on these four fractions is relatively easier to develop, although this approach has rarely been used. A rigorous scrutiny suggests that some of the dependencies between the individual SARA fractions and the corresponding °API value could be nonlinear. Accordingly, in this study, SARA-fraction-based nonlinear models have been developed for the prediction of °API values using three computational-intelligence (CI) formalisms: genetic programming (GP), artificial neural networks (ANNs), and support-vector regression (SVR). The SARA analyses and °API values of 403 crude-oil samples covering wide ranges have been used in developing these models. A comparison of the CI-based models with an existing linear model indicates that all of the CI-based models possess significantly better °API-value prediction and generalization performance than the linear model. Also, the SVR-based model has been found to be the most accurate °API-value predictor. Because of their better prediction accuracy, CI-based models can be gainfully used to predict °API values of crude oils.
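For orientation, the °API value itself is conventionally defined from a liquid's specific gravity at 60 °F relative to water. That definition is not stated in the abstract, so the sketch below is supplied here as background under that assumption, and the function names are hypothetical. It does not represent any of the SARA-based models described above.

```python
def api_gravity(specific_gravity: float) -> float:
    """Conventional API gravity from specific gravity at 60 °F (assumed definition)."""
    if specific_gravity <= 0:
        raise ValueError("specific gravity must be positive")
    return 141.5 / specific_gravity - 131.5


def specific_gravity_from_api(api: float) -> float:
    """Inverse relation: specific gravity implied by a given °API value."""
    return 141.5 / (api + 131.5)


# Water (specific gravity 1.0) sits at 10 °API; lighter oils score higher.
water_api = api_gravity(1.0)
print(water_api)
```

Under this definition, a lower specific gravity gives a higher °API value, which is why the quantity is commonly used to separate light crudes from heavy ones.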