Genetic programming (GP) is used to extract from rat oral bioavailability (OB) measurements simple, interpretable and predictive QSAR models which both generalise to rats and to marketed drugs in humans. Receiver Operating Characteristics (ROC) curves for the binary classifier produced by machine learning show no statistical difference between rats (albeit without known clearance differences) and man. Thus evolutionary computing offers the prospect of in silico ADME screening, e.g. for "virtual" chemicals, for pharmaceutical drug discovery.
Genetic programming (GP) offers a generic method of automatically fusing together classifiers using their receiver operating characteristics (ROC) to yield superior ensembles. We combine decision trees (C4.5) and artificial neural networks (ANN) on a difficult pharmaceutical data mining (KDD) drug discovery application. Specifically predicting inhibition of a P450 enzyme. Training data came from high throughput screening (HTS) runs. The evolved model may be used to predict behaviour of virtual (i.e. yet to be manufactured) chemicals. Measures to reduce over fitting are also described.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.