Pharmacokinetic
(PK) parameters such as clearance (CL) and volume
of distribution (Vd) have been the subject of previous in
silico predictive models. However, explicit knowledge of the
concentration–time profile provides additional value, such as the time above
the minimum inhibitory concentration (MIC) or the area under the curve (AUC), for understanding both
efficacy- and safety-related aspects of a compound. In this work, we developed
machine learning models that predict plasma concentration–time profiles
after both intravenous (i.v.) and oral (p.o.) dosing
across 17 in-house projects. As explanatory variables, we used MACCS
keys chemical descriptors as well as in silico and
experimental in vitro PK parameters. The
predictive accuracy of random forest (RF), a message passing neural
network, 2-compartment models using estimated CL and steady-state Vd (Vdss), and an
average model (as a control experiment) was evaluated using 5-fold
cross-validation (5-fold CV) and leave-one-project-out validation
(LOPO-V). In 5-fold CV, RF gave the best predictive accuracy among the models studied for both i.v. and p.o. plasma concentration–time profiles,
with RMSEs of 0.245, 0.474, and 0.462 at 0.08, 1, and 8 h after i.v. dosing
and of 0.500, 0.612, and 0.509 at 0.25, 1, and 8 h after p.o. dosing,
respectively. Furthermore, by investigating
the importance of the in vitro PK parameters using
the Gini index, we observed that general prior knowledge in ADME
research was well reflected in the feature importances: predicted human Vd (hVd) was important for
the initial distribution, mouse intrinsic CL and the unbound fraction
in mouse plasma for the elimination process, and Caco-2 permeability
for the absorption process. Notably, this is the first model that
can predict twin peaks in the concentration–time profile substantially
better than a baseline compartment model. Because of its combination
of sufficient accuracy and prediction speed, we found the model
to be fit for purpose in practical lead optimization.
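
As an illustration of the evaluation scheme described above, the following minimal sketch applies leave-one-project-out validation to an average-model control and reports one RMSE per sampling time point. It is not the implementation used in this work: the function names, the record layout, and the example values are hypothetical, and the choice of log10-transformed concentrations is an assumption made only for the example.

```python
from math import sqrt
from statistics import mean


def average_profile(profiles):
    """Point-wise mean over a list of equal-length concentration profiles."""
    return [mean(values) for values in zip(*profiles)]


def rmse_per_time_point(observed_profiles, predicted_profile):
    """RMSE at each time point between observed profiles and one prediction."""
    rmses = []
    for t, predicted in enumerate(predicted_profile):
        squared_errors = [(obs[t] - predicted) ** 2 for obs in observed_profiles]
        rmses.append(sqrt(mean(squared_errors)))
    return rmses


def lopo_average_model(records):
    """Leave-one-project-out validation of the average-model control.

    `records` is a list of (project_id, profile) pairs (hypothetical layout).
    For each project, the mean profile of all OTHER projects is used as the
    prediction for every compound in the held-out project.
    """
    projects = sorted({project for project, _ in records})
    results = {}
    for held_out in projects:
        train = [profile for project, profile in records if project != held_out]
        test = [profile for project, profile in records if project == held_out]
        baseline = average_profile(train)
        results[held_out] = rmse_per_time_point(test, baseline)
    return results


# Synthetic example data (NOT from the study): assumed log10 concentrations
# at three sampling times for compounds from two made-up projects.
records = [
    ("A", [3.0, 2.0, 1.0]),
    ("A", [3.2, 2.2, 1.2]),
    ("B", [2.0, 1.0, 0.0]),
    ("B", [2.4, 1.4, 0.4]),
]

results = lopo_average_model(records)
for project, rmses in results.items():
    print(project, [round(value, 3) for value in rmses])
```

Replacing `average_profile` with a trained model's predictions for the held-out compounds would give the corresponding LOPO-V estimate for that model; grouping the split by project rather than by compound is what distinguishes LOPO-V from ordinary 5-fold CV.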