Background
Accurate prediction of in vivo toxicity from in vitro testing is a challenging problem. Large public–private consortia have been formed with the goal of improving chemical safety assessment by means of high-throughput screening.
Objective
A wealth of available biological data requires new computational approaches to link chemical structure, in vitro data, and potential adverse health effects.
Methods and results
A database containing experimental in vitro cytotoxicity values (half-maximal inhibitory concentration, IC50) and in vivo rodent median lethal doses (LD50) for more than 300 chemicals was compiled by Zentralstelle zur Erfassung und Bewertung von Ersatz- und Ergaenzungsmethoden zum Tierversuch (ZEBET; National Center for Documentation and Evaluation of Alternative Methods to Animal Experiments). The application of conventional quantitative structure–activity relationship (QSAR) modeling approaches to predict mouse or rat acute LD50 values from chemical descriptors of ZEBET compounds yielded no statistically significant models. Analysis of these data also showed no significant correlation between IC50 and LD50 across the full set. However, a linear IC50 versus LD50 correlation could be established for a fraction of the compounds. To capitalize on this observation, we developed a novel two-step modeling approach. As a preliminary, all chemicals were partitioned into two groups based on the relationship between their IC50 and LD50 values: one group comprises compounds with linear IC50 versus LD50 relationships, and the other comprises the remaining compounds. In the first step, we built conventional binary classification QSAR models to predict group affiliation from chemical descriptors only. In the second step, we developed k-nearest neighbor continuous QSAR models for each group to predict LD50 values from chemical descriptors. All models were extensively validated using special protocols.
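The partitioning of compounds by their IC50 versus LD50 relationship can be illustrated with a minimal sketch. The abstract does not state the criterion used to decide which compounds follow the linear relationship, so everything specific here is an assumption: the values are taken to be log-transformed, a single ordinary least-squares line is fitted to all compounds, and a compound counts as "linear" when its residual is within a hypothetical threshold `max_residual`. The function names are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def partition_by_linear_fit(log_ic50, log_ld50, max_residual=0.5):
    """Split compound indices into (linear_group, remaining_group).

    Assumption (not from the source): a compound belongs to the linear
    group when its LD50 lies within max_residual log units of the line
    fitted to all compounds.
    """
    slope, intercept = fit_line(log_ic50, log_ld50)
    linear_group, remaining_group = [], []
    for i, (x, y) in enumerate(zip(log_ic50, log_ld50)):
        residual = abs(y - (slope * x + intercept))
        if residual <= max_residual:
            linear_group.append(i)
        else:
            remaining_group.append(i)
    return linear_group, remaining_group
```

In this sketch the group labels are derived from both in vitro and in vivo values, which is why they are available for training compounds only; a new compound has to be assigned to a group from its chemical descriptors.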
Conclusions
The novelty of this modeling approach is that it uses the relationships between in vivo and in vitro data only to inform the initial construction of the hierarchical two-step QSAR models. Models resulting from this approach employ chemical descriptors only for external prediction of acute rodent toxicity.
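A rough sketch of how such a hierarchical prediction could work for a new compound, using descriptors only, is given here. It is a simplification under stated assumptions, not the authors' implementation: a k-nearest neighbor majority vote stands in for the binary classification models, distances are plain Euclidean distances over unscaled descriptor vectors, and the names `nearest` and `predict_ld50` are hypothetical.

```python
import math
from collections import Counter


def nearest(train_x, query, k):
    """Indices of the k training descriptor vectors closest to query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return order[:k]


def predict_ld50(train_x, train_group, train_ld50, query, k=3):
    """Two-step prediction that needs only the query's descriptors.

    Step 1: a majority vote among the k nearest training compounds
            assigns the query to a group (stand-in classifier).
    Step 2: the LD50 is estimated as the mean over the k nearest
            training compounds within that group.
    """
    votes = Counter(train_group[i] for i in nearest(train_x, query, k))
    group = votes.most_common(1)[0][0]

    # Restrict the regression step to training compounds of that group.
    members = [i for i, g in enumerate(train_group) if g == group]
    sub_x = [train_x[i] for i in members]
    local = nearest(sub_x, query, min(k, len(members)))
    neighbors = [members[j] for j in local]

    estimate = sum(train_ld50[i] for i in neighbors) / len(neighbors)
    return group, estimate
```

The in vitro IC50 values never appear in `predict_ld50`; in this sketch they would influence the result only through the group labels attached to the training compounds.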