Summary
Computational prediction of toxicity has reached new heights as a result of decades of growth in the magnitude and diversity of biological data. Public packages for statistics and machine learning make model creation faster. New theory in machine learning and cheminformatics enables integration of chemical structure, toxicogenomics, simulated and physical data in the prediction of chemical health hazards, and other toxicological information. Our earlier publications have characterized a toxicological dataset of unprecedented scale resulting from the European REACH legislation (Registration Evaluation Authorisation and Restriction of Chemicals). These publications dove into potential use cases for regulatory data and some models for exploiting this data. This article analyzes the options for the identification and categorization of chemicals, moves on to the derivation of descriptive features for chemicals, discusses different kinds of targets modeled in computational toxicology, and ends with a high-level perspective of the algorithms used to create computational toxicology models.