Many risk-sensitive applications require Machine Learning (ML) models to be interpretable. Attempts to obtain interpretable models typically rely on tuning, by trial and error, hyper-parameters of model complexity that are only loosely related to interpretability. We show that it is instead possible to take a meta-learning approach: an ML model of non-trivial Proxies of Human Interpretability (PHIs) can be learned from human feedback, and this model can then be incorporated within an ML training process to directly optimize for interpretability. We show this for evolutionary symbolic regression. We first design and distribute a survey aimed at finding a link between features of mathematical formulas and two established PHIs, simulatability and decomposability. Next, we use the resulting dataset to learn an ML model of interpretability. Lastly, we query this model to estimate the interpretability of evolving solutions within bi-objective genetic programming. We perform experiments on five synthetic and eight real-world symbolic regression problems, comparing to the traditional use of solution size minimization. The results show that the use of our model leads to formulas that are, for the same level of accuracy-interpretability trade-off, either significantly more accurate or equally accurate. Moreover, the formulas are also arguably more interpretable. Given the very positive results, we believe that our approach represents an important stepping stone for the design of next-generation interpretable (evolutionary) ML algorithms.
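To make the bi-objective idea concrete, the following is a minimal sketch of how a learned interpretability estimate could sit next to prediction error when comparing candidate formulas. It is not the authors' implementation: the `Candidate` fields, the linear form of the estimator, and the values in `WEIGHTS` and `BIAS` are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    formula: str
    error: float          # prediction error, lower is better
    size: int             # hypothetical feature: number of nodes
    n_ops: int            # hypothetical feature: number of operations
    n_nonarith_ops: int   # hypothetical feature: e.g. sin, exp, log


# Hypothetical coefficients; in the approach described above they would be
# learned from survey responses rather than set by hand.
WEIGHTS = {"size": -0.5, "n_ops": -0.2, "n_nonarith_ops": -1.0}
BIAS = 10.0


def estimated_interpretability(c: Candidate) -> float:
    """Linear stand-in for a learned interpretability model (higher is better)."""
    return (BIAS
            + WEIGHTS["size"] * c.size
            + WEIGHTS["n_ops"] * c.n_ops
            + WEIGHTS["n_nonarith_ops"] * c.n_nonarith_ops)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    ia, ib = estimated_interpretability(a), estimated_interpretability(b)
    no_worse = a.error <= b.error and ia >= ib
    strictly_better = a.error < b.error or ia > ib
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]


A = Candidate("x0 + x1", error=0.30, size=3, n_ops=1, n_nonarith_ops=0)
B = Candidate("sin(x0) * x1 + x0", error=0.10, size=6, n_ops=3, n_nonarith_ops=1)
C = Candidate("sin(sin(x0)) + x1 * x1", error=0.30, size=7, n_ops=4, n_nonarith_ops=2)
front = pareto_front([A, B, C])
```

In this toy example `C` is dropped because `A` matches its error with a higher interpretability estimate, while `A` and `B` both remain as different trade-offs. Replacing `estimated_interpretability` with plain size minimization would give the traditional baseline that the experiments compare against.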
Figure 1: Schematic view of the proposed approach, ML-PIE. In the implementation proposed in this paper, the user provides feedback on models that are being discovered by an evolutionary algorithm. This feedback is used to train an estimator which, in turn, shapes one of the objective functions used by the evolution. Ultimately, this steers the evolution towards discovering models that are interpretable according to the specific user. To minimize the amount of feedback needed, ML-PIE keeps track of which models cause the estimator to be most uncertain, and submits these models for user assessment.
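The last sentence of the caption describes choosing which models to show the user based on estimator uncertainty. A minimal sketch of one way to do this follows, assuming uncertainty is measured as the disagreement (variance) among a small ensemble of estimators. The ensemble, the feature tuples, and the function name `select_query` are hypothetical; the caption does not say how ML-PIE itself quantifies uncertainty.

```python
from statistics import pvariance
from typing import Callable, Sequence

Features = tuple[float, ...]
Estimator = Callable[[Features], float]


def select_query(candidates: Sequence[Features],
                 ensemble: Sequence[Estimator]) -> Features:
    """Return the candidate on which the ensemble members disagree the most."""
    def uncertainty(features: Features) -> float:
        predictions = [member(features) for member in ensemble]
        return pvariance(predictions)  # population variance as disagreement
    return max(candidates, key=uncertainty)


# Hypothetical ensemble: three hand-written linear scorers over (size, n_ops).
ensemble = [
    lambda f: -0.5 * f[0],
    lambda f: -0.5 * f[0] - 0.3 * f[1],
    lambda f: -0.4 * f[0] - 0.1 * f[1],
]
candidates = [(3.0, 1.0), (7.0, 4.0)]
query = select_query(candidates, ensemble)
```

Here the larger model `(7.0, 4.0)` is selected because the three scorers spread further apart on it. In a full loop, the user's answer for the selected model would be added to the training data and the estimators refit before the next query.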