“…But the adoption of these models in safety-critical or high-security applications is inhibited by two major concerns: their brittleness to adversarial attack methods that can make imperceptible modifications to inputs and trigger wrong decisions (Szegedy et al., 2013; Papernot et al., 2016a), and their lack of interpretability (Gunning, 2017). Significant progress has been made toward both adversarial robustness (Papernot et al., 2016b; Madry et al., 2017; Engstrom et al., 2018) and explainability (Li & Yu, 2015; Yi et al., 2016; Sundararajan et al., 2017), and a few recent theoretical studies (Kilbertus et al., 2018; Chalasani et al., 2018) indicate a strong connection between these two issues.…”