2019
DOI: 10.1038/s41467-019-08987-4

Unmasking Clever Hans predictors and assessing what machines really learn

Abstract: Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distin…

Cited by 808 publications (677 citation statements)
References 111 publications
“…Recently, the increasing popularity of explainable AI methods (see e.g. [141,142,143,144]) has allowed us to gain insight into the inner workings of deep learning algorithms. In this manner, it has become possible to extract how a problem is solved by the deep model.…”
Section: Explainable AI (mentioning)
confidence: 99%
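As a concrete illustration of the kind of explanation technique this excerpt refers to, the sketch below computes a vanilla gradient saliency map for a single prediction. The torchvision model, its pretrained weights, and the random stand-in input are assumptions made for the example, not details taken from the cited works; this is one simple method among the many the excerpt's references cover.

```python
# Minimal sketch of a vanilla gradient saliency map, assuming a
# pretrained torchvision classifier and a random stand-in image.
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1")
model.eval()

x = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in image
logits = model(x)
target = logits.argmax(dim=1).item()

# Gradient of the winning class score with respect to the input pixels.
logits[0, target].backward()

# Collapse the channel dimension into a single 224x224 relevance map.
saliency = x.grad.abs().max(dim=1).values.squeeze(0)
print(saliency.shape)  # torch.Size([224, 224])
```

High-relevance pixels in such a map indicate which parts of the input most influenced the class score, which is how these methods "extract how a problem is solved by the deep model."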
“…Therefore, it is important that users can understand when the system will fail. As detecting errors is a claimed utility of instance-level explanations [36,50], we suggest that future work should evaluate this empirically in more detail. Our study design did not allow us to draw conclusions in this regard because we did not fully counterbalance the order of tasks and True Negatives (TN) were not part of the task set.…”
Section: The Utility of Saliency Maps Exists but It Is Limited (mentioning)
confidence: 99%
“…In contrast, CNNs look for patterns in a sub-symbolic fashion that lead to an outcome [7,39]. Because CNNs do not process data in a semantic fashion, other patterns in an image (which may not belong to the concept) can contribute towards a classification outcome in unexpected ways [36]. An implication for the design is that we need to develop explanation algorithms that bridge the gap between humans and machines by leading the user to understand that the system is not basing its classification decision on higher-level semantics of the image.…”
Section: Reasoning on Examples (mentioning)
confidence: 99%
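One way to operationalize the concern in this excerpt, and the "Clever Hans" effect the indexed paper describes, is to measure how much of a model's evidence lies outside the labelled concept. The hypothetical check below does this with a relevance map and a binary object mask; the function name, the stand-in arrays, and the 0.5 cutoff are all illustrative assumptions, not a procedure from the cited paper.

```python
# Hypothetical "Clever Hans" check: given a relevance map and a binary
# object mask, estimate how much evidence falls outside the concept.
import numpy as np

def relevance_inside_object(saliency: np.ndarray, mask: np.ndarray) -> float:
    """Fraction of total relevance that falls on the object itself."""
    total = saliency.sum()
    return float((saliency * mask).sum() / total) if total > 0 else 0.0

saliency = np.random.rand(224, 224)                # stand-in relevance map
mask = np.zeros((224, 224))
mask[80:160, 80:160] = 1.0                         # stand-in object region

if relevance_inside_object(saliency, mask) < 0.5:  # arbitrary cutoff
    print("Classifier may be keying on context rather than the concept")
```

A low score would suggest the classifier is exploiting correlated background patterns rather than the concept itself, which is exactly the mismatch between human and machine reasoning the excerpt argues explanation algorithms should surface.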
“…These performance gains and the resistance to the dimensionality curse are enabled by the hierarchical processing inherent in these multilayer deep networks, which is a biomimetic property common to biological cortical networks (Poggio et al, 2017). However, training these deep networks requires large amounts of labelled data and usually results in a black-box transformation, without any transparent internal mechanisms that would generate insights into the underlying control scheme (reviewed in Lapuschkin et al, 2019). In addition, machine learning solutions often require episodic model retraining (Hermann et al, 2015), and rely on a considerable memory space to store the necessary parameters (Weston et al, 2014).…”
Section: Introduction (mentioning)
confidence: 99%