2020
DOI: 10.48550/arxiv.2010.06283
Preprint

F1 is Not Enough! Models and Evaluation Towards User-Centered Explainable Question Answering

Abstract: Explainable question answering systems predict an answer together with an explanation showing why the answer has been selected. The goal is to enable users to assess the correctness of the system and understand its reasoning process. However, we show that current models and evaluation settings have shortcomings regarding the coupling of answer and explanation which might cause serious issues in user experience. As a remedy, we propose a hierarchical model and a new regularization term to strengthen the answer-…

Cited by 1 publication (2 citation statements)
References 22 publications
“…On such a topic, while a variety of evaluation methods and approaches have been proposed [63], it is still argued that the best way to assess the interpretability of black-box models is through user experiments and user-centred evaluations, as there is no guarantee of the correctness of automated metrics in evaluating explainability [64], and high explainability metric scores do not necessarily reflect high human interpretability in real-world scenarios [64,65]. The same is true for well-known metrics (e.g., F1-score) [66]. Supporting such claims, Fel et al. [65] conducted experiments to evaluate the capability of human participants to leverage representative attribution methods to learn to predict the decisions of various image classifiers.…”
Section: Evaluation Of Explainability Methods By Means Of Human Knowl… (mentioning)
Confidence: 99%
“…The same approach is applicable to the evaluation of the interpretability of black-box models, i.e., directly understanding the intrinsic explainability of a model [67]. Such evaluations are usually achieved through user questionnaires [66, 68-70], whose questions vary depending on the nature of the experiment, the model, etc. On the other hand, comparing the interpretability of different explainability methods to choose the best-suited one requires the design and implementation of ad hoc human-in-the-loop approaches.…”
Section: Evaluation Of Explainability Methods By Means Of Human Knowl… (mentioning)
Confidence: 99%