2021
DOI: 10.48550/arxiv.2105.04505
Preprint

Towards Benchmarking the Utility of Explanations for Model Debugging

Abstract: Post-hoc explanation methods are an important class of approaches that help understand the rationale underlying a trained model's decisions. But how useful are they to an end-user accomplishing a given task? In this vision paper, we argue the need for a benchmark to facilitate evaluations of the utility of post-hoc explanation methods. As a first step to this end, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. Additionally, we highlight…

Cited by 3 publications (3 citation statements) · References 3 publications (3 reference statements)
“…The second piece of related work concerns debugging ML models used in text classification [21], search [27,29,31], and many language tasks in general [6,9] by using explanations or interpretable machine learning approaches, called explanation-based human debugging (EBHD). Lertvittayakumjorn and Toni [12] recently review EBHD approaches that exploit explanations to enable humans to give feedback and debug NLP models.…”
Section: Related Work
confidence: 99%
“…The prediction model can still perform well even if the attention weights don't correlate with the (sub-)token weights as desired by humans. Finally, there has been recent work on devising decoy datasets to measure the utility of explanation methods for NLP models [25]. Our approach to rationale-based explanations differs in the type of architectures, objectives, and the general nature of its utility.…”
Section: Select-and-Predict Models
confidence: 99%
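The decoy-dataset idea referenced above can be illustrated with a minimal sketch: inject a spurious token into one class of a text dataset so that a model can learn the shortcut; a useful explanation method should then surface the decoy token. The names (`inject_decoy`, `DECOY`) and the crude count-based "explanation" below are illustrative assumptions, not taken from the cited paper.

```python
# Sketch of a decoy dataset for probing explanation utility.
# A decoy token is appended to every positive example; any faithful
# explanation method should rank it as the top positive feature.
from collections import Counter

DECOY = "zqx"  # hypothetical decoy token, not a real word

def inject_decoy(texts, labels, target_label):
    """Append the decoy token to every example of target_label."""
    return [t + " " + DECOY if y == target_label else t
            for t, y in zip(texts, labels)]

texts = ["great movie", "awful plot", "amazing cast", "boring scenes"]
labels = [1, 0, 1, 0]
decoy_texts = inject_decoy(texts, labels, target_label=1)

# A crude stand-in for an explanation method: per-token class-association
# score (count in positive examples minus count in negative examples).
pos = Counter(w for t, y in zip(decoy_texts, labels) if y == 1 for w in t.split())
neg = Counter(w for t, y in zip(decoy_texts, labels) if y == 0 for w in t.split())
scores = {w: pos[w] - neg[w] for w in set(pos) | set(neg)}
top = max(scores, key=scores.get)
print(top)  # the injected decoy token scores highest
```

A real benchmark would train a classifier on the decoyed data and check whether a post-hoc explanation method (rather than this raw count score) attributes the prediction to the decoy token.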
“…If the model is incorrect in its assessment, it can be provided with a corrective input such as "The keywords 'great' and 'amazing' are important cues in predicting the sentiment of this sentence" where the keywords themselves are automatically identified by a post hoc explanation method. While post hoc explanations have generally been considered valuable tools for deepening our understanding of model behavior [11] and for identifying root causes of errors made by ML models [12,13], our work is the first to explore their utility in improving the performance of LLMs.…”
Section: Introduction
confidence: 99%