“…It is possible to empirically evaluate the Scorecard levels using methods and metrics that have been tailored to XAI evaluation (see …). The work of Wang et al. (2022) (see above) included an evaluation by experts, indicating that the generation of explorable explanations was successful and that the visual elements, together with the spatiotemporal feature selection and querying, had explanatory value.…”
Section: Discussion
“…None of the systems was scored at Level 5 (Diagnosis of failures). The system described by Wang et al. (2022) achieved Level 5 but was scored at Level 6 by default, that is, because it also achieved that higher level. We suspect that there are more XAI systems now that would be scorable at Level 5.…”
Introduction

Many Explainable AI (XAI) systems provide explanations that are just clues or hints about the computational models, such as feature lists, decision trees, or saliency images. However, a user might want answers to deeper questions, such as: How does it work? Why did it do that instead of something else? What things can it get wrong? How might XAI system developers evaluate existing XAI systems with regard to the depth of support they provide for the user's sensemaking? How might XAI system developers shape new XAI systems so as to support the user's sensemaking? What might be a useful conceptual terminology to assist developers in approaching this challenge?

Method

Based on cognitive theory, a scale was developed reflecting the depth of explanation, that is, the degree to which explanations support the user's sensemaking. The seven levels of this scale form the Explanation Scorecard.

Results and discussion

The Scorecard was utilized in an analysis of recent literature, showing that many systems still present low-level explanations. The Scorecard can be used by developers to conceptualize how they might extend their machine-generated explanations to support the user in developing a mental model that instills appropriate trust and reliance. The article concludes with recommendations for how XAI systems can be improved with regard to cognitive considerations, and recommendations regarding the manner in which results of XAI system evaluations are reported.
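To make the abstract's contrast concrete, here is a minimal, hypothetical sketch (not taken from the article) of the kind of low-level explanation the Scorecard places at its bottom levels: a ranked feature list and a printed decision tree, produced with scikit-learn on a toy dataset. Output like this offers clues about the model, but it does not answer deeper questions such as "Why did it do that instead of something else?" or "What things can it get wrong?"

```python
# Hypothetical illustration of "low-level" XAI output: a feature list
# and a decision-tree rendering. Toy data; not drawn from the article.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(data.data, data.target)

# Explanation type 1: a ranked feature list (importance weights).
ranked = sorted(zip(data.feature_names, clf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:.3f}")

# Explanation type 2: the decision tree itself, printed as if-then rules.
print(export_text(clf, feature_names=data.feature_names))
```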
“…As for XAI systems for NLP, Li et al. [45] provided a unified interpretive method for interpreting NLP models for text classification. Attempts have also been made in broader application scenarios of AI, such as healthcare [9] and autonomous driving [28], [83].…”
Section: Visual Explanation For Machine Learning
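The excerpt characterizes Li et al.'s unified method only at a high level, so the sketch below should not be read as their algorithm. It shows a generic, model-agnostic occlusion baseline for interpreting a text classifier: delete each token in turn and record how the predicted class probability changes. The function name occlusion_saliency and the toy training data are assumptions for illustration only.

```python
# A generic occlusion-saliency sketch for text classification
# (illustrative only; NOT the unified method of Li et al. [45]).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = ["great movie, loved it", "terrible plot, boring",
               "wonderful acting", "awful and dull"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

vec = CountVectorizer().fit(train_texts)
clf = LogisticRegression().fit(vec.transform(train_texts), train_labels)

def occlusion_saliency(text):
    """Score each token by the drop in P(positive) when it is removed."""
    tokens = text.split()
    base = clf.predict_proba(vec.transform([text]))[0, 1]
    scores = []
    for i, token in enumerate(tokens):
        occluded = " ".join(tokens[:i] + tokens[i + 1:])
        prob = clf.predict_proba(vec.transform([occluded]))[0, 1]
        scores.append((token, base - prob))
    return scores

print(occlusion_saliency("loved the acting but the plot was boring"))
```

Occlusion is model-agnostic, which is one way an interpretive method can apply uniformly across different NLP classifiers; whether this resembles Li et al.'s approach is not established by the excerpt.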