“…It is possible to empirically evaluate the Scorecard levels using methods and metrics that have been tailored to XAI evaluation (see …). The work of Wang et al. (2022) (see above) included an evaluation by experts, indicating that the generation of explorable explanations was successful and that the visual elements, together with the spatiotemporal feature selection and querying, had explanatory value.…”
Section: Discussion
“…None of the systems was scored at Level 5 (Diagnosis of failures). The system described by Wang et al. (2022) achieved Level 5 but was scored at Level 6 by default, that is, because it also achieved that higher level. We suspect that there are more XAI systems now that would be scorable at Level 5.…”
Introduction

Many Explainable AI (XAI) systems provide explanations that are just clues or hints about the computational models, such as feature lists, decision trees, or saliency images. However, a user might want answers to deeper questions, such as: How does it work? Why did it do that instead of something else? What things can it get wrong? How might XAI system developers evaluate existing XAI systems with regard to the depth of support they provide for the user's sensemaking? How might XAI system developers shape new XAI systems so as to support the user's sensemaking? What might be a useful conceptual terminology to assist developers in approaching this challenge?

Method

Based on cognitive theory, a scale was developed reflecting the depth of explanation, that is, the degree to which explanations support the user's sensemaking. The seven levels of this scale form the Explanation Scorecard.

Results and discussion

The Scorecard was utilized in an analysis of recent literature, showing that many systems still present low-level explanations. The Scorecard can be used by developers to conceptualize how they might extend their machine-generated explanations to support the user in developing a mental model that instills appropriate trust and reliance. The article concludes with recommendations for how XAI systems can be improved with regard to cognitive considerations, and recommendations regarding the manner in which results of XAI system evaluations are reported.
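To make the abstract's contrast concrete, here is a minimal, hypothetical sketch (not taken from the article) of the kind of low-level explanation the Scorecard places at its bottom levels: a ranked feature list and a printed decision tree, produced with scikit-learn on a toy dataset. Output like this offers clues about the model, but it does not answer deeper questions such as "Why did it do that instead of something else?" or "What things can it get wrong?"

```python
# Hypothetical illustration of "low-level" XAI output: a feature list
# and a decision-tree rendering. Toy data; not drawn from the article.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(data.data, data.target)

# Explanation type 1: a ranked feature list (importance weights).
ranked = sorted(zip(data.feature_names, clf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:.3f}")

# Explanation type 2: the decision tree itself, printed as if-then rules.
print(export_text(clf, feature_names=data.feature_names))
```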
“…As for XAI systems for NLP, Li et al. [45] provided a unified interpretive method for interpreting NLP models for text classification. Attempts have also been made in broader application scenarios of AI, such as healthcare [9] and autonomous driving [28], [83].…”
Section: Visual Explanation For Machine Learning
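The excerpt characterizes Li et al.'s unified method only at a high level, so the sketch below should not be read as their algorithm. It shows a generic, model-agnostic occlusion baseline for interpreting a text classifier: delete each token in turn and record how the predicted class probability changes. The function name occlusion_saliency and the toy training data are assumptions for illustration only.

```python
# A generic occlusion-saliency sketch for text classification
# (illustrative only; NOT the unified method of Li et al. [45]).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = ["great movie, loved it", "terrible plot, boring",
               "wonderful acting", "awful and dull"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

vec = CountVectorizer().fit(train_texts)
clf = LogisticRegression().fit(vec.transform(train_texts), train_labels)

def occlusion_saliency(text):
    """Score each token by the drop in P(positive) when it is removed."""
    tokens = text.split()
    base = clf.predict_proba(vec.transform([text]))[0, 1]
    scores = []
    for i, token in enumerate(tokens):
        occluded = " ".join(tokens[:i] + tokens[i + 1:])
        prob = clf.predict_proba(vec.transform([occluded]))[0, 1]
        scores.append((token, base - prob))
    return scores

print(occlusion_saliency("loved the acting but the plot was boring"))
```

Occlusion is model-agnostic, which is one way an interpretive method can apply uniformly across different NLP classifiers; whether this resembles Li et al.'s approach is not established by the excerpt.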