2023
DOI: 10.1145/3542921

What Did My AI Learn? How Data Scientists Make Sense of Model Behavior

Abstract: Data scientists require rich mental models of how AI systems behave to effectively train, debug, and work with them. Despite the prevalence of AI analysis tools, there is no general theory describing how people make sense of what their models have learned. We frame this process as a form of sensemaking and derive a framework describing how data scientists develop mental models of AI behavior. To evaluate the framework, we show how existing AI analysis tools fit into this sensemaking process and use it to desig…

Cited by 20 publications (13 citation statements)
References 73 publications
“…Detecting key phrases in demonstrations. While key phrase extraction in general may require domain knowledge [8,42,65], for text transformation we can leverage the signal present in the relationships between input and output, i.e., in which parts of the input are perturbed or retained. For example, "today" is retained in the output of both "Took a photo today. "…”
Section: Identifying Patterns With Key Phrase Clustering
confidence: 99%
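The signal described in the citation above — which parts of the input are retained versus perturbed in the output — can be sketched in a few lines. This is an illustrative toy, not the cited system's implementation; the function name and tokenization are assumptions.

```python
def retained_tokens(pairs):
    """Return tokens that survive every input->output transformation.

    Tokens retained across all demonstration pairs are candidate key
    phrases, since the transformation leaves them untouched.
    """
    common = None
    for inp, out in pairs:
        # naive whitespace tokenization; real systems would normalize more
        shared = set(inp.lower().split()) & set(out.lower().split())
        common = shared if common is None else common & shared
    return common or set()

demos = [
    ("Took a photo today.", "Photo taken today."),
    ("Went hiking today.", "Hiked today."),
]
print(retained_tokens(demos))  # only "today." is retained in every pair
```

The intersection across pairs is what makes the signal domain-independent: no external knowledge is needed, only the input/output relationship itself.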
“…To detect and mitigate these important issues, the ML community uses more fine-grained evaluation approaches, often termed behavioral evaluation [10,47]. Inspired by requirements engineering in software engineering, behavioral evaluation focuses on defining and testing the capabilities of an ML system, its expected behavior on a specification of requirements [45,60].…”
Section: Behavioral Evaluation Of Machine Learning
confidence: 99%
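Behavioral evaluation, as the citation above describes, tests an ML system against a specification of expected capabilities rather than a single aggregate metric. A minimal sketch, with a hypothetical stand-in `model` in place of a real classifier:

```python
def model(text):
    # hypothetical stand-in for a trained sentiment classifier
    return "negative" if "terrible" in text else "positive"

# capability spec: (input, expected behavior) pairs for one capability,
# e.g. handling strongly negative vocabulary
cases = [
    ("The food was terrible.", "negative"),
    ("The food was great.", "positive"),
]

failures = [(t, exp, model(t)) for t, exp in cases if model(t) != exp]
print(f"{len(failures)} failures out of {len(cases)} cases")
```

Each capability gets its own case list, so a failure localizes the problem to a specific expected behavior instead of a drop in overall accuracy.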
“…There are numerous ML evaluation systems for discovering, validating, and tracking model behaviors [10,47]. The tools use techniques such as visualizations and data transformations to discover behaviors like fairness concerns and edge cases.…”
Section: Model Evaluation Approaches
confidence: 99%
“…Unlike existing work, our study proposes an interactive workflow of exploring concepts for the purpose of inspecting systematic errors and spurious concept associations behind them. Similar to [11], our human-in-the-loop workflow aims to promote the sensemaking of practitioners specifically in the problem of systematic errors where they can iteratively work on subsetting, contrasting patterns in instances, and hypothesizing spurious associations.…”
Section: Understanding Model With Concept Interpretability
confidence: 99%