Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications 2016
DOI: 10.1145/2857491.2857542

Fusing eye movements and observer narratives for expert-driven image-region annotations

Abstract: Human image understanding is reflected by individuals' visual and linguistic behaviors, but the meaningful computational integration and interpretation of their multimodal representations remain a challenge. In this paper, we expand a framework for capturing image-region annotations in dermatology, a domain in which interpreting an image is influenced by experts' visual perception skills, conceptual domain knowledge, and task-oriented goals. Our work explores the hypothesis that eye movements can help us under…

Cited by 4 publications (14 citation statements). References 35 publications.
“…In this section, we describe our multimodal Spoken Narratives And Gaze (SNAG) dataset (Vaidyanathan et al., 2018) that is used to evaluate the proposed framework. This dataset contains eye movements and spoken narratives co-captured from participants while viewing general domain images (Figure 3) and has been released to the research community.…”
Section: Multimodal Data Collection (mentioning)
Confidence: 99%
“…Reference alignments (ground truth) were prepared using a GUI called RegionLabeler (Vaidyanathan et al., 2018) to allow evaluation of the resulting multimodal alignments. This represented the manual alignments obtained by associating each fixation cluster in the case of MFSC and image segment in the case of image segmentation with its corresponding word tokens (linguistic units).…”
Section: Alignment (mentioning)
Confidence: 99%
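The alignment step quoted above pairs each fixation cluster (or image segment) with the word tokens that describe it. As a rough illustration only, the following Python sketch shows one plausible way such reference alignments could be represented; all class names, fields, and data are hypothetical and are not taken from the cited work or its tools.

```python
# Hypothetical sketch: representing reference alignments that associate
# fixation clusters with word tokens (linguistic units).
# Names and data are illustrative, not from the cited dataset or RegionLabeler.

from dataclasses import dataclass, field


@dataclass
class FixationCluster:
    cluster_id: int
    # (x, y) fixation points in image pixel coordinates
    points: list[tuple[float, float]] = field(default_factory=list)


@dataclass
class Alignment:
    cluster: FixationCluster
    word_tokens: list[str] = field(default_factory=list)


def build_reference_alignments(clusters, token_links):
    """Associate each fixation cluster with its word tokens.

    token_links maps cluster_id -> list of word tokens, e.g. as exported
    from a manual annotation step.
    """
    return [
        Alignment(cluster=c, word_tokens=token_links.get(c.cluster_id, []))
        for c in clusters
    ]


if __name__ == "__main__":
    clusters = [
        FixationCluster(0, [(120.0, 85.0), (132.0, 90.0)]),
        FixationCluster(1, [(300.0, 210.0)]),
    ]
    links = {0: ["red", "lesion"], 1: ["left", "arm"]}
    for a in build_reference_alignments(clusters, links):
        print(a.cluster.cluster_id, a.word_tokens)
```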
“…We examine the usefulness of our general-domain dataset on image-region annotation, adapting the framework given by Vaidyanathan et al. (2016).…”
Section: Application To Multimodal Alignment (mentioning)
Confidence: 99%
“…Ho et al (2015) provide a dataset that consists only of gaze and speech time stamps during dyadic interactions. The closest dataset to ours is the multimodal but non-public data described by Vaidyanathan et al (2016).…”
Section: Related Workmentioning
confidence: 99%