2017
DOI: 10.3758/s13428-017-0955-x
Is human classification by experienced untrained observers a gold standard in fixation detection?

Abstract: Manual classification is still a common method to evaluate event detection algorithms. The procedure is often as follows: Two or three human coders and the algorithm classify a significant quantity of data. In the gold-standard approach, deviations from the human classifications are considered to be due to mistakes of the algorithm. However, little is known about human classification in eye tracking. To what extent do the classifications from a larger group of human coders agree? Twelve experienced but untrained…

Cited by 60 publications (95 citation statements) · References 54 publications (86 reference statements)
“…O_r is the overlap ratio between matched events; ℓ2 is the distance between matched event start and end times, and ℓ2-σ their standard deviation in ms. ℓ2 and ℓ2-σ are similar to the RTO and RTD metrics proposed by Hooge et al. [44]…”
Section: Results (supporting, confidence: 72%)
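As a rough illustration of the metrics in the quote above, here is a minimal Python sketch. It assumes events are (onset, offset) tuples in milliseconds, that O_r is temporal intersection over union, and that ℓ2 is the Euclidean distance over matched start and end times; the function names are hypothetical and the exact definitions are assumptions, not taken from the cited papers.

```python
import statistics

def overlap_ratio(gt, det):
    """O_r: temporal intersection over union of two (onset, offset) intervals."""
    inter = max(0.0, min(gt[1], det[1]) - max(gt[0], det[0]))
    union = max(gt[1], det[1]) - min(gt[0], det[0])
    return inter / union if union > 0 else 0.0

def l2_timing(gt, det):
    """l2 distance between matched event start and end times, in ms."""
    return ((gt[0] - det[0]) ** 2 + (gt[1] - det[1]) ** 2) ** 0.5

def l2_stats(matched_pairs):
    """Mean l2 distance and its standard deviation (l2-sigma) over matched pairs."""
    dists = [l2_timing(gt, det) for gt, det in matched_pairs]
    sigma = statistics.stdev(dists) if len(dists) > 1 else 0.0
    return statistics.mean(dists), sigma

# Hypothetical matched events: (onset_ms, offset_ms) pairs.
pairs = [((100, 300), (110, 290)), ((500, 800), (505, 820))]
print([overlap_ratio(g, d) for g, d in pairs])
print(l2_stats(pairs))
```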
“…Evaluating the performance of automated classification systems or human labellers is not straightforward. [46] [Table 4 compares event-level error metrics on a global basis, thus oblivious to the inherent structure of the data: sample-level agreement [46] (majority-vote matching), event F1 [44] (earliest overlapping event; rated low), event kappa [21] (largest overlapping event; rated low), event error rate [21] (no event matching), and ELC (window-based matching; rated high).]…”
Section: Error Metrics (mentioning, confidence: 99%)
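The two matching rules named in that table can be sketched as follows; this is a minimal illustration assuming (onset, offset) event tuples with detected events sorted by onset. Function names are hypothetical, and the sketch is not the implementation from any of the cited papers.

```python
def intersection(a, b):
    """Length of temporal overlap between two (onset, offset) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def earliest_overlapping(gt_event, detected):
    """Matching rule behind event F1 [44]: the first detected event that
    intersects the ground-truth event (detected assumed sorted by onset)."""
    for det in detected:
        if intersection(gt_event, det) > 0:
            return det
    return None

def largest_overlapping(gt_event, detected):
    """Matching rule behind event kappa [21]: the detected event with the
    largest temporal overlap with the ground-truth event."""
    best = max(detected, key=lambda det: intersection(gt_event, det), default=None)
    if best is not None and intersection(gt_event, best) > 0:
        return best
    return None
```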
“…In Hooge, Niehorster, Nyström, Andersson, and Hessels (2017), event-level fixation detection was assessed by an arguably fairer approach with a set of metrics that includes F1 scores for fixation episodes. We computed these for all three main event types in our data (fixations, saccades, and smooth pursuits): For each event in the ground truth, we look for the earliest algorithmically detected event of the same class that intersects with it.…”
Section: Metrics (mentioning, confidence: 99%)
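The earliest-overlap F1 procedure described in that quote can be written out as below. This is a sketch under stated assumptions: events are (onset, offset, label) tuples, matching is one-to-one within each event class, and unmatched events count as misses (ground truth) or false alarms (detections). It follows the description in the quote, not the authors' actual code.

```python
def event_f1(gt_events, det_events, label):
    """Event-level F1 for one event class (e.g., 'fixation'): each ground-truth
    event is matched to the earliest not-yet-used detected event of the same
    class that intersects it; unmatched events count as FN or FP."""
    gt = [e for e in gt_events if e[2] == label]
    det = sorted((e for e in det_events if e[2] == label), key=lambda e: e[0])
    used, tp = set(), 0
    for g_on, g_off, _ in gt:
        for i, (d_on, d_off, _) in enumerate(det):
            if i not in used and g_on < d_off and d_on < g_off:
                used.add(i)
                tp += 1
                break
    fn, fp = len(gt) - tp, len(det) - tp
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

# Hypothetical usage for the three event types mentioned in the quote:
# scores = {lbl: event_f1(gt, det, lbl) for lbl in ("fixation", "saccade", "pursuit")}
```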
“…Automatically detecting different eye movements has been attempted for multiple decades by now, but evaluating the approaches for this task is challenging, not least because of the diversity of the data and the amount of manual labeling required for a meaningful evaluation. To compound this problem, even manual annotations suffer from individual biases and implicitly used thresholds and rules, especially if experts from different sub-areas are involved (Hooge, Niehorster, Nyström, Andersson, & Hessels, 2017). For smooth pursuit (SP), even detecting episodes by hand is not entirely trivial (i.e., requires additional information) when the information about their targets is missing.…”
Section: Introduction (mentioning, confidence: 99%)
“…Previous research has shown that eye-tracking researchers may set different thresholds as to what constitutes a fixation-breaking eye movement [42]. …”
Section: Methods (mentioning, confidence: 99%)