Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics 2021
DOI: 10.18653/v1/2021.cmcl-1.10

LAST at CMCL 2021 Shared Task: Predicting Gaze Data During Reading with a Gradient Boosting Decision Tree Approach

Abstract: A LightGBM model fed with target word lexical characteristics and features obtained from word frequency lists, psychometric data and bigram association measures has been optimized for the 2021 CMCL Shared Task on Eye-Tracking Data Prediction. It obtained the best performance of all teams on two of the five eye-tracking measures to predict, allowing it to rank first on the official challenge criterion and to outperform all deep-learning based systems participating in the challenge.
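To illustrate the family of models the paper uses (LightGBM is a gradient boosting decision tree implementation), here is a minimal, self-contained sketch of gradient boosting with depth-1 regression stumps. The feature and data are hypothetical toy values (word length vs. a gaze measure), not the paper's actual lexical, frequency, psychometric, or bigram features:

```python
# Minimal gradient boosting with decision stumps (squared-error loss).
# A stand-in sketch for the GBDT approach; the real system uses LightGBM
# with many tuned parameters and richer word-level features.

LR = 0.1  # learning rate (shrinkage applied to each stump's contribution)

def fit_stump(xs, residuals):
    """Find the single-feature split minimizing squared error on residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    return best[1], best[2], best[3]

def gradient_boost(xs, ys, n_rounds=200):
    """Fit an ensemble of stumps, each trained on the current residuals."""
    base = sum(ys) / len(ys)          # initial prediction: the mean
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + LR * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, stumps

def predict(model, x):
    base, stumps = model
    return base + sum(LR * (lv if x <= t else rv) for t, lv, rv in stumps)

# Toy data: longer words get longer (hypothetical) gaze durations in ms.
word_lengths = [2, 3, 4, 5, 6, 7, 8, 9]
durations = [180, 190, 200, 210, 220, 230, 240, 250]
model = gradient_boost(word_lengths, durations)
```

Each round fits a stump to the residuals of the current ensemble, so successive trees correct earlier errors; the learning rate trades convergence speed against overfitting, one of the many hyperparameters the abstract notes must be optimized.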

Cited by 5 publications (4 citation statements). References 21 publications (16 reference statements).
“…• A much slower and more complex approach to optimize, because it requires tuning many parameters, but one that recently outperformed all deep-learning based systems participating in the CMCL 2021 shared task on predicting gaze data during reading (Bestgen, 2021a): a gradient boosting decision tree approach as implemented in the free LightGBM software (Ke et al., 2017). This approach was used only in a second step.…”
Section: Proposed System
confidence: 99%
“…More specifically, these authors train models such as logistic regression or conditional random fields on a corpus of human eye-tracking data, and then predict fixation time, skipping rate, and other eye-movement measures on an unseen test set. Unlike our goals here, the literature in this tradition (e.g., Bestgen, 2021; Hara, Kano, & Aizawa, 2012; Hollenstein et al., 2021; Matthies & Søgaard, 2013; Nilsson & Nivre, 2009) does not primarily aim to construct explanatory cognitive models, and the use of supervised training (i.e., models learn their behavior from a pre-existing training set of eye-movement data) is not psychologically realistic, as outlined in the introduction. However, machine learning-based models typically achieve good prediction accuracy, which makes them suitable for comparison with cognitive models.…”
Section: Models Of Eye-Movement Control
confidence: 99%
“…For the present benchmark, this is possible by leveraging a naturalistic dataset of reading English sentences, the Zurich Cognitive Language Processing Corpus (Hollenstein et al., 2018, 2020). The ZuCo dataset is publicly available and has recently been used in a variety of applications including leveraging EEG and eye-tracking data to improve NLP tasks (Barrett et al., 2018; Mathias et al., 2020; McGuire and Tomuro, 2021), evaluating the cognitive plausibility of computational language models (Hollenstein et al., 2019b; Hollenstein and Beinborn, 2021), investigating the neural dynamics of reading (Pfeiffer et al., 2020), and developing models of human reading (Bautista and Naval, 2020; Bestgen, 2021).…”
Section: Introduction
confidence: 99%