Findings of the Association for Computational Linguistics: NAACL 2022
DOI: 10.18653/v1/2022.findings-naacl.27

One Size Does Not Fit All: The Case for Personalised Word Complexity Models

Abstract: Complex Word Identification (CWI) aims to detect words within a text that a reader may find difficult to understand. It has been shown that CWI systems can improve text simplification, readability prediction and vocabulary acquisition modelling. However, the difficulty of a word is a highly idiosyncratic notion that depends on a reader's first language, proficiency and reading experience. In this paper, we show that personal models are best when predicting word complexity for individual readers. We use a novel…

Cited by 3 publications (6 citation statements). References 25 publications.
“…In the meantime, research on lexical complexity classification at the word level by Gooding and Tragut [10] and Balyan et al. [11] has transformed the binary classification into a more thorough multi-class classification. The study by Gooding and Tragut [10] has classified the words into seven classes based on the English Common European Framework of Reference for Languages (CEFR), while the study by Balyan et al. [11] has classified the words into four classes: easy, medium, difficult, and very difficult. In terms of methodologies, Gooding and Tragut [10] implemented the unsupervised active learning agglomerative clustering method, which performed the clustering from the bottom up, whereas Balyan et al. [11] strongly utilized the Machine Learning approach coupled with NLP.…”
Section: Text Complexity Classification Approach (mentioning)
confidence: 99%
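The statement above describes a bottom-up (agglomerative) grouping of words into multi-class complexity bands, such as CEFR-style levels. As a rough illustration only, and not a reproduction of either cited work's pipeline, the following sketch clusters a handful of words into complexity groups with scikit-learn's AgglomerativeClustering; the word list, corpus frequencies, and the two toy features (word length and log frequency) are invented for the example.

# Illustrative sketch: bottom-up clustering of words into complexity bands.
# The features and frequencies below are hypothetical, not from the cited works.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical corpus frequencies (higher = more common, typically "easier").
word_freq = {
    "cat": 95000, "house": 80000, "quickly": 40000, "frequent": 9000,
    "notion": 7000, "idiosyncratic": 300, "proficiency": 1200, "acquisition": 2500,
}

words = list(word_freq)
# Two toy features per word: length and log corpus frequency.
X = np.array([[len(w), np.log(word_freq[w])] for w in words])

# Bottom-up (agglomerative) clustering into a small number of complexity bands;
# seven clusters would correspond to CEFR-style levels, three keeps the toy output readable.
labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

for w, lab in sorted(zip(words, labels), key=lambda p: p[1]):
    print(f"cluster {lab}: {w}")

In a real setting the feature set would be richer (frequency from a reference corpus, length, syllable counts, embeddings) and the number of clusters would match the target label scheme, but the bottom-up grouping step itself looks like the above.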
“…Pilán et al. (2017) identify a number of criteria for good seed sentences, including well-formedness, context independence, linguistic complexity and additional structural and lexical criteria. While we address most of the structural criteria, such as negated or interrogative contexts, with the parameters exposed to users, we deliberately do not restrict seed sentence selection based on lexical criteria, which are often user-dependent and better targeted by a macro-adaptive algorithm in the target ILTS (Gooding and Tragut, 2022). Compliance with context independence will be more likely when the co-text option is activated and can be addressed manually in the subsequent workflow step.…”
Section: Seed Sentence Selection (mentioning)
confidence: 99%