Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1109

Complex Word Identification as a Sequence Labelling Task

Abstract: Complex Word Identification (CWI) is concerned with the detection of words in need of simplification and is a crucial first step in a simplification pipeline. It has been shown that reliable CWI systems considerably improve text simplification. However, most CWI systems to date address the task on a word-by-word basis, without taking the context into account. In this paper, we present a novel approach to CWI based on sequence modelling. Our system is capable of performing CWI in context, does not require extensive feat…
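
The contrast the abstract draws between word-by-word classification and sequence labelling can be illustrated with a minimal sketch. It shows only the data framing (one binary label per token position in a sentence), not the paper's model. The function name `to_sequence_labels`, the example sentence, and the annotated positions are all hypothetical and do not come from the paper.

```python
from typing import List, Set


def to_sequence_labels(tokens: List[str], complex_positions: Set[int]) -> List[int]:
    """Return one label per token: 1 if that position was annotated as complex, else 0.

    Labels are tied to token positions within the sentence, not to word types,
    so the same word may receive different labels in different contexts.
    """
    return [1 if i in complex_positions else 0 for i in range(len(tokens))]


# Hypothetical example: positions 4 and 5 were marked complex by an annotator.
tokens = ["The", "committee", "reached", "a", "unanimous", "verdict"]
labels = to_sequence_labels(tokens, {4, 5})

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

A sequence model would consume the whole `tokens` list at once and predict the whole `labels` list, which is what allows it to use the surrounding context.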

Cited by 37 publications (37 citation statements)
References 16 publications
“…We refer to this model as HLR+. The features include word complexity scores estimated by a pre-trained model [6], mean concreteness scores and percent known based on human judgements [2], SUBTLEX word frequencies [18] and user ids.…”
Section: HLR With Linguistic/Psychological Features (HLR+)
Citation type: mentioning
confidence: 99%
“…The strategy of annotated carried out, software with free tools were built, which was called EIL (Language Research Environment), where the texts of the VYTEDU corpus were loaded. The annotated process takes into consideration the research work proposed by [13,9,16,17] on the lexical simplification project for Spanish and the lexical simplification for Czech, respectively.…”
Section: Fig 1 VYTEDU Corpus Texts
Citation type: mentioning
confidence: 99%
“…These research papers have attracted attention in recent years, with the advent of deep learning approaches [16] and multilingual challenges [17], which contributes to the evaluation of words labeled as severe, such as specialized words, common lexicon words, slang, English words, acronyms, among others. Given these terms, students had a hard time understanding; in some cases, they ignored its meaning or had some idea or notion of it.…”
Section: B Analysis Of Corpus VYTEDU-CW By Carrera
Citation type: mentioning
confidence: 99%
“…The proposed SEQ model (Gooding and Kochmar, 2019) has a number of additional advantages: it takes context into account, helps avoid the necessity of extensive feature engineering relying on word embeddings as the only input information at run time, and generalises well across all three datasets. To further assess generalisability of the model, we test it on CEFR-LS, as well as BENCHLS for consistency (see Table 2).…”
Section: Test Set
Citation type: mentioning
confidence: 99%