Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) 2022
DOI: 10.18653/v1/2022.tsar-1.18

A Benchmark for Neural Readability Assessment of Texts in Spanish

Laura Vásquez-Rodríguez,
Pedro-Manuel Cuenca-Jiménez,
Sergio Morales-Esquivel
et al.

Abstract: We release a new benchmark for Automated Readability Assessment (ARA) of texts in Spanish. We combined existing corpora with suitable texts collected from the Web, thus creating the largest available dataset for ARA of Spanish texts. All data was pre-processed and categorised to allow experimenting with ARA models that make predictions at two (simple and complex) or three (basic, intermediate, and advanced) readability levels, and at two text granularities (paragraphs and sentences). An analysis based on reada…

Cited by 5 publications (2 citation statements)
References: 25 publications
“…Although these features are useful, they represent an impoverished view of readability. Recent years have seen a marked increase in interest in research surrounding machine-learning approaches to second-language readability, with many studies focused on commonly taught languages, such as English (Xu et al, 2015; Xia et al, 2016; Vajjala & Lučić, 2018), German, Russian, French (Lee & Vajjala, 2022), Italian (Azpiazu & Pera, 2019), and Spanish (Vásquez-Rodríguez et al, 2022). In addition to studying the readability of individual languages, some researchers have worked to identify language-agnostic properties that can be used for multilingual readability classification (Azpiazu & Pera, 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…Although these features are useful, they represent an impoverished view of readability. Recent years have seen a marked increase in interest in research surrounding machine-learning approaches to second-language readability, with many studies focused on commonly taught languages, such as English (Xu et al, 2015; Xia et al, 2016; Vajjala & Lučić, 2018), German (Hancke et al, 2012), Russian (Reynolds, 2016), French (Lee & Vajjala, 2022), Italian (Azpiazu & Pera, 2019), and Spanish (Vásquez-Rodríguez et al, 2022). In addition to studying the readability of individual languages, some researchers have worked to identify language-agnostic properties that can be used for multilingual readability classification (Azpiazu & Pera, 2019).…”
Section: Introduction (mentioning)
confidence: 99%