2019
DOI: 10.48550/arxiv.1907.11779
Preprint

Supervised and Unsupervised Neural Approaches to Text Readability

Matej Martinc,
Senja Pollak,
Marko Robnik-Šikonja

Abstract: We present a set of novel neural supervised and unsupervised approaches for determining readability of documents. In the unsupervised setting, we leverage neural language models, while in the supervised setting three different neural architectures are tested in the classification setting. We show that the proposed neural unsupervised approach on average produces better results than traditional readability formulas and is transferable across languages. Employing neural classifiers, we outperform current state-o…

Citations: cited by 1 publication (1 citation statement)
References: 21 publications
“…We compare our result to a state-of-the-art readability system by Vajjala and Lučić (2018) referred to as VAJJALA-2018. This system uses a multilayer perceptron classifier and has been shown to outperform BERT-based approaches on the OneStopEnglish dataset (Martinc et al., 2019). The system relies on 155 hand-crafted features which are grouped into six categories: traditional metrics, word features, psycholinguistic, lexical richness, syntactic and discourse features.…”
Section: Predicting Readability
Citation type: mentioning
confidence: 99%