Using the crowd for readability prediction

Clercq, Orphée De; Hoste, Véronique; Desmet, Bart; Oosten, Philip van; Cock, Martine De; Macken, Lieve

doi:10.1017/s1351324912000344

Cited by 42 publications

(39 citation statements)

References 31 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Our results are consistent with the findings of De Clercq et al [17] for generic texts in Dutch. In their study the crowd was presented with pairwise comparisons.…”

Section: Discussionsupporting

confidence: 94%

Measuring text simplification with the crowd

Lasecki

Rello

Bigham

2015

Proceedings of the 12th International Web for All Conference

View full text Add to dashboard Cite

Text can often be complex and difficult to read, especially for peo ple with cognitive impairments or low literacy skills. Text simplifi cation is a process that reduces the complexity of both wording and structure in a sentence, while retaining its meaning. However, this is currently a challenging task for machines, and thus, providing effective on-demand text simplification to those who need it re mains an unsolved problem. Even evaluating the simplicity of text remains a challenging problem for both computers, which cannot understand the meaning of text, and humans, who often struggle to agree on what constitutes a good simplification.This paper focuses on the evaluation of English text simplifica tion using the crowd. We show that leveraging crowds can result in a collective decision that is accurate and converges to a consen sus rating. Our results from 2,500 crowd annotations show that the crowd can effectively rate levels of simplicity. This may allow sim plification systems and system builders to get better feedback about how well content is being simplified, as compared to standard mea sures which classify content into 'simplified' or 'not simplified' categories. Our study provides evidence that the crowd could be used to evaluate English text simplification, as well as to create simplified text in future work.

show abstract

“…Our results are consistent with the findings of De Clercq et al [17] for generic texts in Dutch. In their study the crowd was presented with pairwise comparisons.…”

Section: Discussionsupporting

confidence: 94%

Measuring text simplification with the crowd

Lasecki

Rello

Bigham

2015

Proceedings of the 12th International Web for All Conference

View full text Add to dashboard Cite

show abstract

“…From this, an expert ranking was created, using the midpoint of each expertassigned range. The correlation between the expert sentence ranking and the crowd ranking can be seen in Table 6, reinforcing the finding that crowdsourced judgments can provide an accurate ranking of difficulty (De Clercq et al, 2014).…”

Section: Review Of Datasupporting

confidence: 70%

Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context

Schumacher

Eskénazi

Frishkoff

et al. 2016

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

View full text Add to dashboard Cite

The problem of accurately predicting relative reading difficulty across a set of sentences arises in a number of important natural language applications, such as finding and curating effective usage examples for intelligent language tutoring systems. Yet while significant research has explored documentand passage-level reading difficulty, the special challenges involved in assessing aspects of readability for single sentences have received much less attention, particularly when considering the role of surrounding passages. We introduce and evaluate a novel approach for estimating the relative reading difficulty of a set of sentences, with and without surrounding context. Using different sets of lexical and grammatical features, we explore models for predicting pairwise relative difficulty using logistic regression, and examine rankings generated by aggregating pairwise difficulty labels using a Bayesian rating system to form a final ranking. We also compare rankings derived for sentences assessed with and without context, and find that contextual features can help predict differences in relative difficulty judgments across these two conditions.

show abstract

“…We compared our model to the entity graph and to the entity grid (Barzilay and Lapata, 2008) and showed that normalization improves the results significantly in most tasks. Future work will include adding more linguistic information, stronger weighting schemes and application to other readability datasets (Pitler and Nenkova, 2008;De Clercq et al, 2014).…”

Section: Resultsmentioning

confidence: 99%