Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.539
How coherent are neural models of coherence?

Abstract: Despite recent advances in coherence modelling, most such models, including state-of-the-art neural ones, are evaluated either on contrived proxy tasks, such as the standard order-discrimination benchmark, or on tasks that require special expert annotation. Moreover, most evaluations are conducted on small newswire corpora. To address these shortcomings, in this paper we propose four generic evaluation tasks that draw on different aspects of coherence at both the lexical and document levels, and can be applied …

Cited by 11 publications (14 citation statements)
References 26 publications
“…They call for more comprehensive evaluations of coherence models. Pishdad et al (2020) also reached a similar conclusion. They retrained several neural coherence models for tasks analogous to coherence modeling, such as detecting connective substitution and topic switching.…”
Section: Introduction (supporting)
confidence: 62%
“…With the advancements of neural methods in recent years, claims of fluency in summarization (Liu et al, 2017; Celikyilmaz et al, 2018), language modeling (Radford et al, 2019; Brown et al, 2020), response generation (Zhang et al, 2020; Hosseini-Asl et al, 2020) and human parity in machine translation (Hassan et al, 2018) have led to calls for finer-grained discourse-level evaluations (Läubli et al, 2018; Sharma et al, 2019; Popel et al, 2020), since traditional metrics such as BLEU and ROUGE are unable to measure text quality and readability (Paulus et al, 2018; Reiter, 2018). Coherence models that can evaluate machine-generated text have become the need of the hour.…”
Section: Introduction (mentioning)
confidence: 99%
“…Many of these are also mentioned by coherence evaluation studies; nonetheless, they mostly revert to the use of some form of sentence-order variations (Chen et al, 2019; Moon et al, 2019; Mesgar et al, 2020). While some progress has been made towards incorporating more linguistically motivated test sets (Chen et al, 2019; Mohammadi et al, 2020; Pishdad et al, 2020), most evaluation studies focus on models trained specifically on coherence classification and prediction tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…It does not pinpoint the qualities that make the shuffled text incoherent, and it does not tell us which linguistic devices are at fault, emphasising the need to move beyond this technique. This paper aims to add to the growing body of research stressing the need for more qualitative evaluations of text coherence (See et al, 2019; Mohammadi et al, 2020; Pishdad et al, 2020). We design different test suites created semiautomatically from existing corpora.…”
Section: Introduction (mentioning)
confidence: 99%