How coherent are neural models of coherence?

Pishdad, Leila; Fancellu, Federico; Zhang, Ran; Fazly, Afsaneh

doi:10.18653/v1/2020.coling-main.539

Cited by 11 publications

(14 citation statements)

References 26 publications

“…They call for more comprehensive evaluations of coherence models. Pishdad et al (2020) also reached a similar conclusion. They retrained several neural coherence models for tasks analogous to coherence modeling such as detecting connective substitution and topic switching.…”

Section: Introduction (supporting)

confidence: 62%

“…With the advancements of neural methods in recent years, claims of fluency in summarization (Liu et al, 2017;Celikyilmaz et al, 2018), language modeling (Radford et al, 2019;Brown et al, 2020), response generation (Zhang et al, 2020;Hosseini-Asl et al, 2020) and human parity in machine translation (Hassan et al, 2018) have led to calls for finer-grained discourse-level evaluations (Läubli et al, 2018;Sharma et al, 2019;Popel et al, 2020), since traditional metrics such as BLEU and ROUGE are unable to measure text quality and readability (Paulus et al, 2018;Reiter, 2018). Coherence models that can evaluate machine-generated text have become the need of the hour.…”

Section: Introduction (mentioning)

confidence: 99%

Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling

Jwalapuram, Joty, Lin

2021

Preprint

Although large-scale pre-trained neural models have shown impressive performances in a variety of tasks, their ability to generate coherent text that appropriately models discourse phenomena is harder to evaluate and less understood. Given the claims of improved text generation quality across various systems, we consider the coherence evaluation of machine generated text to be one of the principal applications of coherence models that needs to be investigated. We explore training data and self-supervision objectives that result in a model that generalizes well across tasks and can be used off-the-shelf to perform such evaluations. Prior work in neural coherence modeling has primarily focused on devising new architectures, and trained the model to distinguish coherent and incoherent text through pairwise self-supervision on the permuted documents task. We instead use a basic model architecture and show significant improvements over state of the art within the same training regime. We then design a harder self-supervision objective by increasing the ratio of negative samples within a contrastive learning setup, and enhance the model further through automatic hard negative mining coupled with a large global negative queue encoded by a momentum encoder. We show empirically that increasing the density of negative samples improves the basic model, and using a global negative queue further improves and stabilizes the model while training with hard negative samples. We evaluate the coherence model on task-independent test sets that resemble real-world use cases and show significant improvements in coherence evaluations of downstream applications.
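
The abstract above describes making the self-supervision objective harder by raising the ratio of negative samples in a contrastive setup. As a minimal sketch only, assuming a softmax-style contrastive loss over one positive score and several negative scores (the abstract does not state the exact loss, and the function name `contrastive_loss` is hypothetical), the effect of adding negatives can be illustrated like this:

```python
import math


def contrastive_loss(pos_score, neg_scores):
    """Negative log-probability of the positive among positive + negatives.

    Assumption: a softmax-style contrastive loss; this is an illustration,
    not the loss used in the cited paper.
    """
    scores = [pos_score] + list(neg_scores)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_denominator - pos_score


# Same positive score, more negatives with the same score.
few = contrastive_loss(1.0, [0.0])
many = contrastive_loss(1.0, [0.0] * 5)
print(few, many)
```

With the scores held fixed, each extra negative enlarges the denominator, so the loss for the same positive grows. That is one way to read "harder objective" in this setting.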

“…Many of these are also mentioned by coherence evaluation studies, nonetheless they mostly revert to the use of some form of sentence-order variations (Chen et al, 2019;Moon et al, 2019;Mesgar et al, 2020). While some progress has been made towards incorporating more linguistically motivated test sets (Chen et al, 2019;Mohammadi et al, 2020;Pishdad et al, 2020), most evaluation studies focus on models trained specifically on coherence classification and prediction tasks.…”

Section: Related Work (mentioning)

confidence: 99%

“…It does not pinpoint the qualities that make the shuffled text incoherent, it does not tell us which linguistic devices are at fault, emphasising the need to move beyond this technique. This paper aims to add to the growing body of research stressing the need for more qualitative evaluations of text coherence (See et al, 2019;Mohammadi et al, 2020;Pishdad et al, 2020). We design different test suites created semiautomatically from existing corpora.…”

Section: Introduction (mentioning)

confidence: 99%

“…A common approach to coherence evaluation consists in shuffling the sentence order of a text, thereby creating incoherent text samples that need to be discriminated from the original (Barzilay and Lapata, 2008). While this approach to creating incoherent test data is intuitive enough, recent studies suggest that it paints only a partial picture of what constitutes coherence (Lai and Tetreault, 2018;Mohammadi et al, 2020;Pishdad et al, 2020). It does not pinpoint the qualities that make the shuffled text incoherent, it does not tell us which linguistic devices are at fault, emphasising the need to move beyond this technique.…”

Section: Introduction (mentioning)

confidence: 99%
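
The sentence-shuffling approach mentioned in the statement above can be sketched as follows. This is a minimal illustration, assuming a document is already split into a list of sentences; the function name `permuted_negatives` and its parameters are hypothetical and not taken from any of the cited papers.

```python
import random


def permuted_negatives(sentences, num_negatives=1, seed=0, max_tries=100):
    """Return up to `num_negatives` distinct shuffles that differ from the original order."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    original = list(sentences)
    negatives = []
    for _ in range(max_tries):
        if len(negatives) == num_negatives:
            break
        candidate = original[:]
        rng.shuffle(candidate)
        # Keep only orderings that differ from the original and are not repeats.
        if candidate != original and candidate not in negatives:
            negatives.append(candidate)
    return negatives


doc = ["A storm hit the coast.", "Roads were flooded.", "Schools closed.", "Repairs began."]
for shuffled in permuted_negatives(doc, num_negatives=2):
    print(" ".join(shuffled))
```

A model is then asked to prefer the original order over each shuffled copy. As the quoted statement notes, a shuffle of this kind says nothing about which linguistic device was broken, which is the motivation for more targeted test suites.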

Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models

Beyer, Loáiciga, Schlangen

2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

Coherent discourse is distinguished from a mere collection of utterances by the satisfaction of a diverse set of constraints, for example choice of expression, logical relation between denoted events, and implicit compatibility with world-knowledge. Do neural language models encode such constraints? We design an extendable set of test suites addressing different aspects of discourse and dialogue coherence. Unlike most previous coherence evaluation studies, we address specific linguistic devices beyond sentence order perturbations, allowing for a more fine-grained analysis of what constitutes coherence and what neural models trained on a language modelling objective do encode. Extending the targeted evaluation paradigm for neural language models (Marvin and Linzen, 2018) to phenomena beyond syntax, we show that this paradigm is equally suited to evaluate linguistic qualities that contribute to the notion of coherence.

Estimation of the Local and Global Coherence of Ukrainian Texts Using Transformer-Based, LSTM, and Graph Neural Networks

Kramov, Pogorilyy

2022

Communications in Computer and Information Science
