Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), 2021
DOI: 10.18653/v1/2021.gem-1.2
Human Perception in Natural Language Generation

Abstract: We take a collection of short texts, some of which are human-written while others are automatically generated, and ask subjects, who are unaware of the texts' source, whether they perceive them as human-produced. We use this data to fine-tune a GPT-2 model to push it to generate more human-like texts, and observe that the production of this fine-tuned model is indeed perceived as more human-like than that of the original model. Contextually, we show that our automatic evaluation strategy correlates well with h…
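The paper reports fine-tuning GPT-2 on texts that judges perceived as human-produced; no code accompanies the abstract, so the following is a minimal sketch of that idea, assuming the Hugging Face transformers and datasets libraries. The `judged_texts` data, the 0.5 threshold, and all hyperparameters are illustrative placeholders, not values from the paper.

```python
# Minimal sketch (not the authors' code): continue training GPT-2 on the
# subset of texts that human judges perceived as human-produced, nudging
# the model toward more human-like output.
from datasets import Dataset
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2TokenizerFast, Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Placeholder annotations: (text, share of judges who called it "human").
judged_texts = [
    ("The council approved the budget after a long debate.", 0.8),
    ("Budget the approved council a after debate long the.", 0.1),
]
human_like = [t for t, score in judged_texts if score >= 0.5]  # illustrative cut-off

ds = Dataset.from_dict({"text": human_like}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-humanlike",
                           num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```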

Cited by 4 publications (4 citation statements)
References 12 publications (8 reference statements)

Citation statements:
“…This highlights the difficulty for humans in assessing the style strength, separating it from the structure and semantics. These findings are in line with recent studies in the field (De Mattei, Cafagna, Dell'Orletta, & Nissim, 2020).…”
Section: Comparing Automatic and Human Evaluations (supporting)
Confidence: 93%
“…De Mattei et al. (2020) put forward the idea that news styles are more difficult to judge than others (e.g., sentiment), and that humans are not as reliable judges of said styles as machines. They proposed a framework for the automatic evaluation of style-aware generation that seems handy for style transfer as well.…”
Section: Discussion (mentioning)
Confidence: 99%
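As an illustration of such a machine-judge framework (an assumed sketch, not the cited authors' implementation), one can train a classifier to separate the two newspapers' styles and score style-aware generations by how often they are attributed to the intended newspaper. The scikit-learn pipeline, the `paperA`/`paperB` labels, and the training texts below are hypothetical.

```python
# Sketch: a style classifier as an automatic judge of style-aware generation.
# All data and names are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Markets rallied as the central bank held interest rates steady.",
    "Three-goal thriller leaves the home fans roaring at the whistle!",
]
train_labels = ["paperA", "paperB"]  # which newspaper each text comes from

judge = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
judge.fit(train_texts, train_labels)

def style_strength(generated_texts, target_label):
    """Share of generated texts the judge attributes to the target newspaper."""
    preds = judge.predict(generated_texts)
    return sum(p == target_label for p in preds) / len(generated_texts)
```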
“…A useful newspaper dataset for style transfer was created by De Mattei et al. (2020), even though their work concerned style-aware generation rather than transfer. They collected news articles that are lexically similar from two newspapers, a subset of which are topic-aligned.…”
Section: Intended Styles (mentioning)
Confidence: 99%
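A rough sketch of that kind of collection step (hypothetical, not the authors' pipeline): pair articles from the two newspapers by lexical overlap and keep the most similar pairs as candidate topic-aligned data. The similarity threshold and the toy articles are assumptions.

```python
# Sketch: pair lexically similar articles from two newspapers via TF-IDF
# cosine similarity. Data and threshold are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

paper_a = ["Parliament passed the reform bill on Tuesday after heated debate."]
paper_b = ["The reform bill cleared parliament this week despite protests."]

vec = TfidfVectorizer().fit(paper_a + paper_b)
sims = cosine_similarity(vec.transform(paper_a), vec.transform(paper_b))

# For each article in paper A, its most lexically similar counterpart in B;
# pairs above the threshold are candidate topic-aligned examples.
pairs = [(i, int(sims[i].argmax()), float(sims[i].max()))
         for i in range(len(paper_a))]
aligned = [(i, j, s) for i, j, s in pairs if s >= 0.5]  # illustrative cut-off
```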
“…Reference-free evaluation. A popular, reference-free alternative is to train evaluation models that discriminate human from model output (e.g., Bruni and Fernández, 2017; Gehrmann et al., 2019; Hashimoto et al., 2019), score the appropriateness of input-output pairs (e.g., Sinha et al., 2020; Fomicheva et al., 2020), or model human judgements directly (e.g., Lowe et al., 2017; De Mattei et al., 2021; Rei et al., 2021). Neural language models themselves have been proposed as evaluators (e.g., Yuan et al., 2021; Deng et al., 2021) and used to assess generations along interpretable evaluation dimensions (Zhong et al., 2022), yet they have been criticised for being biased (toward models similar to the evaluator) and thus limited in their ability to evaluate generated text (Deutsch et al., 2022).…”
Section: Related Work (mentioning)
Confidence: 99%
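The discriminator idea mentioned first in this excerpt can be sketched in a few lines (an assumed toy implementation, not any cited paper's method): train a binary classifier on human versus model-generated text and use its probability of "human" as a reference-free score. The corpora below are placeholders.

```python
# Sketch: human-vs-model discriminator as a reference-free evaluator.
# Training data are toy placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

human_texts = ["The council approved the budget after a long debate."]
model_texts = ["The budget council approved is very approved and nice."]
X = human_texts + model_texts
y = [1] * len(human_texts) + [0] * len(model_texts)  # 1 = human-written

disc = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
disc.fit(X, y)

def human_likeness(texts):
    """Mean discriminator probability that the texts are human-written."""
    return disc.predict_proba(texts)[:, 1].mean()
```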