Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval) 2022
DOI: 10.18653/v1/2022.humeval-1.8

A Study on Manual and Automatic Evaluation for Text Style Transfer: The Case of Detoxification

Cited by 2 publications (2 citation statements)
“…Yet, in each work, the comparison between models is made by automatic metrics that are not unified, and their choice may be arbitrary (Ostheimer et al, 2023). There are several recent works that studied the correlation between automatic and manual evaluation for text style transfer tasks - formality (Lai et al, 2022a) and toxicity (Logacheva et al, 2022a). Our work presents a new set of metrics for automatic evaluation for English and Russian languages, confirming our choice with correlations with manual metrics.…”
Section: Evaluation Setups
type: supporting
confidence: 64%
“…Here, we present the explanation of labels that annotators had to assign for each of the three evaluation parameters. We adapt the manual annotation process described in (Logacheva et al, 2022a):…”
Section: E Manual Evaluation Instructions
type: mentioning
confidence: 99%