Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua 2022
DOI: 10.18653/v1/2022.naacl-main.88

Cross-Domain Detection of GPT-2-Generated Technical Text

Abstract: Machine-generated text presents a potential threat not only to the public sphere, but also to the scientific enterprise, whereby genuine research is undermined by convincing, synthetic text. In this paper we examine the problem of detecting GPT-2-generated technical research text. We first consider the realistic scenario where the defender does not have full information about the adversary's text generation pipeline, but is able to label small amounts of in-domain genuine and synthetic text in order to adapt t…

Cited by 14 publications
(5 citation statements)
References 30 publications
“…ChatGPT has raised concerns about academic integrity (Cotton et al, 2023), and it was suggested that AI content detectors could be used to detect generated essays (Rodriguez et al, 2022). The findings from this research imply that AI content detectors are not always accurate when detecting generated essays, similar to the result from existing research (Rodriguez et al, 2022).…”
Section: Significance (mentioning)
confidence: 54%
“…ChatGPT has raised concerns about academic integrity (Cotton et al, 2023), and it was suggested that AI content detectors could be used to detect generated essays (Rodriguez et al, 2022). The findings from this research imply that AI content detectors are not always accurate when detecting generated essays, similar to the result from existing research (Rodriguez et al, 2022). In addition, the inaccuracy of detectors was amplified when the essays were edited, signaling that developers could improve the detection algorithm to adapt essays that students had edited.…”
Section: Significance (mentioning)
confidence: 55%
“…This research line aims to differentiate texts generated by machines from those authored by humans (Crothers et al, 2023), thus contributing to accountability and transparency in various domains. This challenge has been addressed from different angles including statistical, feature-based methods (Nguyen-Son et al, 2017; Fröhling and Zubiaga, 2021) and neural approaches (Rodriguez et al, 2022; Zhan et al, 2023). Yet, Crothers et al (2022) recently concluded that except from neural methods, the other approaches have little capacity to identify modern machine-generated texts.…”
Section: Identification Of Synthetically-generated Text (mentioning)
confidence: 99%
“…Supervised classifiers are models specifically trained to discriminate human-written and LLM-generated texts with labels. The classifiers range from classical methods (Ippolito et al 2020; Crothers, Japkowicz, and Viktor 2023) to neural-based methods (Solaiman et al 2019; Bakhtin et al 2019; Uchendu et al 2020; Rodriguez et al 2022; Guo et al 2023).…”
Section: Related Work (mentioning)
confidence: 99%