Cross-Domain Detection of GPT-2-Generated Technical Text

Rodríguez, Juan Carlos; Hay, Todd A.; Gros, David; Shamsi, Zain; Srinivasan, Ravi

doi:10.18653/v1/2022.naacl-main.88

Cited by 14 publications

(5 citation statements)

References 30 publications

Section: Significance (mentioning)

confidence: 54%

“…ChatGPT has raised concerns about academic integrity (Cotton et al, 2023), and it was suggested that AI content detectors could be used to detect generated essays (Rodriguez et al, 2022). The findings from this research imply that AI content detectors are not always accurate when detecting generated essays, similar to the result from existing research (Rodriguez et al, 2022). In addition, the inaccuracy of detectors was amplified when the essays were edited, signaling that developers could improve the detection algorithm to adapt essays that students had edited.…”

Section: Significance (mentioning)

confidence: 55%

The Limits of AI Content Detectors

Wu, Flanagan. 2023. J Stud Res

As ChatGPT became a popular and powerful language model used by people worldwide in 2023, the problem of students using it to cheat on schoolwork became palpable. While many existing AI content detectors can detect AI-generated texts, such as GPT-2 Content Detector and GPTZero, the accuracy of an AI content detector in detecting generated essays that have been post-edited by humans is unknown. This research discovered the limitations of the GPT-2 Content Detector and answered the question, “How does human post-editing of AI-generated high school English essays affect the result of an AI content detector?” Ten English essays were generated using ChatGPT Plus based on prompts from high school English teachers. Each essay was then edited in 5 different ways to create pairs of unedited and edited essays. All unedited and edited essays were evaluated using GPT-2 Output Detector Demo, and then the results from the detector were studied and analyzed. It was found that introducing spelling mistakes in generated essays and processing the essays with QuillBot will make the result of AI content detectors less accurate. The findings from this research can be used as a guide for companies developing AI-generated text detectors, making them more accurate when dealing with edited generated text. The findings can also be helpful for schools and educators, because knowing that students can edit essays to bypass AI content detectors, educators can develop new ways to examine students’ writing ability.

“…This research line aims to differentiate texts generated by machines from those authored by humans (Crothers et al, 2023), thus contributing to accountability and transparency in various domains. This challenge has been addressed from different angles including statistical, feature-based methods (Nguyen-Son et al, 2017;Fröhling and Zubiaga, 2021) and neural approaches (Rodriguez et al, 2022;Zhan et al, 2023). Yet, Crothers et al (2022) recently concluded that except from neural methods, the other approaches have little capacity to identify modern machine-generated texts.…”

Section: Identification Of Synthetically-generated Text (mentioning)

confidence: 99%

Contrasting Linguistic Patterns in Human and LLM-Generated News Text

Muñoz-Ortiz, Gómez-Rodríguez, Vilares. 2024. Preprint

We conduct a quantitative analysis contrasting human-written English news text with comparable large language model (LLM) output from six different LLMs that cover three different families and four sizes in total. Our analysis spans several measurable linguistic dimensions, including morphological, syntactic, psychometric, and sociolinguistic aspects. The results reveal various measurable differences between human and AI-generated texts. Human texts exhibit more scattered sentence length distributions, more variety of vocabulary, a distinct use of dependency and constituent types, shorter constituents, and more optimized dependency distances. Humans tend to exhibit stronger negative emotions (such as fear and disgust) and less joy compared to text generated by LLMs, with the toxicity of these models increasing as their size grows. LLM outputs use more numbers, symbols and auxiliaries (suggesting objective language) than human texts, as well as more pronouns. The sexist bias prevalent in human text is also expressed by LLMs, and even magnified in all of them but one. Differences between LLMs and humans are larger than between LLMs.

“…Supervised classifiers are models specifically trained to discriminate human-written and LLM-generated texts with labels. The classifiers range from classical methods (Ippolito et al 2020;Crothers, Japkowicz, and Viktor 2023) to neural-based methods (Solaiman et al 2019;Bakhtin et al 2019;Uchendu et al 2020;Rodriguez et al 2022;Guo et al 2023).…”

Section: Related Work (mentioning)

confidence: 99%

OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples

Koike, Kaneko, Okazaki. 2024. AAAI

Large Language Models (LLMs) have achieved human-level fluency in text generation, making it difficult to distinguish between human-written and LLM-generated texts. This poses a growing risk of misuse of LLMs and demands the development of detectors to identify LLM-generated texts. However, existing detectors lack robustness against attacks: they degrade detection accuracy by simply paraphrasing LLM-generated texts. Furthermore, a malicious user might attempt to deliberately evade the detectors based on detection results, but this has not been assumed in previous studies. In this paper, we propose OUTFOX, a framework that improves the robustness of LLM-generated-text detectors by allowing both the detector and the attacker to consider each other's output. In this framework, the attacker uses the detector's prediction labels as examples for in-context learning and adversarially generates essays that are harder to detect, while the detector uses the adversarially generated essays as examples for in-context learning to learn to detect essays from a strong attacker. Experiments in the domain of student essays show that the proposed detector improves the detection performance on the attacker-generated texts by up to +41.3 points F1-score. Furthermore, the proposed detector shows a state-of-the-art detection performance: up to 96.9 points F1-score, beating existing detectors on non-attacked texts. Finally, the proposed attacker drastically degrades the performance of detectors by up to -57.0 points F1-score, massively outperforming the baseline paraphrasing method for evading detection.
