Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace?

Gröndahl, Tommi; Asokan, N.

doi:10.48550/arxiv.1902.08939

Cited by 1 publication

(2 citation statements)

References 87 publications

(194 reference statements)

“…In recent years, the new research field author obfuscation (AO) evolved, which concerns itself with the task to fool AA or AV methods in a way that the true author cannot be correctly recognized anymore. To achieve this, AO approaches which, according to Gröndahl and Asokan [7] can be divided into manual, computer-assisted and automatic types, perform a variety of modifications on the texts. These include simple synonym replacements, rule-based substitutions or word order permutations.…”

Section: Related Work (mentioning)

confidence: 99%

“…As a first corpus, we compiled C_DBLP that represents a collection of 80 excerpts from scientific works including papers, dissertations, book chapters and technical reports, which we have chosen from the well-known Digital Bibliography & Library Project (DBLP) platform. Overall, the documents were written by 40 researchers, where for each author A, there are exactly two documents.…”

Section: DBLP Corpus (mentioning)

confidence: 99%

Assessing the Applicability of Authorship Verification Methods

Halvani, Winter, Graner (2019)

Proceedings of the 14th International Conference on Availability, Reliability and Security

Authorship verification (AV) is a research subject in the field of digital text forensics that concerns itself with the question, whether two documents have been written by the same person. During the past two decades, an increasing number of proposed AV approaches can be observed. However, a closer look at the respective studies reveals that the underlying characteristics of these methods are rarely addressed, which raises doubts regarding their applicability in real forensic settings. The objective of this paper is to fill this gap by proposing clear criteria and properties that aim to improve the characterization of existing and future AV approaches. Based on these properties, we conduct three experiments using 12 existing AV approaches, including the current state of the art. The examined methods were trained, optimized and evaluated on three self-compiled corpora, where each corpus focuses on a different aspect of applicability. Our results indicate that part of the methods are able to cope with very challenging verification cases such as 250 characters long informal chat conversations (72.7% accuracy) or cases in which two scientific documents were written at different times with an average difference of 15.6 years (> 75% accuracy). However, we also identified that all involved methods are prone to cross-topic verification cases.
