2018
DOI: 10.1515/popets-2018-0007

Recognizing and Imitating Programmer Style: Adversaries in Program Authorship Attribution

Abstract: Source code attribution classifiers have recently become powerful. We consider the possibility that an adversary could craft code with the intention of causing a misclassification, i.e., creating a forgery of another author's programming style in order to hide the forger's own identity or blame the other author. We find that it is possible for a non-expert adversary to defeat such a system. In order to inform the design of adversarially resistant source code attribution classifiers, we conduct two studies with…

Cited by 26 publications
(25 citation statements)
References 14 publications
“…The applications of AA are vast and include: assigning authorship to literature/text, and ascertaining the demography of an author (e.g., age, gender, native language) (López-Monroy et al, 2020). AA can also be applied to predicting author(s) of source code (Simko et al, 2018), chatbot detection, and even detecting authors intentionally trying to mask their writing style (Juola, 2012; Sánchez-Junquera et al, 2020). Finally, our work bears similarity to (Manjavacas et al, 2017), which investigates the stylistic properties of different neural text generation techniques (i.e., Ngram-based and RNN-based).…”
Section: Applications Of Authorship Attribution (mentioning)
confidence: 99%
“…The data is shared publicly to enable further research in this area. This dataset is a good representative of real-world code examples, as opposed to the competitive coding examples, used by most of the other works [24,3,7,1]. The problems with large competitive coding datasets that give near-perfect accuracy are thoroughly discussed in [5].…”
Section: Dataset (mentioning)
confidence: 99%
“…The overall goal of such a software is to help to identify the authors of malicious software. This domain has been very active in the last years [7,11,19]. Our tool is designed to identify coding style pattern used by PDF producer tools to detect PDF producer tool.…”
Section: Related Work (mentioning)
confidence: 99%