2021
DOI: 10.1088/1742-6596/2134/1/012011
|View full text |Cite
|
Sign up to set email alerts
|

Explainable source code authorship attribution algorithm

Abstract: Source Code Authorship Attribution is a problem that is lately studied more often due improvements in Deep Learning techniques. Among existing solutions, two common issues are inability to add new authors without retraining and lack of interpretability. We address both these problem. In our experiments, we were able to correctly classify 75% of authors for diferent programming languages. Additionally, we applied techniques of explainable AI (XAI) and found that our model seems to pay attention to distinctive f… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

0
5
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
2
1

Relationship

0
3

Authors

Journals

citations
Cited by 3 publications
(5 citation statements)
references
References 25 publications
0
5
0
Order By: Relevance
“…The accuracy of this method was 42% on 400 Python authors. Bogdanova and Romanov [114] addressed the inability to add new authors without retraining and the lack of interpretability in source code authorship attribution. They trained a convolutional neural network to generate the vector representation for files.…”
Section: Deep Learning Modelsmentioning
confidence: 99%
See 2 more Smart Citations
“…The accuracy of this method was 42% on 400 Python authors. Bogdanova and Romanov [114] addressed the inability to add new authors without retraining and the lack of interpretability in source code authorship attribution. They trained a convolutional neural network to generate the vector representation for files.…”
Section: Deep Learning Modelsmentioning
confidence: 99%
“…Bogdanova and Romanov [114] presented the Saliency map algorithm for the Source Sode Authorship Attribution (SCAA), which is interpretable. It assigned an importance value to each input parameter.…”
Section: Explanation Of Attribution Through Featuresmentioning
confidence: 99%
See 1 more Smart Citation
“…In their recent research, He et al [35] provided a comprehensive examination of the methods, models, datasets, feature types, and evaluation metrics employed in author attribution studies conducted for both source code and English text. The survey included two deep learning studies [36,37] focused on source code author attribution. These studies introduce interpretable models and introduce the concept of saliency maps to enhance model interpretability.…”
Section: Deep Learning Architecturesmentioning
confidence: 99%
“…The initial model [36] achieved a 42% accuracy, utilizing an NN for embedding projection, alongside tSNE for visualization and the KNN classification algorithm for predictions. Similarly, the second study [37] utilized a CNN for embedding generation and, like its predecessor, employed a KNN classification method, resulting in a 70% accuracy rate. For authorship identification on text, various approaches including multiheaded RNNs [38] have shown promising results.…”
Section: Deep Learning Architecturesmentioning
confidence: 99%