2022
DOI: 10.48550/arxiv.2211.03622
Preprint

Do Users Write More Insecure Code with AI Assistants?

Abstract: We conduct the first large-scale user study examining how users interact with an AI code assistant to solve a variety of security-related tasks across different programming languages. Overall, we find that participants who had access to an AI assistant based on OpenAI's codex-davinci-002 model wrote significantly less secure code than those without access. Additionally, participants with access to an AI assistant were more likely to believe they wrote secure code than those without access to the AI assistant. …
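As a minimal illustrative sketch (ours, not from the paper) of the kind of security-related task the study posed, consider encrypting and decrypting a string with a symmetric key in Python. A common insecure pattern is a hardcoded key with an unauthenticated cipher mode; the safer baseline below uses the Fernet API from the `cryptography` package, which combines AES-CBC with an HMAC integrity check. The function names and task framing are assumptions for illustration only.

from cryptography.fernet import Fernet

def encrypt_message(plaintext: bytes) -> tuple[bytes, bytes]:
    # Generate a fresh random key rather than hardcoding one;
    # Fernet tokens also embed a random IV and an HMAC tag.
    key = Fernet.generate_key()
    return key, Fernet(key).encrypt(plaintext)

def decrypt_message(key: bytes, token: bytes) -> bytes:
    # Raises cryptography.fernet.InvalidToken if the token was tampered with.
    return Fernet(key).decrypt(token)

key, token = encrypt_message(b"attack at dawn")
assert decrypt_message(key, token) == b"attack at dawn"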

Cited by 16 publications (11 citation statements). References 12 publications (18 reference statements).
“…We assume the victim is using a code-suggestion model, and that they will trust the code it suggests with little vetting, so the attacker will accomplish their goal by poisoning the code-suggestion model to induce it to suggest the desired payload in the context of the victim's code. Our assumption is supported by Perry et al. [37], who found that study participants with access to a code-suggestion model often produced more security vulnerabilities than those without access.…”
Section: A. Attacker's Goal (supporting)
confidence: 61%
“…Although training on this data enables code-suggestion models to achieve impressive performance, the security of these models is in question because the code used for training is taken from public sources. Security risks of code suggestions have been confirmed by recent studies [36], [37], where GitHub Copilot and OpenAI Codex models were shown to generate dangerous code suggestions.…”
Section: Introduction (mentioning)
confidence: 75%
“…A recent study found that a state-of-the-art model was more likely to generate code containing a vulnerability if the query asked for code without that vulnerability [25]. Another study found that programmers with artificial intelligence assistants were more likely to believe that they wrote secure code, despite having more insecure code [4]. These findings highlight the need for further research on the interface between programmers and the capabilities of large language models, such as GPT-3.…”
Section: B. Using Artificial Intelligence to Generate Code (mentioning)
confidence: 99%
“…However, when artificial intelligence is asked to generate code, the complexity of the trade-offs may be hidden from the programmer, making it difficult to fully understand and evaluate the code that is generated, often with negative consequences. For example, a recent study found that programmers write more insecure code with artificial intelligence assistants, while they are more likely to believe that they wrote secure code [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…As an example, a future experiment might examine how highlighting strategies impact online metrics such as acceptance rates, or the total proportion of code contributed by the AI system [58]. Likewise, recent work has suggested that people write less secure code when using such AI systems [46], so future work could examine whether highlighting strategies ameliorate this risk.…”
Section: Representativeness of Tasks, Scenarios, and Participants (mentioning)
confidence: 99%