Proceedings of the 17th International Conference on Mining Software Repositories 2020
DOI: 10.1145/3379597.3387478

Detecting and Characterizing Bots that Commit Code

Abstract: Background: Some developer activity traditionally performed manually, such as making code commits or opening, managing, or closing issues, is increasingly subject to automation in many OSS projects. Specifically, such activity is often performed by tools that react to events or run at specific times. We refer to such automation tools as bots and, in many software mining scenarios related to developer productivity or code quality, it is desirable to identify bots in order to separate their actions from actions of …

Cited by 63 publications (45 citation statements)
References 38 publications
“…Also bots make a significant number of contributions, so it is important to filter them out carefully. Our end sample excludes over 99% of the bots identified in a recent paper on GitHub bot detection, which we discovered after our analyses were completed [47]. We kept the remaining developers who had at least 100 commits, were assigned a "USR" type (excluding organization accounts), and who had an associated geolocation from GHTorrent, leaving 433,138 developers.…”
Section: B Data Processing
Citation type: mentioning
confidence: 99%
“…Further application of our approach might include: a) detecting if a developer is actually a bot by analyzing the concentration of their skill vector (similar to [49], [50]); b) checking the alignment between skill vectors of different developers for identity resolution (similar to [47]); c) analyzing the skill vectors of the developers in a project to infer the transparency of the corresponding software supply chain [51], [52], [53], [54], [55], [56].…”
Section: Limitations
Citation type: mentioning
confidence: 99%
“…(II) Bots. Bots [21] may threaten our calculation of text analysis, e.g., sentiment, toxicity, and emotion. We mitigate this by adopting a tool to identify bots in the OSS projects [11].…”
Section: Threats To Validity
Citation type: mentioning
confidence: 99%