2019
DOI: 10.1016/j.dib.2019.104005

The SEOSS 33 dataset — Requirements, bug reports, code history, and trace links for entire projects

Abstract: This paper provides a systematically retrieved dataset consisting of 33 open-source software projects containing a large number of typed artifacts and trace links between them. The artifacts stem from the projects' issue tracking system and source version control system to enable their joint analysis. Enriched with additional metadata, such as time stamps, release versions, component information, and developer comments, the dataset is highly suitable for empirical research, e.g., in requirements and software t…

Cited by 18 publications (11 citation statements)
References 11 publications (16 reference statements)
“…For QT Creator, we extracted the PR history 3 and bug history 4 until December 2019. For HIVE, we used the version provided by SEOSS 33 [23], a dataset repository that includes data retrieved from several open-source software projects. In the data gathering stage, we used the Perceval tool from GrimoireLab 5, which allows fetching datasets from both GitHub and Jira.…”
Section: Dataset Description and Preprocessing (mentioning)
confidence: 99%
“…commits) and their related data such as author, changed files and linked issues. Rath and Mader [21] published datasets for 33 OSS projects, SEOSS 33. All 33 datasets are available online 3 .…”
Section: Dataset 41 Selecting Datasets (mentioning)
confidence: 99%
“…After each code change, we removed the file if its change type is DELETE, and we added the file if its change type is ADD. • Git does not track RENAME situations explicitly, and the dataset [21] did not share such information about the code changes. When a file is renamed, it is a DELETE and an ADD for Git (if there is no change in the file) 7.…”
Section: Preprocessing (mentioning)
confidence: 99%
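The preprocessing rule quoted above can be illustrated with a minimal sketch. It assumes each code change is available as a (change type, path) pair in chronological order. The function name `replay_changes`, the tuple layout, the `MODIFY` label, and the example paths are all hypothetical and are not taken from the dataset or from the citing paper.

```python
def replay_changes(changes):
    """Replay (change_type, path) pairs in order and return the set of live files.

    Assumed input format: an iterable of 2-tuples such as ("ADD", "src/A.java").
    """
    files = set()
    for change_type, path in changes:
        if change_type == "ADD":
            files.add(path)
        elif change_type == "DELETE":
            # discard() does not fail if the path was never added
            files.discard(path)
        # Any other change type (e.g. a hypothetical "MODIFY") leaves the set unchanged.
    return files


# A rename appears as a DELETE of the old path followed by an ADD of the new one.
history = [
    ("ADD", "src/Old.java"),
    ("MODIFY", "src/Old.java"),
    ("DELETE", "src/Old.java"),
    ("ADD", "src/New.java"),
]
live = replay_changes(history)
```

Under this rule a renamed file is tracked only under its new path, so any history attached to the old path is not carried over unless renames are detected separately.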