Proceedings of the 13th International Conference on Mining Software Repositories 2016
DOI: 10.1145/2901739.2901753

Adressing problems with external validity of repository mining studies through a smart data platform

Abstract: Research in software repository mining has grown considerably the last decade. Due to the data-driven nature of this venue of investigation, we identified several problems within the current state-of-the-art that pose a threat to the external validity of results. The heavy re-use of data sets in many studies may invalidate the results in case problems with the data itself are identified. Moreover, for many studies data and/or the implementations are not available, which hinders a replication of the results and…

Cited by 15 publications (5 citation statements)
References 33 publications
“…Metric-oriented frameworks, such as OSSMETER [1] and RepoGrams [13], focus on collecting data for producing metrics, such as software quality, static source code, and changes. Workflow-oriented frameworks like SmartSHARK [14], CODEMINE [3], and BOA [4], aim to provide a shared environment for data analysis purposes. RestMule can act as a complementary component, which can help with the activity of data collection from remote APIs in such systems.…”
Section: Discussion (mentioning)
confidence: 99%
“…The subjects of study are represented by a set of 21 publicly available open source software projects (i.e. the set employed by Trautsch et al [14]).…”
Section: Methods (mentioning)
confidence: 99%
“…We focus the related work discussion in two areas: compilability [9,10,13,15,16,20] and software quality evolution [1,7,17,18,19,22] by commit-level.…”
Section: Related Work (mentioning)
confidence: 99%
“…It mines every commit in the history of software and applies a lightweight static analysis technique on files affected by the commit to check if a code smell is introduced. SmartShark [18,19] is a distributed framework designed to address the problems with external validity of mining software repository studies. It runs static analysis on affected files by every commit.…”
Section: Related Work (mentioning)
confidence: 99%