2015 48th Hawaii International Conference on System Sciences 2015
DOI: 10.1109/hicss.2015.622

Data Mining Behavioral Transitions in Open Source Repositories

Abstract: Open-source repository data can be automatically mined using sequence mining methods to provide high-level feedback on project status. GitHub.com projects are acquired, sequence-mined, clustered, and regressed to analyze project characteristics. Such results can be presented to project managers as part of a display generated by an automated monitoring system. Such monitoring systems provide high-level feedback in real time. This project is a preliminary step in a larger research project aimed at understanding…

Cited by 3 publications (1 citation statement)
References 34 publications (31 reference statements)
“…New data can then be allocated to the existing clusters or used to continuously refine and redefine the existing clusters. In software assessment, clustering can be used to identify common patterns among developers in order to steer development guidelines and training focus [109,150]. Clustering can even be used for directed data mining tasks, depending on the similarity of the data items in each cluster with regard to a target characteristic, which determines the ability of the identified clusters to separate the possible values of the target characteristic.…”
Section: Undirected Data Mining (citation type: mentioning)
confidence: 99%