2020
DOI: 10.1007/s10664-020-09808-9

Better software analytics via “DUO”: Data mining algorithms using/used-by optimizers

Abstract: This paper claims that a new field of empirical software engineering research and practice is emerging: data mining using/used-by optimizers for empirical studies, or DUO. For example, data miners can generate the models that are explored by optimizers. Also, optimizers can advise how to best adjust the control parameters of a data miner. This combined approach acts like an agent leaning over the shoulder of an analyst that advises "ask this question next" or "ignore that problem, it is not relevant to your go…

Cited by 25 publications (13 citation statements)
References 123 publications
“…Figure 6 provides an overview of the performance of the 17 configurations when run across all corpora. As we can see, a per-corpus configuration is necessary to achieve the lowest perplexity values in topic modelling (Figure 6). Many configuration corpora can be optimised (within 5%) with a large number of configurations (Figure 7, red), however, a particular cluster of Stack Overflow corpora requires specialised configurations.…”
Section: Per-corpus Configuration (mentioning)
confidence: 97%
“…Note that applications of hyperparameter optimization to software engineering is a very large topic. Elsewhere [4] we offer an extensive literature review on hyperparameter optimization and its applications in software engineering. Here, we offer some overview notes.…”
Section: Related Work (mentioning)
confidence: 99%
“…There are other indicator such as inverted generational distance [6] (IGD) and hypervolume [14] (HV). However, prior research suggests GD as a more suitable metric to uniformly reflect the overall quality of the solution set [1]. For example, if one intentionally adds very poor solutions into the solution set, IGD and HV cannot reflect the change in the overall quality of the new solution set.…”
Section: Performance Criteria (mentioning)
confidence: 99%
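
The last statement's point about poor solutions can be illustrated with a small sketch. It is not the citing paper's implementation: it assumes the plain mean-of-nearest-distances form of GD and IGD (other normalisations exist), and the reference front and solution sets below are hypothetical values chosen only for illustration. HV is left out to keep the example short.

```python
import math


def _nearest(point, others):
    """Euclidean distance from `point` to its closest neighbour in `others`."""
    return min(math.dist(point, other) for other in others)


def generational_distance(solutions, reference):
    """Mean distance from each solution to the reference front (assumed GD variant)."""
    return sum(_nearest(s, reference) for s in solutions) / len(solutions)


def inverted_generational_distance(solutions, reference):
    """Mean distance from each reference point to the solution set (assumed IGD variant)."""
    return sum(_nearest(r, solutions) for r in reference) / len(reference)


# Hypothetical two-objective data, both objectives minimised.
reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
solutions = [(0.1, 1.0), (0.5, 0.6), (1.0, 0.1)]

# The same set with one deliberately poor solution added.
degraded = solutions + [(5.0, 5.0)]

gd_before = generational_distance(solutions, reference)
gd_after = generational_distance(degraded, reference)
igd_before = inverted_generational_distance(solutions, reference)
igd_after = inverted_generational_distance(degraded, reference)

print(f"GD : {gd_before:.3f} -> {gd_after:.3f}")
print(f"IGD: {igd_before:.3f} -> {igd_after:.3f}")
```

Under these assumptions GD rises once the poor solution is added, because GD averages over every member of the solution set. IGD does not rise, because it averages over the reference points, and each reference point still has the same nearest solution as before.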