XCorpus – An executable Corpus of Java Programs.

Dietrich, Jens; Schole, Henrik; Sui, Li; Tempero, Ewan

doi:10.5381/jot.2017.16.4.a1

Cited by 27 publications

(12 citation statements)

References 10 publications

(12 reference statements)

“…We aim at improving current automated testing tools in a way that they avoid the generation of smelly test suites. Furthermore, we aim at replicating the study taking into account testing tools that work on different programming languages as well as different datasets (e.g., the Xcorpus one [100]).…”

Section: Results (mentioning)

confidence: 99%

“…Furthermore, we conducted the experiments on a dataset composed of a large number of classes extracted from the SF110 dataset [26]. While previous research [7,24] widely exploited such a dataset, experimenting a different one (e.g., XCorpus [100]) would increases the generalizability of the results. This is part of our future agenda.…”

Section: Threats To External Validity (mentioning)

confidence: 99%

Scented since the beginning: On the diffuseness of test smells in automatically generated test code

Grano, Palomba, Nucci, et al. 2019

Journal of Systems and Software

Software testing represents a key software engineering practice to ensure source code quality and reliability. To support developers in this activity and reduce testing effort, several automated unit test generation tools have been proposed. Most of these approaches have the main goal of covering as many branches as possible. While these approaches have good performance, little is still known on the maintainability of the test code they produce, i.e., whether the generated tests have a good code quality and if they do not possibly introduce issues threatening their effectiveness. To bridge this gap, in this paper we study to what extent existing automated test case generation tools produce potentially problematic test code. We consider seven test smells, i.e., suboptimal design choices applied by programmers during the development of test cases, as a measure of code quality of the generated tests, and evaluate their diffuseness in the unit test classes automatically generated by three state-of-the-art tools such as Randoop, JTExpert, and Evosuite. Moreover, we investigate whether there are characteristics of test and production code influencing the generation of smelly tests. Our study shows that all the considered tools tend to generate a high quantity of two specific test smell types, i.e., Assertion Roulette and Eager Test, which are those that previous studies showed to negatively impact the reliability of production code. We also discover that test size is correlated with the generation of smelly tests. Based on our findings, we argue that more effective automated generation algorithms that explicitly take into account test code quality should be further investigated and devised.

“…A number of prior publications selected and curated corpora of projects, for performance benchmarking [Blackburn et al 2006], for static analysis [Tempero et al 2010], for dynamic analysis [Dietrich et al 2017b], and for repository mining in general [Allamanis and Sutton 2013]. Lopes et al [2017] conducted a study to measure code duplication in GitHub.…”

Section: Code Corpora (mentioning)

confidence: 99%

Casting about in the dark: an empirical study of cast operations in Java programs

Mastrangelo, Hauswirth, Nystrom 2019

Proc. ACM Program. Lang.

The main goal of a static type system is to prevent certain kinds of errors from happening at run time. A type system is formulated as a set of constraints that gives any expression or term in a program a well-defined type. Yet mainstream programming languages are endowed with type systems that provide the means to circumvent their constraints through casting. We want to understand how and when developers escape the static type system to use dynamic typing. We empirically study how casting is used by developers in more than seven thousand Java projects. We find that casts are widely used (8.7% of methods contain at least one cast) and that 50% of casts we inspected are not guarded locally to ensure against potential run-time errors. To help us better categorize use cases and thus understand how casts are used in practice, we identify 25 cast-usage patterns: recurrent programming idioms using casts to solve a specific issue. This knowledge can be: (a) a recommendation for current and future language designers to make informed decisions (b) a reference for tool builders, e.g., by providing more precise or new refactoring analyses, (c) a guide for researchers to test new language features, or to carry out controlled programming experiments, and (d) a guide for developers for better practices. CCS Concepts: • Software and its engineering → General programming languages; Object oriented languages; Software libraries and repositories.

“…Standard datasets have been widely used to support research in many other areas of computer science. For instance, the programming language and software engineering communities use datasets such as DaCapo [17] and Qualitas Corpus/XCorpus [18], [19] for benchmarking and empirical studies on source code. Sourcerer [20] is an infrastructure for large-scale collection and analysis of open source code.…”

Section: Related Work (mentioning)

confidence: 99%

GHTraffic: A Dataset for Reproducible Research in Service-Oriented Computing

Bhagya, Dietrich, Guesgen, et al. 2018

2018 IEEE International Conference on Web Services (ICWS)

Self Cite

We present GHTraffic, a dataset of significant size comprising HTTP transactions extracted from GitHub data and augmented with synthetic transaction data. The dataset facilitates reproducible research on many aspects of service-oriented computing. This paper discusses use cases for such a dataset and extracts a set of requirements from these use cases. We then discuss the design of GHTraffic, and the methods and tool used to construct it. We conclude our contribution with some selective metrics that characterise GHTraffic.
