2017
DOI: 10.1007/s10703-016-0264-5
Empirical software metrics for benchmarking of verification tools

Abstract: We study empirical metrics for software source code, which can predict the performance of verification tools on specific types of software. Our metrics comprise variable usage patterns, loop patterns, as well as indicators of control-flow complexity, and are extracted by simple data-flow analyses. We demonstrate that our metrics are powerful enough to devise a machine-learning based portfolio solver for software verification. We show that this portfolio solver would be the (hypothetical) overall winner of the i…
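To make the portfolio idea concrete, here is a minimal sketch of a metrics-based portfolio selector. The feature names, training data, and tool names are hypothetical illustrations, not the paper's actual feature set, training corpus, or learning algorithm; a simple 1-nearest-neighbour rule stands in for the machine-learning component:

```python
def distance(a, b):
    # Euclidean distance between two metric vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_tool(metrics, training):
    # 1-nearest-neighbour choice: pick the tool that performed best
    # on the most similar previously seen program.
    best = min(training, key=lambda row: distance(metrics, row[0]))
    return best[1]

# Hypothetical training data:
# (loop_count, pointer_vars, branch_density) -> best-performing verifier
training = [
    ((12, 0, 0.8), "bounded-model-checker"),
    ((1, 25, 0.2), "abstraction-based-checker"),
    ((0, 2, 0.1), "symbolic-executor"),
]

# A loop-heavy program lands nearest the first training point.
print(select_tool((10, 1, 0.7), training))  # -> bounded-model-checker
```

In the paper's setting the feature vectors would come from the data-flow analyses over the benchmark programs, and the selector would be trained on past competition results rather than a hand-written table.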

Cited by 28 publications (30 citation statements) · References 23 publications
“…The advantage of using a diverse set of models is that we can identify the most suitable application areas. Furthermore, we compare lower-level parameters of CEGAR, as opposed to most experiments in the literature [11,19,36,37], where different algorithms or tools are compared. We formulate and address a research question related to the effectiveness and efficiency of each of our contributions.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
“…Experimental evaluation There are many works in the literature that focus on experimental evaluation and comparison of model checking algorithms [11,19,36,37]. However, they usually focus on a certain domain (e.g., SV-COMP).…”
Section: Multiple Refinements For a Counterexample (mentioning)
confidence: 99%
“…The metrics to classify programs proposed in [13] are related to three different aspects: variable roles, loop patterns, and control flow.…”
Section: Program Metrics (mentioning)
confidence: 99%
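As a toy illustration of extracting metrics along these three aspects, the sketch below counts loop and branch constructs, using Python's `ast` module as a stand-in for the paper's data-flow analyses over C programs; the metric names are hypothetical:

```python
import ast

def structural_metrics(source):
    # Count loop and branch constructs in a (Python) program as a toy
    # stand-in for the loop-pattern and control-flow metrics the paper
    # extracts from C programs via data-flow analysis.
    tree = ast.parse(source)
    loops = sum(isinstance(n, (ast.For, ast.While)) for n in ast.walk(tree))
    branches = sum(isinstance(n, ast.If) for n in ast.walk(tree))
    return {"loops": loops, "branches": branches}

program = """
total = 0
for i in range(10):
    if i % 2:
        total += i
while total > 0:
    total -= 1
"""
print(structural_metrics(program))  # -> {'loops': 2, 'branches': 1}
```

A real implementation would additionally classify how each variable is used (counter, bound, flag, …) to obtain the variable-role features.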
“…In Section 2 we summarize the program metrics exploited in [13]. We propose corresponding metrics for term rewrite systems in Section 3.…”
Section: Introduction (mentioning)
confidence: 99%
“…The reason for this is that the analysis time of a SAT or SMT problem can vary significantly between two problems of the same size or between two solvers, due to the nature of modern solver algorithms [15]. For instance, compare adpcmencode (program size 4,911 steps, timeout after 1 minute) with ndes (program size 5,727 steps, solved in less than one second).…”
Section: Reduction Of Computational Effort (mentioning)
confidence: 99%