Using an interactive theorem prover to reason about programs involves a sequence of interactions where the user challenges the theorem prover with conjectures. Invariably, many of the conjectures posed are in fact false, and users often spend considerable effort examining the theorem prover's output before realizing this. We present a synergistic integration of testing with theorem proving, implemented in the ACL2 Sedan (ACL2s), for automatically generating concrete counterexamples. Our method uses the full power of the theorem prover and associated libraries to simplify conjectures; this simplification can transform conjectures for which finding counterexamples is hard into conjectures where finding counterexamples is trivial. In fact, our approach even leads to better theorem proving, e.g., if testing shows that a generalization step leads to a false conjecture, we force the theorem prover to backtrack, allowing it to pursue more fruitful options that may yield a proof. The focus of the paper is on the engineering of a synergistic integration of testing with interactive theorem proving; this includes extending ACL2 with new functionality that we expect to be of general interest. We also discuss our experience in using ACL2s to teach freshman students how to reason about their programs.
No abstract
Abstract. Bitstate hashing in SPIN has proved invaluable in probabilistically detecting errors in large models, but in many cases, the number of omitted states is much higher than it would be if SPIN allowed more than two hash functions to be used. For example, adding just one more hash function can reduce the probability of omitting states at all from 99% to under 3%. Because hash computation accounts for an overwhelming portion of the total execution cost of bitstate verification with SPIN, adding additional independent hash functions would slow down the process tremendously. We present efficient ways of computing multiple hash values that, despite sacrificing independence, give virtually the same accuracy and even yield a speed improvement in the two hash function case when compared to the current SPIN implementation. Another key to accurate bitstate hashing is utilizing as much memory as is available. The current SPIN implementation is limited to only 512MB and allows only power-of-two granularity (256MB, 128MB, etc). However, using 768MB instead of 512MB could reduce the probability of a single omission from 20% to less than one chance in 10,000, which demonstrates the magnitude of both the maximum and the granularity limitation. We have modified SPIN to utilize any addressable amount of memory and use any number of efficiently-computed hash functions, and we present empirical results from extensive experimentation comparing various configurations of our modified version to the original SPIN.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.