This paper introduces Cloud9, a platform for automated testing of real-world software. Our main contribution is the scalable parallelization of symbolic execution on clusters of commodity hardware, to help cope with path explosion. Cloud9 provides a systematic interface for writing "symbolic tests" that concisely specify entire families of inputs and behaviors to be tested, thus improving testing productivity. Cloud9 can handle not only single-threaded programs but also multi-threaded and distributed systems. It includes a new symbolic environment model that is the first to support all major aspects of the POSIX interface, such as processes, threads, synchronization, networking, IPC, and file I/O. We show that Cloud9 can automatically test real systems, like memcached, Apache httpd, lighttpd, the Python interpreter, rsync, and curl. We show how Cloud9 can use existing test suites to generate new test cases that capture untested corner cases (e.g., network stream fragmentation). Cloud9 can also diagnose incomplete bug fixes by analyzing the difference between buggy paths before and after a patch.
Conventional data warehouses employ the query-at-a-time model, which maps each query to a distinct physical plan. When several queries execute concurrently, this model introduces contention, because the physical plans-unaware of each other-compete for access to the underlying I/O and computation resources. As a result, while modern systems can efficiently optimize and evaluate a single complex data analysis query, their performance suffers significantly when multiple complex queries run at the same time.We describe an augmentation of traditional query engines that improves join throughput in large-scale concurrent data warehouses. In contrast to the conventional query-at-a-time model, our approach employs a single physical plan that can share I/O, computation, and tuple storage across all in-flight join queries. We use an "alwayson" pipeline of non-blocking operators, coupled with a controller that continuously examines the current query mix and performs run-time optimizations. Our design allows the query engine to scale gracefully to large data sets, provide predictable execution times, and reduce contention. In our empirical evaluation, we found that our prototype outperforms conventional commercial systems by an order of magnitude for tens to hundreds of concurrent queries.
Automatic Failure-Path Inference (AFPI) is an application-generic, automatic technique for dynamically discovering the failure dependency graphs of componentized Internet applications. AFPI's first phase is invasive, and relies on controlled fault injection to determine failure propagation; this phase requires no a priori knowledge of the application and takes on the order of hours to run. Once the system is deployed in production, the second, non-invasive phase of AFPI passively monitors the system, and updates the dependency graph as new failures are observed. This process is a good match for the perpetually-evolving software found in Internet systems; since no performance overhead is introduced, AFPI is feasible for live systems. We applied AFPI to J2EE and tested it by injecting Java exceptions into an e-commerce application and an online auction service. The resulting graphs of exception propagation are more detailed and accurate than what could be derived by time-consuming manual inspection or analysis of readily-available static application descriptions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.