Alastair Murray scite author profile

Compiler fuzzing through deep learning

Random program generation (fuzzing) is an effective technique for discovering bugs in compilers, but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real-world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state of the art, require 3.03× less time to generate and evaluate, and expose bugs which the state of the art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL, versus the state of the art taking 9 man-months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.

CCS Concepts: Software and its engineering → Software testing and debugging.
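The differential testing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each compiler under test is wrapped as a callable that maps a test input to an observable outcome; the `Compiler` type, the `differential_test` function, and the three toy "compilers" are hypothetical stand-ins for illustration, not DeepSmith's actual harness. A compiler is flagged only when a strict majority of the others agree on a different outcome.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical stand-in for a compiler: maps a test input to an observable
# outcome (for example the program's output, or the string "crash").
Compiler = Callable[[int], str]


def differential_test(compilers: Dict[str, Compiler], test_input: int) -> List[str]:
    """Return the names of compilers whose outcome disagrees with the majority."""
    outcomes = {name: run(test_input) for name, run in compilers.items()}
    majority, votes = Counter(outcomes.values()).most_common(1)[0]
    # Without a strict majority there is no trustworthy reference outcome.
    if votes * 2 <= len(outcomes):
        return []
    return sorted(name for name, out in outcomes.items() if out != majority)


# Three toy "compilers"; the third has a seeded bug for large inputs.
compilers = {
    "a": lambda x: str(x * 2),
    "b": lambda x: str(x + x),
    "c": lambda x: str(x * 2) if x < 10 else "crash",
}

print(differential_test(compilers, 3))   # all three agree
print(differential_test(compilers, 12))  # "c" deviates from the majority
```

Majority voting is one common way to pick a reference outcome when no ground truth is available; it needs at least three implementations to be meaningful.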

Code transformation and instruction set extension

Murray, Bennett, Franke, et al. 2009. ACM Trans. Embed. Comput. Syst.

The demand for flexible embedded solutions and short time-to-market has led to the development of extensible processors that allow for customization through user-defined instruction set extensions (ISEs). These are usually identified from plain C sources. In this article, we propose a combined exploration of code transformations and ISE identification. The resulting performance of such a combination has been measured on two benchmark suites. Our results demonstrate that combined code transformations and ISEs can yield average performance improvements of 49%. This outperforms ISEs when applied in isolation, and in extreme cases yields a speed-up of 2.85.

Compute Aorta

Murray, Crawford. 2020

Compiling for automatically generated instruction set extensions

Murray, Franke. 2012

The automatic generation of instruction set extensions (ISEs) to provide application-specific acceleration for embedded processors has been a productive area of research in recent years. The use of automatic algorithms, however, results in instructions that are radically different from those found in conventional ISAs. This has resulted in a gap between the hardware's capabilities and the compiler's ability to exploit them. This paper proposes an innovative high-level compiler pass that uses subgraph isomorphism checking to exploit these complex instructions. Our extended code generator also enables the reuse of ISEs designed for one application in another, which may be a newer version of the same application or a different one from the same domain. Operating in a separate pass permits computationally expensive techniques to be applied that are uniquely suited for mapping complex instructions, but unsuitable for conventional instruction selection. We demonstrate that this targeted use of an expensive algorithm effectively controls overall compilation time. The existing, mature, compiler back-end can then handle the remainder of the compilation. Instructions are automatically produced for 179 benchmarks, resulting in a total of 1965 unique instructions. The high-level pass integrated into the open-source GCC compiler is able to use the instructions produced for each benchmark to obtain an average speed-up of 1.26 for the ENCORE extensible processor.
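To make the idea of matching a complex instruction against a program's dataflow graph more concrete, here is a small backtracking sketch. It is an assumption-laden illustration, not the pass described in the abstract: graphs are plain dictionaries and edge sets, the function name `find_match` is hypothetical, operand order is ignored, and the match is non-induced (extra edges in the target are tolerated).

```python
from typing import Dict, Optional, Set, Tuple

# A graph is (node id -> opcode, set of directed edges); hypothetical encoding.
Graph = Tuple[Dict[str, str], Set[Tuple[str, str]]]


def find_match(pattern: Graph, target: Graph) -> Optional[Dict[str, str]]:
    """Return one mapping of pattern nodes onto target nodes, or None."""
    p_ops, p_edges = pattern
    t_ops, t_edges = target
    p_nodes = sorted(p_ops)

    def consistent(mapping: Dict[str, str], p: str, t: str) -> bool:
        # Every pattern edge between p and an already-mapped node must also
        # exist between the corresponding target nodes.
        for q, u in mapping.items():
            if (p, q) in p_edges and (t, u) not in t_edges:
                return False
            if (q, p) in p_edges and (u, t) not in t_edges:
                return False
        return True

    def extend(i: int, mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
        if i == len(p_nodes):
            return dict(mapping)
        p = p_nodes[i]
        used = set(mapping.values())
        for t in sorted(t_ops):
            # Candidates must be unused and carry the same opcode.
            if t in used or t_ops[t] != p_ops[p]:
                continue
            if consistent(mapping, p, t):
                mapping[p] = t
                found = extend(i + 1, mapping)
                if found is not None:
                    return found
                del mapping[p]  # backtrack
        return None

    return extend(0, {})


# Pattern: a multiply feeding an add (a multiply-accumulate shape).
mac = ({"m": "mul", "a": "add"}, {("m", "a")})
# Target: load -> mul -> add -> store.
block = (
    {"n1": "load", "n2": "mul", "n3": "add", "n4": "store"},
    {("n1", "n2"), ("n2", "n3"), ("n3", "n4")},
)
print(find_match(mac, block))
```

Backtracking search of this kind is exponential in the worst case, which is consistent with the abstract's point that such matching is best confined to a separate pass rather than used for all instruction selection.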

Popcorn

Barbalace, Sadini, Ansary, et al. 2015

The recent possibility of integrating multiple-OS-capable, high-core-count, heterogeneous-ISA processors in the same platform poses a question: given the tight integration between system components, can a shared memory programming model be adopted, enhancing programmability? If this can be done, an enormous amount of existing code written for shared memory architectures would not have to be rewritten to use a new programming paradigm (e.g., code offloading), a process that is often very expensive and error prone. We propose a new software architecture that is composed of an operating system and a compiler framework to run ordinary shared memory applications, written for homogeneous machines, on OS-capable heterogeneous-ISA machines. Applications run transparently amongst different ISA processors while exploiting the most optimized instruction set for each code block. We have implemented and tested our system, called Popcorn, on a multi-core Intel Xeon machine with a PCIe Intel Xeon Phi to demonstrate the viability of our approach. Application execution on Popcorn is shown to be up to 52% faster than the most performant native execution on Linux, on either Xeon or Xeon Phi, while relieving the programmer of the burden of adopting a programming model other than shared memory on a heterogeneous system. When compared to an offloading programming model, Popcorn is shown to be up to 6.2 times faster.

Kernel composition in SYCL

Potter, Keir, Bradford, et al. 2015

Fast source-level data assignment to dual memory banks

Murray, Franke. 2008

Because of their streaming nature, most digital signal processing applications depend critically on memory bandwidth. To accommodate these bandwidth requirements, digital signal processors are typically equipped with dual memory banks that enable simultaneous access to two operands if the data is partitioned appropriately. Fully automated and compiler-integrated approaches to data partitioning and memory bank assignment, however, have found little acceptance among DSP software developers. This is partly due to their inflexibility and inability to cope with certain manual data pre-assignments, e.g. due to I/O constraints. In this paper we present a different and more flexible approach, namely source-level dual memory assignment, where code generation targets DSP-C, a standardised C language extension widely supported by industrial C compilers for DSPs. Additionally, we present a novel partitioning algorithm based on soft colouring that is more efficient and scalable than the best currently known integer linear programming algorithm, whilst achieving competitive code quality. We have evaluated our scheme on an Analog Devices TigerSHARC DSP and achieved speedups of up to 1.57 on 13 UTDSP benchmarks.

AIRA: A Framework for Flexible Compute Kernel Execution in Heterogeneous Platforms

Lyerly, Murray, Barbalace, et al. 2018. IEEE Trans. Parallel Distrib. Syst.

