The use of massively parallel SIMD array architectures is proliferating in the area of domain-specific coprocessors. Even so, they have undergone few systematic empirical studies. The underlying problems include the size of the architecture space, the lack of portability of the test programs, and the inherent complexity of simulating up to hundreds of thousands of processing elements. We address the computational cost problem with a novel approach to trace-based simulation. Code is run on an abstract virtual machine to generate a coarse-grained trace, which is then refined through a series of transformations (a process we call trace compilation), each of which adds resolution with respect to the details of the target machine. We have found this technique to be one to two orders of magnitude faster than instruction-level simulation while still retaining much of the accuracy of the model. Furthermore, abstract machine traces must be regenerated for only a small fraction of the possible parameter combinations. Using virtual machine emulation and trace compilation also addresses program portability by allowing the user to code in a single data-parallel language with a single compiler, regardless of the target architecture. This technique has already been used to generate significant results with respect to SIMD array architectures, a sample of which is presented here.
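The idea of refining one coarse trace for several target machines can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify the trace format or the individual transformations, so `TraceEvent`, `Target`, the two passes, and the cycle costs below are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class TraceEvent:
    op: str        # abstract operation name (hypothetical), e.g. "add"
    elements: int  # data-parallel elements the operation touches

@dataclass
class Target:
    num_pes: int   # physical processing elements (hypothetical parameter)
    cycles: dict   # cycles per operation per pass (hypothetical cost table)

def virtualize(trace, target):
    """Pass 1: when elements outnumber physical PEs, one abstract
    operation needs several passes over the array."""
    for ev in trace:
        passes = -(-ev.elements // target.num_pes)  # ceiling division
        yield ev, passes

def cost(refined, target):
    """Pass 2: attach target-specific cycle counts and total them."""
    return sum(passes * target.cycles[ev.op] for ev, passes in refined)

def compile_trace(trace, target):
    """Chain the passes; the coarse trace itself is never regenerated."""
    return cost(virtualize(trace, target), target)

# One coarse trace, as if recorded once on the abstract machine.
coarse = [TraceEvent("add", 4096), TraceEvent("shift", 4096), TraceEvent("add", 1024)]

# Two hypothetical target configurations evaluated from the same trace.
small = Target(num_pes=1024, cycles={"add": 1, "shift": 2})
large = Target(num_pes=4096, cycles={"add": 1, "shift": 2})

print(compile_trace(coarse, small))  # 13
print(compile_trace(coarse, large))  # 4
```

In this sketch the program is "run" only once to produce `coarse`; changing the number of processing elements or the cycle costs only re-runs the cheap refinement passes, which is the property the abstract credits for the reduced simulation cost.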