Performance optimization for large-scale applications has recently become more important as computation continues to move towards data centers. Data-center applications are generally very large and complex, which makes code layout an important optimization to improve their performance. This has motivated recent investigation of practical techniques to improve code layout at both compile time and link time. Although post-link optimizers had some success in the past, no recent work has explored their benefits in the context of modern data-center applications. In this paper, we present BOLT, a post-link optimizer built on top of the LLVM framework. Utilizing sample-based profiling, BOLT boosts the performance of real-world applications even for highly optimized binaries built with both feedback-driven optimizations (FDO) and link-time optimizations (LTO). We demonstrate that post-link performance improvements are complementary to conventional compiler optimizations, even when the latter are done at a whole-program level and in the presence of profile information. We evaluated BOLT on both Facebook data-center workloads and open-source compilers. For data-center applications, BOLT achieves up to 8.0% performance speedups on top of profile-guided function reordering and LTO. For the GCC and Clang compilers, our evaluation shows that BOLT speeds up their binaries by up to 20.4% on top of FDO and LTO, and up to 52.1% if the binaries are built without FDO and LTO.
Given the large number of different hardware platforms that must be programmed for the Internet of Things (IoT), virtual machines (VMs) are a promising technology for enabling a program-once, deploy-everywhere strategy. Unfortunately, existing VMs are either too heavy for resource-constrained IoT devices or must be stripped down to run on them. We present COISA, a compact virtual platform that relies on OpenISA, an instruction set architecture (ISA) that strives for easy emulation, to allow a single program to be deployed on many platforms, including tiny microcontrollers. Our experimental results, which explore the benefits of using a concrete ISA as the VM language, indicate that COISA is easily portable and can run unmodified guest applications on highly heterogeneous host platforms, including one with only 2 kB of RAM. For time-critical IoT applications on constrained platforms where extracting performance is of paramount importance, we propose the use of cloud-assisted translations, which employ static binary translation to deliver a binary fully converted to the native ISA used in the IoT device.

HANDLING IOT PLATFORM HETEROGENEITY WITH COISA

Our previous experience with ISA emulation and design [9,33,34] has shown us that a clean ISA, similar to MIPS, allows us to build a high-performance emulator capable of emulating guest