Current high-performance multicore processors provide users with a non-uniform memory access model (NUMA). These systems perform better when threads access data on memory banks next to the core where they run. However, ensuring data locality is di cult. In this paper, we propose compiler analyses and code generation methods to support a lightweight runtime system that dynamically migrates memory pages to improve data locality. Our technique combines static and dynamic analyses and is capable of identifying the most promising pages to migrate. Statically, we infer the size of arrays, plus the amount of reuse of each memory access instruction in a program. These estimates rely on a simple, yet accurate, trip count predictor of our own design. This knowledge let's us build templates of dynamic checks, to be filled with values known only at runtime. These checks determine when it is profitable to migrate data closer to the processors where this data is used. Our static analyses are quadratic on the number of variables in a program, and the dynamic checks are O(1) in practice. Our technique does not require any form of user intervention, neither the support of a third-party middleware, nor modifications in the operating system's kernel. We have applied our technique on several parallel algorithms, which are completely oblivious to the asymmetric memory topology, and have observed speedups of up to 4x, compared to static heuristics. We compare our approach against Minas, a middleware that supports NUMA-aware data allocation, and show that we can outperform it by up to 50% in some cases.
A análise de largura de variáveis determina o maior e menor valores que cada variável inteira de um programa pode assumir durante a sua execução. Tal técnica é de suma importância para detectar vulnerabilidades em programas mas, até o momento, não existe abordagem que aplique essa análise em sistemas distribuídos. Tal omissão é séria, uma vez que esse tipo de sistema é alvo comum de ataques de software. O objetivo deste artigo é preencher tal lacuna. Valendo-nos de um algoritmo recente para inferir protocolos de comunicação, nós projetamos, implementamos e testamos uma análise de largura de variáveis para sistemas distribuídos. Nosso algoritmo, ao prover uma visão holística do sistema distribuído, é mais preciso que analisar cada parte daquele sistema separadamente. Demonstramos tal fato via uma série de exemplos e experimentos realizados sobre os programas presentes em SPEC CPU 2006. Um protótipo de nossa ferramenta, implementado sobre o compilador LLVM, está disponível para escrutínio.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.