Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation (PLDI '93)
DOI: 10.1145/155090.155107

Improving the cache locality of memory allocation

Abstract: The allocation and disposal of memory is a ubiquitous operation in most programs. Rarely do programmers concern themselves with details of memory allocators; most assume that memory allocators provided by the system perform well. This paper presents a performance evaluation of the reference locality of dynamic storage allocation algorithms based on trace-driven simulation of five large allocation-intensive C programs. In this paper, we show how the design of a memory allocator can significantly affect the refe…
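As context for the abstract's methodology, the following is a minimal C sketch (not from the paper) of how an allocation trace could be captured by wrapping malloc and free; the wrapper names and the trace record format (operation, address, size) are assumptions chosen for illustration. A trace-driven simulator would replay such a trace offline to evaluate the reference locality of different allocator designs.

```c
/* Minimal sketch (not from the paper): capturing an allocation trace that a
 * trace-driven locality simulation could later replay. Wrapper names and the
 * trace format are assumptions for illustration. */
#include <stdio.h>
#include <stdlib.h>

static FILE *trace;                              /* trace file shared by the wrappers */

void *traced_malloc(size_t size) {
    void *p = malloc(size);
    if (trace && p)
        fprintf(trace, "A %p %zu\n", p, size);   /* record an allocation */
    return p;
}

void traced_free(void *p) {
    if (trace && p)
        fprintf(trace, "F %p\n", p);             /* record a disposal */
    free(p);
}

int main(void) {
    trace = fopen("alloc.trace", "w");
    char *buf = traced_malloc(128);              /* example allocation */
    traced_free(buf);
    if (trace) fclose(trace);
    return 0;
}
```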

Cited by 63 publications (19 citation statements)
References 15 publications (6 reference statements)
“…Prolangs [Ryder et al, 2001], PtrDist [Zhao et al, 2005] and MallocBench [Grunwald et al, 1993]. As we show in Chapter 4, our analysis is linear on the size of programs.…”
Section: Summary Of Experimental Resultsmentioning
confidence: 85%
“…We use the single-threaded applications from Wilson and Johnstone, and Grunwald and Zorn [12,19]: espresso, an optimizer for programmable logic arrays; Ghostscript, a PostScript interpreter; LRUsim, a locality analyzer; and p2c, a Pascal-to-C translator. We chose these programs because they are allocation-intensive and have…

Table 1: Single- and multithreaded benchmarks used in this paper.
Single-threaded benchmarks [12,19]: espresso (optimizer for programmable logic arrays); Ghostscript (PostScript interpreter); LRUsim (locality analyzer); p2c (Pascal-to-C translator).
Multithreaded benchmarks: threadtest (each thread repeatedly allocates and then deallocates 100,000/P objects); shbench [26] (each thread allocates and randomly frees random-sized objects); Larson [22] (simulates a server: each thread allocates and deallocates objects, and then transfers some objects to other threads to be freed); active-false (tests active false sharing avoidance); passive-false (tests passive false sharing avoidance); BEMengine [7] (object-oriented PDE solver); Barnes-Hut [1, 2] (n-body particle solver).…”
Section: Resultsmentioning
confidence: 99%
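For readers unfamiliar with the multithreaded workloads listed in the reconstructed Table 1 above, the following is a minimal C/pthreads sketch of a threadtest-style benchmark as described there: each of P threads repeatedly allocates and then deallocates 100,000/P objects. The thread count, object size, and number of rounds are assumptions for illustration, not the benchmark's actual parameters.

```c
/* Minimal sketch (not the benchmark's actual source): a threadtest-style
 * workload in which each of P threads repeatedly allocates and then frees
 * 100,000/P objects. P, OBJ_SIZE, and ROUNDS are assumed values. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define P          4                   /* number of threads (assumed) */
#define TOTAL_OBJS 100000              /* objects divided across threads */
#define OBJ_SIZE   64                  /* bytes per object (assumed) */
#define ROUNDS     100                 /* allocate/free rounds (assumed) */

static void *worker(void *arg) {
    (void)arg;
    size_t n = TOTAL_OBJS / P;
    void **objs = malloc(n * sizeof *objs);
    for (int r = 0; r < ROUNDS; r++) {
        for (size_t i = 0; i < n; i++)   /* allocate a batch of objects ... */
            objs[i] = malloc(OBJ_SIZE);
        for (size_t i = 0; i < n; i++)   /* ... then free the whole batch */
            free(objs[i]);
    }
    free(objs);
    return NULL;
}

int main(void) {
    pthread_t threads[P];
    for (int i = 0; i < P; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < P; i++)
        pthread_join(threads[i], NULL);
    puts("done");
    return 0;
}
```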
“…If there is not, Hoard checks the global heap (heap 0) for a superblock. If there is one, Hoard transfers it to heap i, adding the number of bytes in use in the superblock, s.u, to u_i, and the total number of bytes in the superblock, S, to a_i (lines 10–14). If there are no superblocks in either heap i or heap 0, Hoard allocates a new superblock and inserts it into heap i (line 8).…”
Section: Allocationmentioning
confidence: 99%
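To make the quoted allocation path concrete, here is a minimal C sketch (not Hoard's actual code) of the described logic: check heap i for a superblock, fall back to the global heap 0 and update u_i and a_i on transfer, and otherwise allocate a fresh superblock. The types, field names, single-superblock-per-heap simplification, and the value of S are assumptions made for illustration.

```c
/* Minimal sketch (not Hoard's source): the allocation path described in the
 * quoted passage, with hypothetical, heavily simplified types (one superblock
 * slot per heap, one size class). u and a track bytes in use and bytes
 * allocated on each heap, as in the passage. */
#include <stdio.h>
#include <stdlib.h>

#define S 8192                          /* superblock size in bytes (assumed) */

typedef struct superblock {
    size_t u;                           /* bytes currently in use (s.u) */
} superblock_t;

typedef struct heap {
    superblock_t *sb;                   /* simplified: at most one superblock */
    size_t u, a;                        /* in-use / allocated byte counts */
} heap_t;

/* Get a superblock for heap i, falling back to the global heap (heap 0). */
static superblock_t *acquire_superblock(heap_t *heaps, int i) {
    if (heaps[i].sb)                    /* heap i already has a superblock */
        return heaps[i].sb;
    if (heaps[0].sb) {                  /* transfer one from the global heap */
        superblock_t *s = heaps[0].sb;
        heaps[0].sb = NULL;
        heaps[i].u += s->u;             /* add bytes in use, s.u, to u_i */
        heaps[i].a += S;                /* add superblock size, S, to a_i */
        heaps[i].sb = s;
        return s;
    }
    /* Neither heap i nor heap 0 has one: allocate a fresh superblock
     * (statistics updates for the fresh superblock are elided here). */
    superblock_t *s = calloc(1, sizeof *s);
    heaps[i].sb = s;
    return s;
}

int main(void) {
    heap_t heaps[3] = {0};
    heaps[0].sb = calloc(1, sizeof(superblock_t));   /* seed the global heap */
    heaps[0].sb->u = 1024;
    acquire_superblock(heaps, 1);       /* transfers the global superblock */
    printf("heap 1: u=%zu a=%zu\n", heaps[1].u, heaps[1].a);
    return 0;
}
```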
“…We use the latter suite of benchmarks both because they are widely used in memory management studies [3,19,22], and because their high allocation-intensity stresses memory management performance. For all experiments, we fix Exterminator's heap multiplier (value of M) at 2.…”
Section: Exterminator Runtime Overheadmentioning
confidence: 99%