Anders Gidenstam scite author profile

We present an efficient and practical lock-free method for semiautomatic (application-guided) memory reclamation based on reference counting, aimed for use with arbitrary lock-free dynamic data structures. The method guarantees the safety of local as well as global references, supports arbitrary memory reuse, uses atomic primitives that are available in modern computer systems, and provides an upper bound on the amount of memory waiting to be reclaimed. To the best of our knowledge, this is the first lock-free method that provides all of these properties. We provide analytical and experimental study of the method. The experiments conducted have shown that the method can also provide significant performance improvements for lock-free algorithms of dynamic data structures that require strong memory management.

show abstract

Efficient and Reliable Lock-Free Memory Reclamation Based on Reference Counting

Gidenstam

Papatriantafilou

Sundell

et al.

View full text Add to dashboard Cite

We present an efficient and practical lock-free implementation of a memory reclamation scheme based on reference counting, aimed for use with arbitrary lock-free dynamic data structures. The scheme guarantees the safety of local as well as global references, supports arbitrary memory reuse, uses atomic primitives which are available in modern computer systems and provides an upper bound on the memory prevented for reuse. To the best of our knowledge, this is the first lock-free algorithm that provides all of these properties. Experimental results indicate significant performance improvements for lock-free algorithms of dynamic data structures that require strong garbage collection support.1 A compare-and-swap operation that can atomically update two adjacent memory words, which is available in some 32-bit and very few 64-bit architectures.

show abstract

Multiword atomic read/write registers on multiprocessor systems

Larsson

Gidenstam²,

et al. 2009

ACM J. Exp. Algorithmics

View full text Add to dashboard Cite

Modern multiprocessor systems offer advanced synchronization primitives, built in hardware, to support the development of efficient parallel algorithms. In this article, we develop a simple and efficient algorithm, the READERSFIELD algorithm, for atomic registers (variables) of arbitrary length. The simplicity and better complexity of the algorithm is achieved via the utilization of two such common synchronization primitives. In this article, we also experimentally evaluate the performance of our algorithm, together with lock-based approaches and a practical, previously known algorithm that is based only on read and write primitives. The experimental evaluation is performed on three well-known parallel architectures. This evaluation clearly shows that both algorithms are practical and that as the size of the register increases the READERSFIELD algorithm performs better, following its analytical evaluation.

show abstract

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

Gidenstam

Sundell

Tsigas

2010

View full text Add to dashboard Cite

Abstract. A lock-free FIFO queue data structure is presented in this paper. The algorithm supports multiple producers and multiple consumers and weak memory models. It has been designed to be cache-aware and work directly on weak memory models. It utilizes the cache behavior in concert with lazy updates of shared data, and a dynamic lock-free memory management scheme to decrease unnecessary synchronization and increase performance. Experiments on an 8-way multi-core platform show significantly better performance for the new algorithm compared to previous fast lock-free algorithms.

show abstract

Allocating Memory in a Lock-Free Manner

Gidenstam

Papatriantafilou

Tsigas

2005

View full text Add to dashboard Cite

Abstract. The potential of multiprocessor systems is often not fully realized by their system services. Certain synchronization methods, such as lock-based ones, may limit the parallelism. It is significant to see the impact of wait/lock-free synchronization design in key services for multiprocessor systems, such as the memory allocation service. Efficient, scalable memory allocators for multithreaded applications on multiprocessors is a significant goal of recent research projects.We propose a lock-free memory allocator, to enhance the parallelism in the system. Its architecture is inspired by Hoard, a successful concurrent memory allocator, with a modular, scalable design that preserves scalability and helps avoiding false-sharing and heap blowup. Within our effort on designing appropriate lock-free algorithms to construct this system, we propose a new non-blocking data structure called flat-sets, supporting conventional "internal" operations as well as "inter-object" operations, for moving items between flat-sets.We implemented the memory allocator in a set of multiprocessor systems (UMA Sun Enterprise 450 and ccNUMA Origin 3800) and studied its behaviour. The results show that the good properties of Hoard w.r.t. false-sharing and heap-blowup are preserved, while the scalability properties are enhanced even further with the help of lock-free synchronization.

show abstract

A lock-free algorithm for concurrent bags

Sundell

Gidenstam

Papatriantafilou

et al. 2011

View full text Add to dashboard Cite

A lock-free bag data structure supporting unordered buffering is presented in this paper. The algorithm supports multiple producers and multiple consumers, as well as dynamic collection sizes. To handle concurrency efficiently, the algorithm was designed to thrive for disjoint-access-parallelism for the supported semantics. Therefore, the algorithm exploits a distributed design combined with novel techniques for handling concurrent modifications of linked lists using double marks, detection of total emptiness, and efficient memory management with hazard pointer handover. Experiments on a 24-way multi-core platform show significantly better performance for the new algorithm compared to previous algorithms of relevance.

show abstract

NBmalloc: Allocating Memory in a Lock-Free Manner

2009

View full text Add to dashboard Cite

Efficient, scalable memory allocation for multithreaded applications on multiprocessors is a significant goal of recent research. In the distributed computing literature it has been emphasized that lock-based synchronization and concurrencycontrol may limit the parallelism in multiprocessor systems. Thus, system services that employ such methods can hinder reaching the full potential of these systems. A natural research question is the pertinence and the impact of lock-free concurrency control in key services for multiprocessors, such as in the memory allocation service, which is the theme of this work. We show the design and implementation of NBMALLOC, a lock-free memory allocator designed to enhance the parallelism in the system. The architecture of NBMALLOC is inspired by Hoard, a well-known concurrent memory allocator, with modular design that preserves scalability and helps avoiding false-sharing and heap-blowup. Within our effort to design appropriate lockfree algorithms for NBMALLOC, we propose and show a lock-free implementation of a new data structure, flat-set, supporting conventional "internal" set operations as well as "inter-object" operations, for moving items between flat-sets. The design of NBMALLOC also involved a series of other algorithmic problems, which are discussed in the paper. Further, we present the implementation of NBMALLOC and a

show abstract

A Consistency Framework for Iteration Operations in Concurrent Data Structures

Nikolakopoulos

Gidenstam

Papatriantafilou

et al. 2015

View full text Add to dashboard Cite

Abstract-Concurrent data structures provide the means to multi-threaded applications to share data. Data structures come with a set of predefined operations, specified by the semantics of the data structure. In the literature and in several contemporary commonly used programming environments, the notion of iteration has been introduced for collection data structures, as a bulk operation enhancing the native set of operations. Iterations in several of these contexts have been treated as sequential in nature and may provide weak consistency guarantees when running concurrently with the native operations of the data structures.In this work we study iterations in concurrent data structures in the context of concurrency with the native operations and the guarantees that they provide. Besides linearizability, we propose a set of consistency specifications for such bulk operations, including also concurrency-aware properties by building on Lamport's systematic definitions for registers. Furthermore, by using queues and composite registers as casestudies of underlying objects, we provide a set of constructions of iteration operations, satisfying the properties and showing containment relations. Besides the trade-off between consistency and throughput, we point out and study tradeoff between the overhead of the bulk operation and possible support (helping) by the native operations of the data structure.

show abstract

12 3

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.