Most multiprocessors are multiprogrammed in order to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) non-blocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Non-blocking algorithms generally require a universal atomic primitive such as compare-and-swap or load-linked/store-conditional, and are widely regarded as inefficient.We evaluate the performance of preemption-safe lock-based and non-blocking implementations of important data structures-queues, stacks, heaps, and counters-including non-blocking and lock-based queue algorithms of our own, in micro-benchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that our non-blocking queue consistently outperforms the best known alternatives, and that data-structure-specific non-blocking algorithms, which exist for queues, stacks, and counters, can work extremely well. Not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose non-blocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform conventional locks by significant margins on multiprogrammed systems.