Abstract: Several task-based programming models have recently been proposed to improve the programmability of multicores. In these models, inter-task dependencies must be resolved either by the programmer, which increases programming complexity, or by a software runtime system, which increases runtime overhead. In this paper we therefore propose Nexus, a hardware task management support system. Based on the inputs and outputs of tasks, it dynamically detects dependencies between tasks and schedules ready tasks for execution. In addition, it provides fast and scalable synchronization. Experiments show that, compared to a software runtime system, Nexus improves task throughput by a factor of 54. As a consequence, much finer-grained tasks and/or many more cores can be employed efficiently. For example, for H.264 decoding, which has an average task size of 8.1 , Nexus scales to more than 12 cores, whereas the scalability of the software approach saturates below three cores.
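To make the idea of detecting dependencies from task inputs and outputs concrete, the following is a minimal software sketch. It is not the Nexus hardware design: the class `DependencyTracker` and its methods are hypothetical names chosen for illustration, and the sketch assumes that each task declares its inputs and outputs as hashable address-like keys. A task is held back until every earlier, unfinished task it conflicts with (read-after-write, write-after-write, or write-after-read) has finished.

```python
from collections import defaultdict, deque


class DependencyTracker:
    """Illustrative only: tracks dependencies from declared inputs/outputs."""

    def __init__(self):
        self.last_writer = {}                # address -> latest task that writes it
        self.readers = defaultdict(set)      # address -> tasks reading it since that write
        self.waiting_on = {}                 # unfinished task -> unfinished predecessors
        self.dependents = defaultdict(set)   # task -> tasks that wait for it
        self.ready = deque()                 # tasks whose dependencies are resolved

    def submit(self, task, inputs, outputs):
        preds = set()
        for addr in inputs:                  # read-after-write
            if addr in self.last_writer:
                preds.add(self.last_writer[addr])
        for addr in outputs:
            if addr in self.last_writer:     # write-after-write
                preds.add(self.last_writer[addr])
            preds |= self.readers[addr]      # write-after-read
        # Only tasks that are submitted and not yet finished can block this one.
        preds = {p for p in preds if p in self.waiting_on and p != task}

        # Record this task's accesses for tasks submitted later.
        for addr in inputs:
            self.readers[addr].add(task)
        for addr in outputs:
            self.last_writer[addr] = task
            self.readers[addr] = set()

        self.waiting_on[task] = preds
        for p in preds:
            self.dependents[p].add(task)
        if not preds:
            self.ready.append(task)

    def finish(self, task):
        """Mark a task as finished and release tasks that waited only on it."""
        del self.waiting_on[task]
        for d in sorted(self.dependents.pop(task, ())):
            self.waiting_on[d].discard(task)
            if not self.waiting_on[d]:
                self.ready.append(d)

    def drain_ready(self):
        """Return all currently ready tasks and clear the ready queue."""
        out = list(self.ready)
        self.ready.clear()
        return out
```

In this sketch a scheduler would call `submit` for each new task, hand the result of `drain_ready` to idle cores, and call `finish` when a core reports completion. Performing these table lookups in software for every task is the kind of per-task overhead that hardware support is meant to reduce.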