We describe design details of a LightWeight Processing migration-NUMA (LWP-mNUMA) architecture, a novel high-performance system design that provides hardware support for a partitioned global address space, migrating subjects, and word-level synchronization primitives. Using the architectural definition, we show how combinations of structures work together to carry out basic actions such as address translation, migration, in-memory synchronization, and work management. We present results from simulation of microkernels showing that LWP-mNUMA compensates for latency with far greater memory access concurrency than is possible on conventional systems. In particular, several microkernels model tough, irregular access patterns that have limited speedups, in certain problem areas, to dozens of conventional processors. On these, results show speedup increasing up to 1024 multicore mNUMA processing nodes, running over 1 million threadlets.

LightWeight Processing migration-NUMA refers to this synergistic combination of massively parallel lightweight multithreading, migration-based consistency, and memory extensions for word-level inter-thread synchronization.

LWP-mNUMA draws on previous investigation [14] into architectural techniques to support applications that do not map well to distributed memory systems. In the process of mapping graph traversal applications [17] onto a novel architecture [27], a new architecture, programming model, and algorithmic approach emerged simultaneously. This paper presents initial results of this co-design.

This paper begins with a description (section 2) of the programming environment supported by mNUMA hardware. Next, the architecture itself is presented in two sections. Section 3 presents major hardware components and memory structures. Then, section 4 explains how these components interact to ensure correct and efficient operation of memory access, thread management, and synchronization. Section 5 presents a set of experimental results.
After the reader has seen how mNUMA fits together, section 6 presents related work. Section 7 concludes with a summary and future work.
Mobile-Subjective Programming Model

All computation and communication in an mNUMA system occurs by the action of subjects. Subjects are strictly runtime entities, dynamically created from code objects specified by a programmer. Subjectivity results from always placing executing objects in ex-A0