This paper reviews state-of-the-art research on parallel computational models and introduces the models developed over the past decades. We classify these models into three generations according to the architectural features they target, especially memory organization, and discuss the models and their characteristics within this three-generation classification. We believe that, with the ever-increasing speed gap between the CPU and memory systems, incorporating a non-uniform memory hierarchy into computational models will become unavoidable. With the emergence of multi-core CPUs, the parallelism hierarchy of current computing platforms is becoming increasingly complicated, and describing it in future computational models is becoming increasingly important. A semi-automatic toolkit that extracts model parameters and their values on real computers can reduce the complexity of model analysis, allowing more complicated models with more parameters to be adopted. Hierarchical memory and hierarchical parallelism will be two very important features to consider in future model design and research.

Keywords: parallel computational models, hierarchical memory, hierarchical parallelism, three generations, memory model
Background

A simplified, abstract description of a computer is called a "computational model". Computer architects, algorithm designers and program developers can use such a model as a basis for assessing their work, including the suitability of a computer architecture for various applications, the computational complexity of an algorithm, and the potential performance of a program on various computers. A good computational model simplifies the complicated work of the architect, algorithm designer and program developer while mapping that work effectively onto real computers. Such a computational model is therefore sometimes called a "bridging model" [1]. The bridging model between the sequential computer and the algorithm designer or program developer is the von Neumann and RAM (Random Access Machine) model [2]. However, no commonly recognized bridging model exists between parallel computers and parallel programs, and no other model maps a user's parallel program onto parallel computers as smoothly as the von Neumann and RAM model does for sequential programs. This situation is largely due to the immaturity of parallel computer design: there are many different parallel architectures, they change rapidly each year, and the demand on performance is greater [3], so a clean and simplified description is almost impossible. However, parallel computer design is converging, so a common parallel computer architecture model (such as the cluster) can be realized, and communication in parallel computing, for which there is the standard MPI interface, no longer depends so heavily on the interconnection network; thus we have the BSP and LogP models [1,7].

Based on the historical development of parallel computational models, we think they can be classified ...