Principal Component Analysis (PCA) is an important algorithm that has been applied in a wide range of applications. Its computations are based on matrix operations, including addition, subtraction, and multiplication. This paper proposes a hardware implementation of PCA that uses the Networks-on-Chip (NoC) concept as the design paradigm. In the proposed procedure, a 2D mesh topology is selected first, and the PCA algorithm is represented as a task graph that can be divided into two levels. At the higher level, the tasks are matrix-based algorithms, such as mean computation, which are in turn mapped onto the NoC topology. Because the NoC is scalable, an appropriate 2D mesh size is selected so that the timing of each task in the task chain is matched, resulting in an effective pipeline at the higher level. Simulation in SystemC, with the NoC modeled at the cycle-accurate level, showed an average throughput of 2.20 Gbps and a latency of 0.025 cycle/flit when computing a matrix multiplication of size 112x92 on 81 Processing Elements (PEs) organized as a 9x9 mesh and running at a clock speed of 100 MHz.
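To illustrate the kind of matrix-based tasks that form the higher level of the task graph, the following is a minimal software sketch of the early stages of PCA (mean computation, mean subtraction, and a covariance matrix obtained by matrix multiplication). It is not the hardware design described in this paper: the function names, the use of plain Python lists, and the sample-covariance normalization by n - 1 are assumptions made for illustration only, and the eigen-decomposition stage is omitted.

```python
# Illustrative sketch only: early PCA stages expressed as a chain of
# matrix tasks. All names here are hypothetical, not taken from the paper.

def column_means(X):
    """Mean of each column: a sum of additions followed by one scaling."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def center(X, mean):
    """Subtract the column mean from every row (element-wise subtraction)."""
    return [[x - m for x, m in zip(row, mean)] for row in X]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Plain matrix multiplication built from multiply-accumulate steps."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def covariance(X):
    """Task chain: mean -> centering -> multiplication -> scaling.

    Assumes the sample covariance (division by n - 1).
    """
    n = len(X)
    Xc = center(X, column_means(X))
    C = matmul(transpose(Xc), Xc)
    return [[c / (n - 1) for c in row] for row in C]
```

Each function corresponds to one task in a chain, so the output of one stage feeds the next. In a pipelined mapping, the stages would ideally take similar amounts of time, which is the motivation for matching task timing through the choice of mesh size.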