Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the potential of 3D integration by enhancing MemPool, an open-source manycore design with 256 cores and a shared pool of L1 scratchpad memory connected with a low-latency interconnect. MemPool's baseline 2D design is severely limited by routing congestion and wire propagation delay, making the design ideal for 3D integration. In architectural terms, we increase MemPool's scratchpad memory capacity beyond the sweet spot for 2D designs, improving performance in a common digital signal processing kernel. We propose a 3D MemPool design that leverages a smart partitioning of the memory resources across two layers to balance the size and utilization of the stacked dies. In this paper, we co-explore the architectural and the technology parameter spaces by analyzing the power, performance, area, and energy efficiency of MemPool instances in 2D and 3D with 1 MiB, 2 MiB, 4 MiB, and 8 MiB of scratchpad memory in a commercial 28 nm technology node. We observe a performance gain of 9.1 % when running a matrix multiplication on the MemPool-3D design with 4 MiB of scratchpad memory compared to its MemPool-2D counterpart. In terms of energy efficiency, we can implement the MemPool-3D instance with 4 MiB of L1 memory on an energy budget 15 % smaller than its 2D counterpart, and even 3.7 % smaller than the MemPool-2D instance with one-fourth of the L1 scratchpad memory capacity.
The surge in compute demand across consumer products, mobile phones, automobiles, and datacenters for high-performance computing (HPC) applications brings major thermal challenges. This stems from the growth in transistor density over the years and the associated increase in power density. Advanced packaging techniques such as 2.5D and 3D integration have a compounding effect. Hitting the thermal limits not only affects raw performance and power but also limits the reliability of the product. Therefore, it has become necessary to foresee appropriate thermal solutions for target applications early in the product development phase, during thermal/power planning, to assess the viability of technology choices. In this paper, we assess the temperature distribution and anticipate cooling needs for future thermally limited SoCs in advanced Angstrom nodes (A14 and A5). We break down the thermal resistance into contributions from multiple sources to decouple them and to explore the possibility of co-optimizing the chip-package-cooling system. Some of the insights from our analysis could aid system software in performing thermal-aware job scheduling.