3D stacking has been viewed as a breakthrough solution for increasing performance in multi-core architectures. The hope is that it will solve some of the main issues in current multi-core architectures: external memory pressure and latency, I/O bottlenecks, and communication power consumption. This paper presents several advances in this field of research, starting with a WIDEIO experiment on a real chip that addresses the DRAM access bottleneck. The integration of a 512-bit-wide bus into a Network-on-Chip (NoC) multi-core framework is demonstrated, and the resulting performance is reported for a 65 nm prototype with 10 µm diameter Through-Silicon Vias (TSVs). The potential for 3D scaling offered by a 3D asynchronous Network-on-Chip implementation is then shown. Finally, an innovative 3D-stacked distributed cache strategy aimed at lowering memory latency and external memory bandwidth requirements is presented. This new memory partitioning demonstrates how 3D stacking makes it possible to rethink architectures in order to address multi-core scaling challenges.