Technology scaling, along with the ever-evolving demand for media-rich software stacks, has motivated the need for many-core platforms. With the increase in compute power and its inherent demand for high memory bandwidth comes the need for vast amounts of on-chip memory space. Thus, designers must carefully provision the memory real estate to meet their applications' needs. It has been shown in the embedded systems domain that both software-controlled memories (e.g., scratchpad memories) and hardware-controlled memories (e.g., caches) have their pros and cons: some application domains, such as multimedia, fit very well in the software-controlled memory model, while other domains, such as databases, work well with caches. As a result, efficient memory management is critical, as it has a great impact on the system's power consumption and throughput. Traditional memory hierarchies primarily consist of SRAM-based on-chip caches; however, with the emergence of non-volatile memories (NVMs) and mixed-criticality systems, on-chip memories will be heterogeneous, not only in type (cache vs. scratchpad) but also in technology (e.g., SRAM vs. NVM). This paper surveys the state of the art in memory subsystems for many-core platforms and presents strategies for efficiently managing software-controlled memories in the many-core domain, while addressing the various challenges designers face in deploying such memory subsystems (e.g., sharing the memory resources and accounting for variations in the subsystem).