Jinchun Kim scite author profile

Modern computational workloads require abundant thread-level parallelism (TLP), necessitating highly parallel, manycore accelerators such as General Purpose Graphics Processing Units (GPGPUs). GPGPUs place a heavy demand on the on-chip interconnect between the many cores and a few memory controllers (MCs). Traffic is therefore highly asymmetric, which affects on-chip resource utilization and system performance. Here, we analyze the communication demands of typical GPGPU applications and propose efficient Network-on-Chip (NoC) designs to meet those demands. We show that the proposed schemes improve performance by up to 64.7%. Compared to the best-of-class prior work, our virtual channel (VC) monopolizing and partitioning schemes improve performance by 25%.


B-Fetch: Branch Prediction Directed Prefetching for Chip-Multiprocessors

Kadjo, Kim, Sharma, et al. 2014


For decades, the primary tools in alleviating the "Memory Wall" have been large cache hierarchies and data prefetchers. Both approaches become more challenging in modern chip-multiprocessor (CMP) designs. Increasing the last-level cache (LLC) size yields diminishing returns in terms of performance per Watt; given VLSI power scaling trends, this approach becomes hard to justify. These trends also impact hardware budgets for prefetchers. Moreover, in the context of CMPs running multiple concurrent processes, prefetching accuracy is critical to prevent cache pollution effects. These concerns point to the need for a light-weight prefetcher with high accuracy. Existing data prefetchers may generally be classified as low-overhead and low-accuracy (Next-n, Stride, etc.) or high-overhead and high-accuracy (STeMS, ISB). We propose B-Fetch: a data prefetcher driven by branch prediction and effective address value speculation. B-Fetch leverages control flow prediction to generate an expected future path of the executing application. It then speculatively computes the effective address of the load instructions along that path based upon a history of past register transformations. Detailed simulation using a cycle-accurate simulator shows a geometric mean speedup of 23.4% for single-threaded workloads, improving to 28.6% for multi-application workloads, over a baseline system without prefetching. We find that B-Fetch outperforms an existing "best-of-class" light-weight prefetcher under single-threaded and multiprogrammed workloads by 9% on average, with 65% less storage overhead.
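The idea described in the abstract above (follow the predicted control-flow path, then speculate load addresses from past register changes) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the `BlockInfo` record, the `lookahead_prefetch` function, the per-block register deltas, and the fixed lookahead depth are all hypothetical names and simplifications introduced here for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BlockInfo:
    """Hypothetical per-basic-block record learned from past executions."""
    predicted_next: str | None                  # block the branch predictor expects next
    reg_deltas: dict[str, int] = field(default_factory=dict)    # register change across the block
    loads: list[tuple[str, int]] = field(default_factory=list)  # (base register, offset) per load


def lookahead_prefetch(start_block: str,
                       regs: dict[str, int],
                       table: dict[str, BlockInfo],
                       depth: int) -> list[int]:
    """Walk the predicted path for up to `depth` blocks and return
    speculative load addresses (assumed model, not the paper's design)."""
    regs = dict(regs)            # speculate on a copy, never on live state
    addresses: list[int] = []
    block = start_block
    for _ in range(depth):
        info = table.get(block)
        if info is None:
            break                # no history for this block: stop speculating
        # Apply the register transformation observed for the current block
        # to estimate register values at entry to the predicted next block.
        for reg, delta in info.reg_deltas.items():
            if reg in regs:
                regs[reg] += delta
        block = info.predicted_next
        if block is None or block not in table:
            break                # predicted path leaves the known region
        # Compute an effective address for each load in the predicted block.
        for base, offset in table[block].loads:
            if base in regs:
                addresses.append(regs[base] + offset)
    return addresses


# Toy example: a loop that advances r1 by 64 bytes per iteration
# and loads from r1 + 8.
table = {"loop": BlockInfo(predicted_next="loop",
                           reg_deltas={"r1": 64},
                           loads=[("r1", 8)])}
candidates = lookahead_prefetch("loop", {"r1": 0x1000}, table, depth=3)
print([hex(a) for a in candidates])  # ['0x1048', '0x1088', '0x10c8']
```

In this toy model the lookahead stops as soon as the predicted path reaches a block with no recorded history, which loosely mirrors the accuracy concern raised in the abstract: speculating without evidence risks polluting the cache.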


Dynamic Memory Pressure Aware Ballooning

Kim, Fedorov, Gratz, et al. 2015


Speculative paging for future NVM storage

Fedorov, Kim, Qin, et al. 2017




Jinchun Kim

Path confidence based lookahead prefetching

Bandwidth-efficient on-chip interconnect designs for GPGPUs

B-Fetch: Branch Prediction Directed Prefetching for Chip-Multiprocessors

Dynamic Memory Pressure Aware Ballooning

Speculative paging for future NVM storage
