“…GPU architecture. Other related work optimizes various aspects of the GPU architecture, e.g., warp scheduling [33], [39], [43], [56], [66], L1 cache management [31], [59], [65], [68], register file design [3], [30], [32], the NoC [10], [35], [73], [77], [78], and SM resource virtualization [64], [72]. Recent work also proposes techniques for efficient multitasking in GPUs [4], [52], [53], [62], [67], [71], [76] and virtual memory management [9], and discusses design considerations for multi-module GPUs [8], [45].…”