“…In addition, several parallel programming frameworks exist [11,17,23,29,38,44,45,47] that enable the compilation of domain-specific languages for GPUs. Lift [26,46] extends its existing data-parallel primitive types to accommodate loop tiling (e.g., slide, pad) and its low-level OpenCL primitives with local memory allocation (e.g., toLocal) for stencil computations.…”
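
To make the role of the slide and pad primitives more concrete, the following is a minimal Python sketch of how padding followed by a sliding window can express a 1D three-point stencil. It is not Lift code and does not reproduce Lift's API: the function names `pad`, `slide`, and `stencil3`, their argument order, and the clamp-style boundary handling are all assumptions chosen for illustration.

```python
def pad(left, right, xs):
    """Hypothetical analogue of a pad primitive (assumed clamp boundary):
    repeat the first element `left` times and the last element `right` times."""
    xs = list(xs)
    return [xs[0]] * left + xs + [xs[-1]] * right


def slide(size, step, xs):
    """Hypothetical analogue of a slide primitive: return the overlapping
    windows of length `size`, advancing by `step` elements each time."""
    return [xs[i:i + size] for i in range(0, len(xs) - size + 1, step)]


def stencil3(xs):
    """Three-point sum stencil: pad by one element on each side, take
    windows of size 3 with step 1, then reduce each window with sum."""
    return [sum(window) for window in slide(3, 1, pad(1, 1, xs))]


if __name__ == "__main__":
    print(stencil3([1, 2, 3, 4]))
```

Padding by one element on each side keeps the output the same length as the input, so every input position has a full neighborhood. Tiling for local memory, which the excerpt associates with toLocal, is not modeled in this sketch.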