“…There is a rich literature describing efforts to implement stencil computations efficiently on CPUs [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [15], [16] and GPUs [13], [14], [17], [18], [19], [22], [23]. We discuss the most closely related efforts below.…”