“…As an example, assuming a stride value of 16, a memory access instruction issued across a warp of 32 threads generates the following 32 memory indices: {0, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 256, 272, 288, 304, 320, 336, 352, 368, 384, 400, 416, 432, 448, 464, 480, 496}. Using Equation 3, we have the following bank access indices for the warp: {0, 8, 16, 24, 0, 8, 16, 24, 0, 8, 16, 24, 0, 8, 16, 24, 0, 8, 16, 24, 0, 8, 16, 24, 0, 8, 16,…”
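
The arithmetic in the quoted passage can be reproduced with a small sketch. Equation 3 is not included in the excerpt, so the bank mapping below is an assumption: it takes the bank index to be `(index * elem_bytes / bank_bytes) mod num_banks`, with hypothetical values `elem_bytes=4`, `bank_bytes=8`, and `num_banks=32`, chosen only because they yield the bank indices quoted above. The function and parameter names are illustrative and do not come from the source.

```python
from collections import Counter

WARP_SIZE = 32  # threads per warp, as stated in the quoted passage


def memory_indices(stride, warp_size=WARP_SIZE):
    """Index touched by each thread: thread t accesses element t * stride."""
    return [tid * stride for tid in range(warp_size)]


def bank_indices(indices, elem_bytes=4, bank_bytes=8, num_banks=32):
    """Map element indices to banks.

    ASSUMPTION: stands in for the source's Equation 3, which is not shown.
    The default parameters are hypothetical and were picked to match the
    bank indices listed in the quoted passage.
    """
    return [(i * elem_bytes // bank_bytes) % num_banks for i in indices]


indices = memory_indices(16)
banks = bank_indices(indices)
threads_per_bank = Counter(banks)  # how many threads land on each bank

print(indices[:5], indices[-1])  # [0, 16, 32, 48, 64] 496
print(banks[:8])                 # [0, 8, 16, 24, 0, 8, 16, 24]
print(dict(threads_per_bank))    # {0: 8, 8: 8, 16: 8, 24: 8}
```

Under these assumed parameters the 32 accesses fall on only four distinct banks, with eight threads mapping to each, which is the repeating pattern the passage lists.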