“…To extract local features, one commonly used descriptor is the sparse SIFT descriptor [1,16]. Dense descriptors, on the other hand, such as DSIFT, PHOW, DAISY, and enhanced SIFT descriptors based on regular or random dense sampling, have become the state of the art and have outperformed the sparse SIFT descriptor in the recent literature [11,17,18,19,32]. Accordingly, many image classification works have integrated visual words built from dense descriptors and achieved remarkable results [11,17,19].…”