“…This parallel has been exploited in recent bag-of-keypoints approaches to visual categorization [6,27], unsupervised discovery of visual "topics" [24], and video retrieval [23]. More generally, representations based on local image features, or salient regions extracted by specialized interest operators, have shown promise for recognizing textures [13], different views of the same object [9,22], and different instances of the same object class [1,7,8,26]. For textures, appearance-based descriptors of salient local regions are clustered to form characteristic texture elements, or textons.…”
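The texton construction described in the last sentence can be sketched as follows. The excerpt does not say which clustering algorithm or descriptor is used, so this sketch assumes plain k-means over generic fixed-length descriptor vectors. The function names `kmeans` and `texton_histogram`, the parameters `k` and `n_iter`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def _sq_dists(descriptors, centers):
    # Squared Euclidean distance from every descriptor to every center.
    diff = descriptors[:, None, :] - centers[None, :, :]
    return (diff ** 2).sum(axis=2)


def kmeans(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers from k distinct descriptors chosen at random.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].astype(float).copy()
    for _ in range(n_iter):
        labels = _sq_dists(descriptors, centers).argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment, consistent with the returned centers.
    labels = _sq_dists(descriptors, centers).argmin(axis=1)
    return centers, labels


def texton_histogram(descriptors, centers):
    """Normalised count of how often each texton is the nearest center."""
    labels = _sq_dists(descriptors, centers).argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()


# Synthetic stand-in for descriptors of salient local regions (hypothetical data).
rng = np.random.default_rng(42)
descriptors = np.vstack([
    rng.normal(loc=m, scale=0.3, size=(50, 8)) for m in (0.0, 3.0, 6.0)
])

textons, _ = kmeans(descriptors, k=3)             # cluster centers act as textons
signature = texton_histogram(descriptors, textons)
print(textons.shape, signature)
```

Under these assumptions, each cluster center plays the role of one texton, and the normalised histogram of nearest-texton assignments gives a fixed-length signature for an image or texture sample.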