“…They have become a viable alternative to harvesting data from the web at large [15][16][17]. Like the comparable works dealing with the automatic generation of training sets [13,15,18,19], our approach focuses on the task of capturing a range of diverse but consistent picture representatives for a given visual concept (like dog, car, etc.). In other words, our idea is to design a system that creates visual synsets [3], like in ImageNet, but with minimal intervention, and fast, dynamic, and customizable structure (for concepts outside WordNet [20]).…”