In this paper, a dataset of geotagged photos on a worldwide scale is presented. The dataset contains a sample of more than 14 million geotagged photos crawled from Flickr with the corresponding metadata. To guarantee the spatial representativeness of the dataset, a crawling approach based on the small-world phenomenon and the Flickr friendship graph is applied. Furthermore, the noisiness of user-provided tags is reduced through an automatic tag cleaning approach. To enable efficient retrieval, photos in the dataset are indexed by their location information using a quad-tree data structure. The dataset can assist different applications, especially search-based automatic image annotation and reverse geotagging.
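To illustrate the kind of location-based indexing mentioned above, the following is a minimal sketch of a point quad-tree over longitude/latitude. It is not the paper's implementation: the names `Photo` and `QuadTree`, the leaf `capacity`, and the `max_depth` limit are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Photo:
    # Hypothetical record; the real dataset carries richer metadata.
    photo_id: str
    lon: float
    lat: float


@dataclass
class QuadTree:
    # Bounding box of this node, in degrees.
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    capacity: int = 4      # assumed leaf capacity before splitting
    depth: int = 0
    max_depth: int = 16    # assumed cap so near-duplicate points cannot split forever
    photos: List[Photo] = field(default_factory=list)
    children: Optional[List["QuadTree"]] = None

    def contains(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)

    def insert(self, photo: Photo) -> bool:
        """Store the photo in exactly one leaf; return False if out of bounds."""
        if not self.contains(photo.lon, photo.lat):
            return False
        if self.children is None:
            if len(self.photos) < self.capacity or self.depth >= self.max_depth:
                self.photos.append(photo)
                return True
            self._split()
        # any() stops at the first child that accepts the photo.
        return any(child.insert(photo) for child in self.children)

    def _split(self) -> None:
        """Turn this leaf into an inner node with four quadrants."""
        mid_lon = (self.min_lon + self.max_lon) / 2
        mid_lat = (self.min_lat + self.max_lat) / 2

        def make(a: float, b: float, c: float, d: float) -> "QuadTree":
            return QuadTree(a, b, c, d, self.capacity, self.depth + 1, self.max_depth)

        self.children = [
            make(self.min_lon, self.min_lat, mid_lon, mid_lat),  # south-west
            make(mid_lon, self.min_lat, self.max_lon, mid_lat),  # south-east
            make(self.min_lon, mid_lat, mid_lon, self.max_lat),  # north-west
            make(mid_lon, mid_lat, self.max_lon, self.max_lat),  # north-east
        ]
        pending, self.photos = self.photos, []
        for p in pending:
            any(child.insert(p) for child in self.children)

    def query(self, min_lon: float, min_lat: float,
              max_lon: float, max_lat: float) -> List[Photo]:
        """Return all photos inside the given bounding box."""
        if (max_lon < self.min_lon or min_lon > self.max_lon
                or max_lat < self.min_lat or min_lat > self.max_lat):
            return []  # query box does not overlap this node
        if self.children is None:
            return [p for p in self.photos
                    if min_lon <= p.lon <= max_lon and min_lat <= p.lat <= max_lat]
        found: List[Photo] = []
        for child in self.children:
            found.extend(child.query(min_lon, min_lat, max_lon, max_lat))
        return found
```

In this sketch, a tree covering the whole globe would be created with `QuadTree(-180.0, -90.0, 180.0, 90.0)`, and a bounding-box lookup such as `tree.query(2.0, 48.0, 3.0, 49.0)` only descends into quadrants that overlap the box, which is what makes retrieval efficient in densely photographed regions.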