Proceedings of the 2008 International Conference on Content-Based Image and Video Retrieval 2008
DOI: 10.1145/1386352.1386363

World-scale mining of objects and events from community photo collections

Abstract: In this paper, we describe an approach for mining images of objects (such as touristic sights) from community photo collections in an unsupervised fashion. Our approach relies on retrieving geotagged photos from those web-sites using a grid of geospatial tiles. The downloaded photos are clustered into potentially interesting entities through a processing pipeline of several modalities, including visual, textual and spatial proximity. The resulting clusters are analyzed and are automatically classified into obj…
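The abstract mentions retrieving geotagged photos through a grid of geospatial tiles but does not give the tile size or the query mechanism. The following is only a minimal sketch of how a bounding box might be covered by such a grid. The names `Tile` and `tile_grid`, the use of degrees as the tile unit, and the `step_deg` parameter are all assumptions made for illustration, not details taken from the paper.

```python
import math
from typing import Iterator, NamedTuple


class Tile(NamedTuple):
    """A rectangular geospatial tile (hypothetical helper type), in degrees."""
    south: float
    west: float
    north: float
    east: float


def tile_grid(south: float, west: float, north: float, east: float,
              step_deg: float) -> Iterator[Tile]:
    """Yield tiles of at most step_deg x step_deg that cover the bounding box.

    Assumption: the box does not cross the antimeridian, and a fixed step in
    degrees is an acceptable approximation of tile size.
    """
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    if south >= north or west >= east:
        raise ValueError("bounding box must have positive extent")

    rows = math.ceil((north - south) / step_deg)
    cols = math.ceil((east - west) / step_deg)
    for r in range(rows):
        for c in range(cols):
            s = south + r * step_deg
            w = west + c * step_deg
            # Clip the last row/column so tiles never extend past the box.
            yield Tile(s, w, min(s + step_deg, north), min(w + step_deg, east))
```

Each yielded tile could then serve as the bounding box of one photo query against a photo-sharing site, so that a large area is downloaded in small, independent pieces. Which service is queried and how results are paged is left open here.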

Citations: cited by 236 publications (250 citation statements)
References: 32 publications (44 reference statements)
“…In [10,11] tags and visual information together with geo-location are used for object (e.g. monuments) and event extraction.…”
Section: Multi-modal Analysis Approaches (mentioning)
confidence: 99%
“…However, more advanced techniques and applications [9,10,13,22,23,24,25] have also been presented, capable of processing more kinds of input modalities enabling the spatio-temporal and situational dimension. Scalability is addressed by a wide range of applications; however the amount of works enabling the real-time aspect is still very limited.…”
Section: Real-time Applications (mentioning)
confidence: 99%
“…Another application that combines textual and visual techniques has been proposed by Quack et al [20]. They developed a system that crawls photos on the internet and identifies clusters of images referring to a common object (physical items on fixed locations), and events (special social occasions taking place at certain times).…”
Section: Combined Analysis Of Geographical Context and Visual Content (mentioning)
confidence: 99%
“…Gammeter et al [9] extends this idea towards object-based auto-annotation of holiday photos in a large database that includes landmark buildings, statues, scenes, pieces of art, with help of external resources such as Wikipedia. In both [20] and [9], GPS coordinates are used to pre-cluster objects which may not be always available.…”
Section: Combined Analysis Of Geographical Context and Visual Content (mentioning)
confidence: 99%
“…But, though they are still developing, vision based methods are quite powerful and when combined with textual methods, very effective automated systems can be achieved. A good application of combined use of textual and visual techniques is proposed by Quack et al in [11]. Objective of the work presented in [11] is to provide a system that automatically forms high quality image databases using the large-scale internet sources.…”
Section: Related Work (mentioning)
confidence: 99%