Multimedia multimodal geocoding

Li, Lin Tzy; Pedronette, Daniel Carlos Guimarães; Almeida, Jurandy; Penatti, Otávio Augusto Bizetto; Calumby, Rodrigo Tripodi; Torres, Ricardo da Silva

doi:10.1145/2424321.2424393

Cited by 9 publications

(10 citation statements)

References 17 publications


“…Figure 5 depicts the median geotagging error (relative to the number of tags) of run1, run4 and two configurations of the approach that use the full YFCC100M dataset, one combining only the language model with feature selection and the second using all of the proposed refinements. The combination of all proposed refinements appears to result in the best geotagging accuracy in almost all tag ranges, except the [6,10] range where the base language model slightly outperforms the rest. Another noteworthy fact is that using the proposed improvements on the reduced training set (5M), i.e.…”

Section: Further Performance Analysis (mentioning)

confidence: 98%
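
The median geotagging error mentioned in the statement above can be illustrated with a minimal sketch. It assumes the error of each item is the great-circle (haversine) distance between predicted and true coordinates; the quoted text does not state the distance measure, and the function names and the Earth radius constant are illustrative choices rather than anything taken from the cited paper.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def median_error_km(predicted, actual):
    """Median distance between paired (lat, lon) predictions and ground truth."""
    return median(
        haversine_km(p[0], p[1], a[0], a[1]) for p, a in zip(predicted, actual)
    )
```

Grouping items by their number of tags before calling `median_error_km` would give one median per tag range, which is the kind of breakdown the statement describes.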

“…The second one is Hiemstra's Language Model (HLM) with re-ranking, which combined the Terrier 5 Information Retrieval (IR) engine with the HLM weighting model. In [10], Li et al applied a combination of textual, visual and audio analysis in order to geocode the given image/videos. Further, they re-ranked items using the RL-Sim algorithm and predicted the location of the images by clustering the top-rated results.…”

Section: The Mediaeval 2014 Placing Task (mentioning)

confidence: 99%
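
The last step in the statement above, predicting a location by clustering the top-ranked results, can be sketched as follows. This is not the method of Li et al. and does not implement RL-Sim; it assumes a ranked list of candidate coordinates already exists and substitutes a simple fixed-size grid for the clustering step. The function name and the parameters `k` and `cell_deg` are hypothetical.

```python
import math
from collections import defaultdict


def predict_location(ranked_coords, k=10, cell_deg=1.0):
    """Estimate a location from the k best-ranked (lat, lon) candidates.

    Candidates are binned into grid cells of cell_deg degrees (a stand-in
    for a real clustering algorithm); the mean of the most populated cell
    is returned, or None if there are no candidates.
    """
    cells = defaultdict(list)
    for lat, lon in ranked_coords[:k]:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    if not cells:
        return None
    # Largest cell wins; on ties, the cell that appeared first in the ranking.
    members = max(cells.values(), key=len)
    n = len(members)
    return (sum(lat for lat, _ in members) / n,
            sum(lon for _, lon in members) / n)
```

Taking the dominant group rather than the single best match makes the estimate less sensitive to one spurious top-ranked item.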


Geotagging Social Media Content with a Refined Language Modelling Approach

Kordopatis-Zilos, Papadopoulos, Kompatsiaris (2015). Intelligence and Security Informatics


Abstract. The problem of content geotagging, i.e. estimating the geographic position of a piece of content (text message, tagged image, etc.) when this is not explicitly available, has attracted increasing interest as the large volumes of user-generated content posted through social media platforms such as Twitter and Instagram form nowadays a key element in the coverage of news stories and events. In particular, in large-scale incidents, where location is an important factor, such as natural disasters and terrorist attacks, a large number of people around the globe post comments and content to social media. Yet, the large majority of content lacks proper geographic information (in the form of latitude and longitude coordinates) and hence cannot be utilized to the full extent (e.g., by viewing citizens reports on a map). To this end, we present a new geotagging approach that can estimate the location of a post based on its text using refined language models that are learned from massive corpora of social media content. Using a large benchmark collection, we demonstrate the improvements in geotagging accuracy as a result of the proposed refinements.


“…Estimating the geo-location of a video can be achieved based on visual features [17,14], textual information [18], social context [21], and their combinations [12,23,20]. Considering that some textual and social metadata used in multimodal frameworks are not always available in real life, visual approaches for video geocoding are important because they infer the geo-coordinates of a video using the visual content only.…”

Section: Introduction (mentioning)

confidence: 99%

“…In the most recent visual approaches for video geocoding, the Bag-of-Scenes (BoS) model has been shown to be simple and effective [17,12]. This approach first generates a dictionary of scenes, each of which represents a specific semantic concept.…”

Section: Introduction (mentioning)

confidence: 99%

Exploiting Spatial Relationship between Scenes for Hierarchical Video Geotagging

Yin, Zhang, Zimmermann (2015). Proceedings of the 5th ACM on International Conference on Multimedia Retrieval


Predicting the location of a video based on its content is a very meaningful, yet very challenging problem. Most existing work has focused on developing representative visual features and then searching for visually nearest neighbors in the development set to achieve a prediction. Interestingly, the relationship between scenes has been overlooked in prior work. Two scenes that are visually different, but frequently co-occur in same location, should naturally be considered similar for the geotagging problem. To build upon the above ideas, we propose to model the geo-spatial distributions of scenes by Gaussian Mixture Models (GMMs) and measure the distribution similarity by the Jensen-Shannon divergence (JSD). Subsequently, we present the Spatial Relationship Model (SRM) for geotagging which integrates the geo-spatial relationship of scenes into a hierarchical framework. We segment the Earth's surface into multiple levels of grids and measure the likelihood of input videos with an adaptation to region granularities. We have evaluated our approach using the YFCC100M dataset in the context of the MediaEval 2014 placing task. The total set of 35,000 geotagged videos is further divided into a training set of 25,000 videos and a test set of 10,000 videos. Our experimental results demonstrate the effectiveness of our proposed framework, as our solution achieves good accuracy and outperforms existing visual approaches for video geotagging.
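
The distribution similarity described in the abstract above can be illustrated with a minimal sketch of the Jensen-Shannon divergence. As a simplifying assumption, each scene's geo-spatial distribution is represented here as a discrete histogram over the same grid cells rather than as the Gaussian Mixture Model the abstract describes; the function names are illustrative.

```python
import math


def normalize(counts):
    """Turn per-cell occurrence counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; zero-mass terms of p are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD between two distributions over the same grid cells (0 = identical, 1 = disjoint)."""
    # The mixture m is positive wherever p or q is, so the KL terms are well defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Under this discretized view, two scenes that look different but occur in the same cells obtain a low divergence, which matches the intuition in the abstract that co-located scenes should count as similar.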


“…Park et al [11] aim to find the heading of a query image by utilizing Google Street View™. Li et al [9] and Gallagher et al [6] propose a multimodal geo-tagging system that exploits visual and textual descriptions associated with videos and photos, and show that this multimodal approach yields better results than those of a single modal approach. However, these methods suffer from non-uniform geographical spatial prior; crowd-sourced georeferenced images tend to be concentrated in urban areas, at popular landmarks, or around frequently-traveled places.…”

Section: Introduction (mentioning)

confidence: 99%

Tag configuration matcher for geo-tagging

Park, Chen, Shafique (2013). Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems


It is common today for even consumer-grade cameras to tag images and videos with the location of the imagery on the earth's surface. Some imagery, however, does not have a geolocation tag and it thus becomes necessary to ascertain the location of the camera, image, or objects in the scene. For such imagery, users must work hard to deduce geo-locations using reference data. Geo-tagging of such image/video is an extremely time-consuming and labor-intensive activity that often meets with limited success. In this paper, we propose a system to estimate the geo-location and viewing direction of a query photo using geometric configuration of objects in the query. Our experiment using a set of ground-truth data within our proposed system shows promising results.
