No abstract
Which venue is a tweet posted from? We call this a fine-grained geolocation problem. Given an observed tweet, the task is to infer its discrete posting venue, e.g., a specific restaurant. This recovers the venue context and differs from prior work, which geolocats tweets to location coordinates or cities/neighborhoods. First, we conduct empirical analysis to uncover venue and user characteristics for improving geolocation. For venues, we observe spatial homophily , in which venues near each other have more similar tweet content (i.e., text representations) compared to venues further apart. For users, we observe that they are spatially focused and more likely to visit venues near their previous visits. We also find that a substantial proportion of users post one or more geocoded tweet(s), thus providing their location history data. We then propose geolocation models that exploit spatial homophily and spatial focus characteristics plus posting time information. Our models rank candidate venues of test tweets such that the actual posting venue is ranked high. To better tune model parameters, we introduce a learning-to-rank framework. Our best model significantly outperforms state-of-the-art baselines. Furthermore, we show that tweets without any location-indicative words can be geolocated meaningfully as well.
In fine-grained tweet geolocation, tweets are linked to the specific venues (e.g., restaurants, shops) from which they were posted. This explicitly recovers the venue context that is essential for applications such as location-based advertising or user profiling. For this geolocation task, we focus on geolocating tweets that are contained in tweet sequences. In a tweet sequence, tweets are posted from some latent venue(s) by the same user and within a short time interval. This scenario arises from two observations: (1) It is quite common that users post multiple tweets in a short time and (2) most tweets are not geocoded. To more accurately geolocate a tweet, we propose a model that performs query expansion on the tweet (query) using two novel approaches. The first approach temporal query expansion considers users’ staying behavior around venues. The second approach visitation query expansion leverages on user revisiting the same or similar venues in the past. We combine both query expansion approaches via a novel fusion framework and overlay them on a Hidden Markov Model to account for sequential information. In our comprehensive experiments across multiple datasets and metrics, we show our proposed model to be more robust and accurate than other baselines.
Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Recently, researchers have applied pre-trained visual-linguistic models to perform the multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for the hateful meme classification task remains unclear. For instance, it is unclear if these models are able to capture the derogatory or slurs references in multimodality (i.e., image and text) of the hateful memes. To fill this research gap, this paper propose three research questions to improve our understanding of these visual-linguistic models performing the hateful meme classification task. We found that the image modality contributes more to the hateful meme classification task, and the visual-linguistic models are able to perform visualtext slurs grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases, which resulted in false-positive predictions. CCS CONCEPTS• Computing methodologies → Natural language processing; Computer vision representations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.