HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
International audience. Microblogs such as Twitter are characterized by the richness and recency of the information their users share during major events. However, automatically mining these microblogs for information, or for the users who share it, is very challenging because of the huge variety of unstructured data streamed through them. This work proposes a ranking and classification model for identifying users who share useful information during a specified event. The model is based on a novel set of features that can be computed in real time. These features are designed to take into account both the on-topic and off-topic activities of a user. Once each user is characterized by a feature vector, a supervised machine learning tool is trained to classify users as either prominent or not. The model was tested on data shared during a flooding disaster and performed very well. The results show the effectiveness of the proposed model for both the classification and the ranking of prominent users in such events, and the importance of adjusting the on-topic features by the off-topic ones when describing users' activities.
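The idea of adjusting on-topic features by off-topic ones can be illustrated with a minimal sketch. The abstract does not list the actual features or name the classifier, so everything here is an assumption made for illustration: the `UserActivity` fields, the ratio used as the adjustment, and the mean score that stands in for a trained classifier's output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserActivity:
    """Hypothetical per-user counts collected during an event window."""
    name: str
    on_topic_tweets: int
    off_topic_tweets: int
    on_topic_retweets_received: int
    off_topic_retweets_received: int


def adjusted(on_topic: int, off_topic: int) -> float:
    """Share of an activity that is on-topic, in [0, 1]; 0.0 if no activity."""
    total = on_topic + off_topic
    return on_topic / total if total else 0.0


def feature_vector(user: UserActivity) -> List[float]:
    """Assumed features: on-topic counts adjusted by off-topic counts."""
    return [
        adjusted(user.on_topic_tweets, user.off_topic_tweets),
        adjusted(user.on_topic_retweets_received, user.off_topic_retweets_received),
    ]


def rank_users(users: List[UserActivity]) -> List[str]:
    """Rank by mean adjusted feature (a stand-in for a classifier score)."""
    scored = [(sum(feature_vector(u)) / 2, u.name) for u in users]
    return [name for _score, name in sorted(scored, reverse=True)]


# A very active but mostly off-topic account versus a focused one.
users = [
    UserActivity("busy_account", 10, 90, 5, 95),
    UserActivity("focused_account", 8, 2, 6, 4),
]
print(rank_users(users))
```

In this toy data the busy account has more on-topic tweets in absolute terms, yet the focused account ranks first once off-topic activity is taken into account.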
During crisis events such as disasters, real-time information retrieval (IR) from microblogs becomes essential. However, the huge amount and variety of information shared in real time during such events greatly complicates this task. Unlike existing IR approaches based on content analysis, we propose to tackle this problem with a user-centric IR approach that identifies and tracks prominent microblog users who are likely to share relevant and exclusive information at an early stage of each analyzed event phase. This approach ensures real-time access to the valuable microblog information required by emergency teams. We propose a phase-aware probabilistic model for predicting and ranking prominent microblog users over time according to their behavior, using Mixture of Gaussians Hidden Markov Models (MoG-HMM). The model uses a new user representation that takes into account both user and event specificities over time. This representation comprises the following new aspects: (1) modeling the evolution of microblog users' behavior across the different event phases; (2) characterizing users' activity over time through a temporal sequence representation; (3) time-series-based selection of the most discriminative features; and (4) prediction of prominent users using probabilistic phase-aware models learned a priori. We conducted experiments on flooding events: we trained our identification models on a dataset covering the "Alpes-Maritimes floods" and tested their identification performance on a new dataset covering another flooding disaster, the "Herault floods". The results show that our model significantly outperforms phase-unaware models and identifies most of the prominent users at an early stage of each event phase.
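As a rough illustration of how a MoG-HMM can score a user's activity sequence, the sketch below runs the scaled forward algorithm on a one-dimensional sequence and compares the likelihood under two models. The two-state structure, all parameter values, and the decision rule (pick the model with the higher likelihood) are assumptions for illustration only, not the models learned in the paper. A phase-aware variant would presumably keep one such pair of models per event phase.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def mixture_pdf(x, components):
    """Mixture-of-Gaussians density; components are (weight, mean, std)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def log_likelihood(seq, start, trans, emissions):
    """Log-likelihood of seq under an HMM, via the scaled forward algorithm."""
    n = len(start)
    alpha = [start[i] * mixture_pdf(seq[0], emissions[i]) for i in range(n)]
    total = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * mixture_pdf(seq[t], emissions[j])
                for j in range(n)
            ]
        scale = sum(alpha)          # rescale to avoid numerical underflow
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total


# Hypothetical parameters: (start probabilities, transitions, emissions per state).
START = [0.5, 0.5]
TRANS = [[0.7, 0.3], [0.3, 0.7]]
PROMINENT = (START, TRANS, [[(1.0, 1.0, 1.0)], [(0.5, 7.0, 1.5), (0.5, 10.0, 2.0)]])
ORDINARY = (START, TRANS, [[(1.0, 0.5, 1.0)], [(0.5, 2.0, 1.0), (0.5, 3.0, 1.5)]])


def is_prominent(seq):
    """Assumed decision rule: the better-fitting model wins."""
    return log_likelihood(seq, *PROMINENT) > log_likelihood(seq, *ORDINARY)


print(is_prominent([7, 9, 8, 6]))  # e.g. on-topic posts per time slot
```

Each sequence element stands for one feature value per time slot, which is a simplified stand-in for the temporal sequence representation described above.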
Abstract. Microblogs have proved their potential to attract people from all over the world to voluntarily report what is happening around them during unexpected events. However, retrieving relevant information from the huge amount of data shared in real time in these microblogs remains complex. This paper proposes a new system named MASIR for real-time information retrieval from microblogs during unexpected events. MASIR is based on a decentralized and collaborative multi-agent approach that analyzes the profiles of users interested in a given event in order to detect the most prominent ones, which are then tracked in real time. Real-time monitoring of these users gives direct access to valuable fresh information. Our experiments show that MASIR simplifies the real-time detection and tracking of the most prominent users by exploring both the old and fresh information shared during the event, and that it outperforms standard centrality measures by using a time-sensitive ranking model.
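The contrast between a standard centrality measure and a time-sensitive ranking can be sketched as follows. This is not MASIR's ranking model, which the abstract does not specify: the exponential decay, the `half_life` parameter, and the interaction tuples are assumptions chosen to show why recency can change the ranking.

```python
from collections import defaultdict


def in_degree(interactions):
    """Standard centrality baseline: count interactions received per user."""
    counts = defaultdict(int)
    for _source, target, _timestamp in interactions:
        counts[target] += 1
    return dict(counts)


def time_sensitive_scores(interactions, now, half_life):
    """Assumed scoring: each interaction's credit halves every half_life units."""
    scores = defaultdict(float)
    for _source, target, timestamp in interactions:
        age = now - timestamp
        scores[target] += 0.5 ** (age / half_life)
    return dict(scores)


def top_user(scores):
    """User with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical (source, target, timestamp) retweet or mention events.
interactions = [
    ("u1", "old_hub", 0),
    ("u2", "old_hub", 0),
    ("u3", "old_hub", 0),
    ("u4", "fresh_source", 100),
    ("u5", "fresh_source", 100),
]

print(top_user(in_degree(interactions)))
print(top_user(time_sensitive_scores(interactions, now=100, half_life=10)))
```

With these toy events the in-degree baseline favours the account that was popular early on, while the decayed score favours the account receiving attention now.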
Abstract. The response phase in a disaster case is often considered to be the most critical in terms of saving lives and dealing with irreversible damage. The timely provision of geospatial information is crucial in the decision-making process. Thus, there is a need for the integration of heterogeneous spatial databases which are inherently distributed and created under different projects by various organizations. The integration of all relevant data for timely decision making is a challenging task due to syntactic, schematic and semantic heterogeneity. This paper aims to propose a framework for the integration of heterogeneous spatial databases using novel approaches, such as web services and ontologies. We focus on providing solutions for the three levels of heterogeneity, in order to be able to interrogate the content of the different databases conveniently. Based on the proposed framework, we implemented a use case using heterogeneous data belonging to La Rochelle city in France.
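A minimal sketch of how schematic and semantic heterogeneity might be bridged with a shared vocabulary is given below. The source names, field names, concept labels, and records are all hypothetical, and plain dictionaries stand in for the ontology. Syntactic heterogeneity and the web-service layer of the framework are not covered here.

```python
# Schematic heterogeneity: each source names the same attributes differently.
FIELD_MAP = {
    "city_db": {"nom": "name", "type": "facility_type"},
    "rescue_db": {"label": "name", "category": "facility_type"},
}

# Semantic heterogeneity: local vocabularies mapped to shared concepts.
CONCEPT_MAP = {
    "hopital": "Hospital",
    "hospital": "Hospital",
    "ecole": "School",
    "school": "School",
}


def to_shared(source, record):
    """Translate one local record into the shared schema and vocabulary."""
    fields = FIELD_MAP[source]
    shared = {fields[key]: value for key, value in record.items() if key in fields}
    local_type = str(shared.get("facility_type", "")).lower()
    shared["facility_type"] = CONCEPT_MAP.get(local_type, "Unknown")
    return shared


def query(sources, facility_type):
    """Return names of all facilities of a shared concept, across sources."""
    names = []
    for source, records in sources.items():
        for record in records:
            shared = to_shared(source, record)
            if shared["facility_type"] == facility_type:
                names.append(shared["name"])
    return sorted(names)


# Hypothetical records from two independently designed databases.
sources = {
    "city_db": [
        {"nom": "Facility B", "type": "hopital"},
        {"nom": "Facility C", "type": "ecole"},
    ],
    "rescue_db": [{"label": "Facility A", "category": "Hospital"}],
}

print(query(sources, "Hospital"))
```

A single query against the shared concept then reaches both databases without the caller knowing either local schema.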