In this article we present GeoTxt, a scalable geoparsing system for the recognition and geolocation of place names in unstructured text. GeoTxt offers six named entity recognition (NER) algorithms for place name recognition, and utilizes an enterprise search engine for the indexing, ranking, and retrieval of toponyms, enabling scalable geoparsing for streaming text. GeoTxt offers a flexible application programming interface (API), allowing for customized attribute and/or spatial ranking of retrieved toponyms. We evaluate the system on a corpus of manually geo-annotated tweets.First, we benchmark the performance of the six NERs that GeoTxt provides access to. Second, we assess GeoTxt toponym resolution accuracy incrementally, demonstrating improvements in toponym resolution achieved (or not achieved) by adding specific heuristics and disambiguation methods. Compared to using the GeoNames web service, GeoTxt's toponym resolution demonstrates a 20% accuracy gain. Our results show that places mentioned in the same tweet do not tend to be geographically proximate.
InterfaceAutomated Label Verify Label Fig. 1. Our interactive learning framework allows users to train text relevance classifiers in real-time to improve situational awareness.In this example, a real-time tweet regarding a car accident is incorrectly classified as "Irrelevant". Through the SMART 2.0 interface, the user can view its label and correct it to "Relevant", thereby retraining and improving the classifier for incoming streaming data.Abstract-Various domain users are increasingly leveraging real-time social media data to gain rapid situational awareness. However, due to the high noise in the deluge of data, effectively determining semantically relevant information can be difficult, further complicated by the changing definition of relevancy by each end user for different events. The majority of existing methods for short text relevance classification fail to incorporate users' knowledge into the classification process. Existing methods that incorporate interactive user feedback focus on historical datasets. Therefore, classifiers cannot be interactively retrained for specific events or user-dependent needs in real-time. This limits real-time situational awareness, as streaming data that is incorrectly classified cannot be corrected immediately, permitting the possibility for important incoming data to be incorrectly classified as well. We present a novel interactive learning framework to improve the classification process in which the user iteratively corrects the relevancy of tweets in real-time to train the classification model on-the-fly for immediate predictive improvements. We computationally evaluate our classification model adapted to learn at interactive rates. Our results show that our approach outperforms state-of-the-art machine learning models. In addition, we integrate our framework with the extended Social Media Analytics and Reporting Toolkit (SMART) 2.0 system, allowing the use of our interactive learning framework within a visual analytics system tailored for real-time situational awareness. To demonstrate our framework's effectiveness, we provide domain expert feedback from first responders who used the extended SMART 2.0 system.Index Terms-Interactive machine learning, human-computer interaction, social media analytics, emergency/disaster management, situational awareness
Feature selection is used in machine learning to improve predictions, decrease computation time, reduce noise, and tune models based on limited sample data. In this article, we present FeatureExplorer, a visual analytics system that supports the dynamic evaluation of regression models and importance of feature subsets through the interactive selection of features in high-dimensional feature spaces typical of hyperspectral images. The interactive system allows users to iteratively refine and diagnose the model by selecting features based on their domain knowledge, interchangeable (correlated) features, feature importance, and the resulting model performance.
Recently telecommunication industry benefits from infrastructure sharing, one of the most fundamental enablers of cloud computing, leading to emergence of the Mobile Virtual Network Operator (MVNO) concept. The most momentous intents by this approach are the support of on-demand provisioning and elasticity of virtualized mobile network components, based on data traffic load. To realize it, during operation and management procedures, the virtualized services need be triggered in order to scale-up/down or scale-out/in an service instance. In this paper we propose an architecture called MOBaaS (Mobility and Bandwidth Availability Prediction as a Service), comprising two algorithms in order to predict user(s) mobility and network link bandwidth availability, that can be implemented in cloud based mobile network structure and can be used as a support service by any other virtualized mobile network service. MOBaaS can provide prediction information in order to generate required triggers for on-demand deploying, provisioning, disposing of virtualized network components. This information can be used for self-adaptation procedures and optimal network function configuration during run-time operation, as well. Through the preliminary experiments with the prototype implementation on the OpenStack platform, we evaluated and confirmed the feasibility and the effectiveness of the prediction algorithms and the proposed architecture
Measurements of human interaction through proxies such as social connectedness or movement patterns have proved useful for predictive modeling of COVID-19, which is a challenging task, especially at high spatial resolutions. In this study, we develop a Spatiotemporal autoregressive model to predict county-level new cases of COVID-19 in the coterminous US using spatiotemporal lags of infection rates, human interactions, human mobility, and socioeconomic composition of counties as predictive features. We capture human interactions through 1) Facebook- and 2) cell phone-derived measures of connectivity and human mobility, and use them in two separate models for predicting county-level new cases of COVID-19. We evaluate the model on 14 forecast dates between 2020/10/25 and 2021/01/24 over one- to four-week prediction horizons. Comparing our predictions with a Baseline model developed by the COVID-19 Forecast Hub indicates an average 6.46% improvement in prediction Mean Absolute Errors (MAE) over the two-week prediction horizon up to 20.22% improvement in the four-week prediction horizon, pointing to the strong predictive power of our model in the longer prediction horizons.
Social media platforms are filled with social spambots. Detecting these malicious accounts is essential, yet challenging, as they continually evolve to evade detection techniques. In this article, we present VASSL, a visual analytics system that assists in the process of detecting and labeling spambots. Our tool enhances the performance and scalability of manual labeling by providing multiple connected views and utilizing dimensionality reduction, sentiment analysis and topic modeling, enabling insights for the identification of spambots. The system allows users to select and analyze groups of accounts in an interactive manner, which enables the detection of spambots that may not be identified when examined individually. We present a user study to objectively evaluate the performance of VASSL users, as well as capturing subjective opinions about the usefulness and the ease of use of the tool.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.