The explosive growth of social networks that publish real-time content raises the question of whether their feeds can complement traditional sensors to achieve augmented sensing capabilities. One such capability is to explain anomalous sensor readings. In our previous conference paper, we built an automated anomaly clarification service, called ClariSense, that explains sensor anomalies using social network feeds (from Twitter). In this extended work, we present an enhanced anomaly explanation system that augments our base algorithm by considering both (i) the credibility of social feeds and (ii) the spatial locality of detected anomalies. The work is geared specifically toward explaining small-footprint anomalies, such as vehicular traffic accidents. The original system used information gain to select the most informative microblog items for explaining physical sensor anomalies. In this paper, we show that significant improvements in our ability to explain small-footprint anomalies are achieved by accounting for information credibility and by further discriminating among high-information-gain items according to the size of their spatial footprint. Hence, items that lack sufficient corroboration, and items whose spatial footprint in the blogosphere is not specific to the approximate location of the physical anomaly, receive less consideration. We briefly demonstrate the workings of such a system on a variety of real-world anomalous events, comparing their causes, as identified by ClariSense+, against ground truth for validation. A more systematic evaluation is performed using vehicular traffic anomalies. Specifically, we consider real-time traffic flow feeds shared by the California traffic system. When flow anomalies are detected, our system automatically diagnoses their root cause by correlating the anomaly with feeds on Twitter. For evaluation purposes, the identified cause is then retroactively compared to official traffic and incident reports, which we take as ground truth. Results show strong agreement between our automatically selected explanations and the ground-truth data.
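To make the scoring idea concrete, the following is a minimal sketch, not the paper's implementation: it scores candidate terms by information gain over tweets labeled as co-occurring with the anomaly window, then discounts terms with weak corroboration (few independent authors) or a diffuse geotagged footprint. All function names, weights, and the distance geometry are illustrative assumptions.

```python
import math

def entropy(p):
    """Binary entropy of a positive-label fraction p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(tweets, term):
    """IG of a term w.r.t. the 'anomalous' label on each tweet.
    Each tweet is a dict: {'terms': set, 'anomalous': bool,
                           'author': str, 'xy': (km_x, km_y) or None}."""
    base = entropy(sum(t['anomalous'] for t in tweets) / len(tweets))
    for part in (
        [t for t in tweets if term in t['terms']],
        [t for t in tweets if term not in t['terms']],
    ):
        if part:
            p = sum(t['anomalous'] for t in part) / len(part)
            base -= len(part) / len(tweets) * entropy(p)
    return base

def dist_km(a, b):
    """Euclidean distance; assumes coordinates already projected to km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def score(tweets, term, anomaly_xy, radius_km=5.0):
    """Illustrative ClariSense+-style score: IG discounted by lack of
    corroboration and by a spatial footprint wider than the anomaly site."""
    mentions = [t for t in tweets if term in t['terms']]
    authors = {t['author'] for t in mentions}
    credibility = 1.0 - 1.0 / (1 + len(authors))   # -> 1 with many independent authors
    geotagged = [t['xy'] for t in mentions if t.get('xy')]
    if geotagged:
        spread = max(dist_km(xy, anomaly_xy) for xy in geotagged)
        locality = radius_km / max(spread, radius_km)  # 1 if tight, <1 if diffuse
    else:
        locality = 1.0
    return information_gain(tweets, term) * credibility * locality
```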
Abstract-This paper develops an algorithm that exploits picture-oriented social networks to localize urban events. We choose picture-oriented networks because taking a picture requires physical proximity, thereby revealing the location of the photographed event. Furthermore, most modern cell phones are equipped with GPS, making picture location and time metadata commonly available. We consider Instagram as the social network of choice and limit ourselves to urban events (noting that the majority of the world's population lives in cities). The paper introduces a new adaptive localization algorithm that does not require the user to specify manually tunable parameters. We evaluate the performance of our algorithm on various real-world datasets, comparing it against several baseline methods. The results show that our method achieves the best recall, the fewest false positives, and the lowest average error in localizing urban events.
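The abstract does not spell out the adaptive mechanism, so the sketch below shows one generic way to localize events from geotagged photos without a hand-tuned radius: a density-based clustering pass whose neighborhood size is derived from the data itself. The k-NN heuristic, library choice, and centroid read-out are assumptions, not the paper's algorithm.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def localize_events(points_xy, k=4):
    """points_xy: (n, 2) array of geotagged photo positions (projected, km).
    The clustering radius is estimated from the median distance to each
    point's k-th nearest neighbor, instead of a user-supplied eps."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(points_xy)
    dists, _ = nn.kneighbors(points_xy)        # column 0 is the point itself
    eps = float(np.median(dists[:, k]))        # data-driven neighborhood size
    labels = DBSCAN(eps=eps, min_samples=k).fit_predict(points_xy)
    events = []
    for lbl in set(labels) - {-1}:             # -1 marks background noise
        members = points_xy[labels == lbl]
        events.append(members.mean(axis=0))    # event location = cluster centroid
    return events
```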
Abstract-This paper develops an algorithm to identify and geo-locate real-world events that may be present as social activity signals in two different social networks. Specifically, we focus on content shared by users on Twitter and Instagram in order to design a system capable of fusing data across multiple networks. Past work has demonstrated that it is indeed possible to detect physical events using various social network platforms. However, many of these signals need corroboration in order to handle events that lack proper support within a single network. We leverage this insight to design an unsupervised approach that correlates event signals across multiple social networks and estimates the location of the event occurrence. We evaluate our algorithm using both simulations and real-world datasets collected from Twitter and Instagram. The results indicate that our algorithm significantly improves false-positive elimination and attains high precision compared to baseline methods on real-world datasets.
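As a rough illustration of cross-network corroboration, assuming nothing about the paper's actual matching rule, the sketch below keeps an event candidate only when a candidate from the other network falls inside a space-time window around it. The Candidate fields and both thresholds are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    xy: tuple    # projected (km) location estimate of the event signal
    t: float     # detection time, in hours
    source: str  # 'twitter' or 'instagram'

def corroborate(candidates, max_km=1.0, max_hours=2.0):
    """Retain a candidate only if the *other* network produced a signal
    nearby in both space and time; uncorroborated candidates are dropped."""
    kept = []
    for c in candidates:
        for d in candidates:
            if (d.source != c.source
                    and math.hypot(c.xy[0] - d.xy[0], c.xy[1] - d.xy[1]) <= max_km
                    and abs(c.t - d.t) <= max_hours):
                kept.append(c)
                break
    return kept
```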
The widespread use of road sensors has generated huge amounts of traffic data, which can be mined and put to a variety of uses. Finding frequent trajectories in the road network of a big city helps summarize how traffic behaves in the city. This is useful for city planning and traffic-routing mechanisms, and may be used to suggest the best routes given the region, road, time of day, day of week, season, weather, and events. Beyond frequent patterns, even less frequent events, such as those observed during heavy snowfall, other extreme weather conditions, long traffic jams, or accidents, may follow a periodic occurrence and hence may be useful to mine. The problem of mining frequent patterns from road traffic data has been addressed in previous work using context knowledge of the city's road network. In this paper, we develop a method to mine spatiotemporal periodic patterns in traffic data and use these periodic behaviors to summarize the large road network. The first step is to find periodic patterns in the speed data of individual road sensor stations and to represent each station's periodic behavior, via its detected periods, using probability distribution matrices. Then, we use density-based clustering to group the sensors on the road network based on both the similarity of their periodic behavior and their geographical distance, thus merging similar nodes to form a road network with larger but fewer nodes.
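The two steps named above (per-station period detection, then behavior-plus-geography clustering) can be sketched as follows; the autocorrelation heuristic and the mixing weight alpha are illustrative assumptions rather than the paper's exact method.

```python
import numpy as np

def dominant_period(speeds, min_lag=1):
    """Estimate a sensor's dominant period from its speed series via the
    largest autocorrelation peak (a stand-in for the period-mining step)."""
    x = np.asarray(speeds, dtype=float)
    x = x - x.mean()
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]
    if ac[0] == 0:                   # constant series: no periodicity
        return None
    ac = ac / ac[0]
    lag = min_lag + int(np.argmax(ac[min_lag:]))
    return lag                       # period, in samples

def station_distance(period_a, period_b, xy_a, xy_b, alpha=0.5):
    """Combined dissimilarity for density-based clustering of stations:
    periodic-behavior difference mixed with geographic distance (km).
    alpha is an assumed weighting between the two components."""
    behavior = abs(period_a - period_b) / max(period_a, period_b)
    geo = float(np.hypot(xy_a[0] - xy_b[0], xy_a[1] - xy_b[1]))
    return alpha * behavior + (1 - alpha) * geo
```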
In many of today's online applications that facilitate data exploration, results from information filters such as recommender systems are displayed alongside traditional search tools. However, the effect of prediction algorithms on users who are performing open-ended data exploration tasks through a search interface is not well understood. This paper describes a study of three interface variations of a tool for analyzing commuter traffic anomalies in the San Francisco Bay Area. The system supports novel interaction between a prediction algorithm and a human analyst, and is designed to explore the boundaries, limitations, and synergies of both. The degree of explanation of the underlying data and the algorithmic process was varied experimentally across the interfaces. The experiment (N=197) assessed the impact of algorithm transparency/explanation on data analysis tasks in terms of search success, general insight into the underlying data set, and user experience. Results show that 1) the presence of recommendations in the user interface produced a significant improvement in recall of anomalies, 2) participants were able to detect anomalies in the data that were missed by the algorithm, 3) participants who used the prediction algorithm performed significantly better when estimating quantities in the data, and 4) participants in the most explanatory condition were the least biased by the algorithm's predictions when estimating quantities.