The aim of this study is the computerization of the argument Delphi method. The Delphi method is designed mainly for qualitative prediction within a group of experts: the experts make predictions, and a facilitator moderates successive rounds until the group reaches a level of consensus. Argument Delphi, in contrast to classical Delphi, is built on contradictions among the experts' ideas. It focuses on a discussion topic and asks experts to create new arguments and to criticize the arguments of other experts. After a certain level of contradiction, the method yields a set of contradictory, criticized arguments and builds a decision over these antitheses, as in the Hegelian approach. This is the first time the argument Delphi method has been modeled as a graph of arguments and the qualitative decision problem has thereby been transformed into a graph problem. This paper is also the first to propose argument aggregation and evaluation methods for argument Delphi. Moreover, the computerized version of argument Delphi is applied to a real-world problem with crowd involvement through Facebook: predicting end-of-year petroleum prices, on which more than 100 contributors from around the world argued with and criticized one another. The paper also discusses the findings of this case study.
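To make the graph-of-arguments idea concrete, here is a minimal sketch (not the paper's actual model or evaluation method): arguments become graph nodes, criticisms become directed "attack" edges, and a standard grounded labelling from abstract argumentation decides which arguments survive. All names and the toy example are illustrative assumptions.

```python
def grounded_labels(attacks, arguments):
    """Label each argument 'in', 'out', or 'undecided' via the
    grounded-semantics fixpoint: an argument is 'in' once all of its
    attackers are 'out', and 'out' once some attacker is 'in'."""
    attackers = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)
    labels = {a: "undecided" for a in arguments}
    changed = True
    while changed:
        changed = False
        for a in arguments:
            if labels[a] != "undecided":
                continue
            if all(labels[b] == "out" for b in attackers[a]):
                labels[a] = "in"       # every attacker defeated
                changed = True
            elif any(labels[b] == "in" for b in attackers[a]):
                labels[a] = "out"      # defeated by an accepted attacker
                changed = True
    return labels

# Toy chain of criticisms: B attacks A, C attacks B.
labels = grounded_labels([("B", "A"), ("C", "B")], ["A", "B", "C"])
print(labels)  # {'A': 'in', 'B': 'out', 'C': 'in'}
```

C is unattacked and therefore accepted; that defeats B, which in turn reinstates A, mirroring how a criticized criticism can restore an original argument.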
INDEX TERMS Delphi technique, decision support systems, qualitative methods, graph theory, social network, crowd opinion.
Collecting observations from international news coverage worldwide and using the TABARI software to code events, the Global Database of Events, Language, and Tone (GDELT) is the only global, georeferenced political event dataset, with more than 250 million observations covering every country in the world from January 1, 1979 to the present, updated daily. The purpose of this widely used dataset is to help understand and uncover spatial, temporal, and perceptual trends and behaviors of the social and international system. To query such big geospatial data, traditional RDBMSs no longer suffice, and parallel distributed solutions have become a necessity. The MapReduce paradigm has proved to be a scalable platform for processing and analyzing Big Data in the cloud. Hadoop, an open-source implementation of MapReduce, has been widely adopted in academia and industry. However, Hadoop is not well equipped for spatial data and falls short in running time. SpatialHadoop is an extension of Hadoop that supports spatial data. In this paper, we present the Geographic Information System Querying Framework (GISQF) for processing massive spatial data. The framework is built on top of the open-source SpatialHadoop system, which exploits two-layer spatial indexing to speed up query processing. We show how this solution outperforms Hadoop query processing by orders of magnitude on a 60 GB GDELT dataset, and we present results for three types of queries: longitude-latitude point queries, circle-area queries, and aggregation queries.
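The predicate behind a circle-area query can be sketched as follows; this is an illustrative stand-in for the filter GISQF's indexed scan would apply, not its actual API, and the event tuples and field layout are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def circle_area_query(events, center_lat, center_lon, radius_km):
    """Keep (event_id, lat, lon) tuples that fall inside the circle."""
    return [e for e in events
            if haversine_km(e[1], e[2], center_lat, center_lon) <= radius_km]

events = [("e1", 40.7, -74.0),   # near New York
          ("e2", 41.9, 12.5),    # Rome
          ("e3", 40.8, -73.9)]   # near New York
# Events within 50 km of (40.71, -74.00):
print(circle_area_query(events, 40.71, -74.00, 50.0))
```

A spatial index lets the system prune whole partitions whose bounding boxes lie entirely outside the circle before this per-record test runs, which is where the order-of-magnitude speedup over plain Hadoop comes from.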
Anomaly detection refers to the identification of patterns in a dataset that do not conform to expected behavior. Such non-conformant patterns typically correspond to samples of interest and carry different labels in different domains, such as outliers, anomalies, exceptions, and malware. A daunting challenge is detecting anomalies in rapid, voluminous streams of data. This paper presents a novel, generic, real-time distributed anomaly detection framework for multi-source stream data. As a case study, we investigate anomaly detection for a multi-source VMware-based cloud data center that maintains a large number of virtual machines (VMs). The framework continuously monitors VMware performance stream data related to CPU statistics (e.g., load and usage). It collects data simultaneously from all VMs connected to the network and, when it identifies abnormal behavior in the collected data, notifies the resource manager to reschedule its CPU resources dynamically. A semi-supervised clustering technique builds a model from benign training data only; during testing, a data instance that deviates significantly from the model is flagged as an anomaly. Effective anomaly detection in this setting demands a distributed framework with high throughput and low latency. Distributed streaming frameworks such as Apache Storm, Apache Spark, and S4 are designed for lower data processing time and higher throughput than standard centralized frameworks. We experimentally compared the average per-tuple processing latency during clustering and prediction in both Spark and Storm and demonstrated that, on average, Spark processes a tuple much more quickly than Storm.
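The semi-supervised idea described above can be sketched in a few lines: cluster benign CPU-usage samples, then flag test samples that fall far from every cluster centroid. The tiny one-dimensional k-means and the distance-threshold rule are illustrative assumptions, not the paper's exact algorithm or parameters.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain 1-D k-means over a list of numbers."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def fit_threshold(points, centroids, slack=1.5):
    """Largest benign distance to a centroid, padded by a slack factor."""
    dists = [min(abs(p - c) for c in centroids) for p in points]
    return max(dists) * slack

def is_anomaly(p, centroids, threshold):
    """Flag a sample whose nearest centroid is beyond the threshold."""
    return min(abs(p - c) for c in centroids) > threshold

# Benign CPU load (%) with two normal operating modes: idle and busy.
benign = [10, 12, 11, 50, 52, 49, 48, 13]
cent = kmeans(benign, k=2)
thr = fit_threshold(benign, cent)
print(is_anomaly(95, cent, thr))  # far from both modes -> True
print(is_anomaly(11, cent, thr))  # inside the idle mode -> False
```

In the streaming deployment, the model (centroids and threshold) would be built once from benign history and the cheap `is_anomaly` test applied per tuple, which is what makes per-tuple latency the metric that separates Spark from Storm.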