The Internet of Things (IoT) generates massive streams of data that call for increasingly efficient real-time processing. Designing and implementing a big data service for the real-time processing of such data requires extensive knowledge of both the input load and the data distribution, so that the service can cope with the workload. In this paper, we study the challenges inherent to the real-time processing of massive data flows from the IoT. We provide a detailed analysis of traces gathered from a well-known sport-oriented healthcare application in order to illustrate our conclusions from a big data perspective.
The proliferation of GPS-enabled devices leads to the massive generation of geotagged data sets, recently termed Big Location Data. Such data allow users to explore and analyse information in space and time, and require an architecture that scales with the workload of insertions and location-temporal queries from thousands to millions of users. Most large-scale key-value data storage solutions provide only a single one-dimensional index, which does not natively support efficient multidimensional queries. In this paper, we propose GeoTrie, a scalable architecture built by coalescing any number of machines organized on top of a Distributed Hash Table. The key idea of our approach is to provide a distributed global index that scales with the number of nodes and provides natural load balancing for insertions and location-temporal range queries. We assess our solution using the largest public multimedia data set released by Yahoo!, which includes millions of geotagged multimedia files.
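
To illustrate why a single one-dimensional index struggles with location-temporal queries, and one common way around it, the sketch below maps a (latitude, longitude, timestamp) tuple onto a single binary key by bit interleaving (a Z-order encoding). This is a generic technique and is not claimed to be the encoding GeoTrie actually uses; the function names, the 16-bit resolution, and the timestamp bounds are all assumptions made for the example.

```python
def quantize(value, lo, hi, bits):
    """Map value in [lo, hi] to an integer cell in [0, 2**bits - 1]."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    scale = (1 << bits) - 1
    return int((value - lo) / (hi - lo) * scale)


def interleave(coords, bits):
    """Interleave the bits of each coordinate, most significant bit first."""
    key = 0
    for i in range(bits - 1, -1, -1):
        for c in coords:
            key = (key << 1) | ((c >> i) & 1)
    return key


def location_time_key(lat, lon, ts, bits=16, t_min=0, t_max=2**32 - 1):
    """Build a binary-string key for a point (hypothetical helper).

    The resolution (bits) and the timestamp range are illustrative
    assumptions, not values taken from the paper.
    """
    cells = [
        quantize(lat, -90.0, 90.0, bits),
        quantize(lon, -180.0, 180.0, bits),
        quantize(ts, t_min, t_max, bits),
    ]
    return format(interleave(cells, bits), f"0{3 * bits}b")


def common_prefix(a, b):
    """Return the longest shared prefix of two binary-string keys."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]
```

Under such an encoding, every key that starts with a given prefix lies inside one box in space and time, so a prefix can serve as the one-dimensional key stored in a key-value system while a range query is decomposed into a set of prefixes. Note that Z-order preserves locality only approximately: two nearby points on opposite sides of a cell boundary can still share a short prefix.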