Abstract-Big data storage technologies inherently entail high latency characteristics, preventing users from performing efficient ad-hoc querying and interactive visualization on large and distributed datasets. Most of the existing approaches addressing this issue thrive on de-normalization of the static data schema and creation of application specific (i.e. hardcoded) materialized views, which certainly reduce data access latency but at the expense of flexibility. In this regard, this paper proposes an approach that relies on an iterative process of data transformation intended to generate read-optimized data schemas. The transformation process is able to automatically identify optimization opportunities (e.g. materialized views, missing indexes), by analyzing the original data schema and the record of queries issued by users and client applications against the data set. An experimental evaluation of the proposed approach evidences a significant reduction in the query latency, ranging from 81.60% to 99.99%.
The information explosion the world has witnessed in the last two decades has forced businesses to adopt a data-driven culture for them to be competitive. These data-driven businesses have access to countless sources of information, and face the challenge of making sense of overwhelming amounts of data in a efficient and reliable manner, which implies the execution of readintensive operations. In the context of this challenge, a framework for the dynamic read-optimization of large dimensional datasets has been designed, and on top of it a workload-driven mechanism for automatic materialized view selection and creation has been developed. This paper presents an extensive description of this mechanism, along with a proof-of-concept implementation of it and its corresponding performance evaluation. Results show that the proposed mechanism is able to derive a limited but comprehensive set of views leading to a drop in query latency ranging from 80% to 99.99% at the expense of 13% of the disk space used by the base dataset. This way, the devised mechanism enables speeding up query execution by building materialized views that match the actual demand of query workloads.
Citizen engagement is one of the key factors for smart city initiatives to remain sustainable over time. This in turn entails providing citizens and other relevant stakeholders with the latest data and tools that enable them to derive insights that add value to their day-to-day life. The massive volume of data being constantly produced in these smart city environments makes satisfying this requirement particularly challenging. This paper introduces Explora, a generic framework for serving interactive low-latency requests, typical of visual exploratory applications on spatiotemporal data, which leverages the stream processing for deriving—on ingestion time—synopsis data structures that concisely capture the spatial and temporal trends and dynamics of the sensed variables and serve as compacted data sets to provide fast (approximate) answers to visual queries on smart city data. The experimental evaluation conducted on proof-of-concept implementations of Explora, based on traditional database and distributed data processing setups, accounts for a decrease of up to 2 orders of magnitude in query latency compared to queries running on the base raw data at the expense of less than 10% query accuracy and 30% data footprint. The implementation of the framework on real smart city data along with the obtained experimental results prove the feasibility of the proposed approach.
Abstract-Big Data technologies have traditionally operated in an offline setting, collecting large batches of information on clusters of commodity machines and performing complex and time-consuming computations over it. While frameworks following this approach served well for most applications involving big data analysis during the last decade, other use cases have recently emerged posing challenging requirements on latency and demanding real-time data processing, querying and visualization. That is the case for applications aiming at detecting threatening behaviors in social network platforms, where timely action is required to avoid adverse consequences. In this sense, more and more attention has been drawn towards online data processing systems claiming to address the limitations of batch-oriented frameworks. This paper reports a work in progress on distributed data processing for enabling low-latency querying over big data sets. Two software architectures are discussed for addressing the problem and an experimental evaluation is performed on a proof of concept implementation showing how an approach based on query pre-processing and stateful distributed stream computation can meet the requirements for supporting interactive querying on large and continuously generated data.
Abstract-The current evolution of the Web, known as Web 2.0 and characterized by providing a diverse global service ecosystem, has marked a change in the role played by telecom operators. In order to maintain high competitive market dynamism and generate new revenue sources, many operators seek to leverage the wide variety of existing Web services and integrate them with its infrastructure capabilities. Such integration leads to various challenges from a technological perspective, where the heterogeneity on networks and the need for highly qualified personnel for the development of these services are highlighted. This paper is propose the definition of a mechanism for integrating both Web and Telco services which facilitates and speeds the development of new services, considering the dynamic conditions of its execution.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.