Abstract-In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data tasks types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research.
Abstract-The Internet of Things (IoT) enables connected objects to capture, communicate, and collect information over the network through a multitude of sensors, setting the foundation for applications such as smart grids, smart cars, and smart cities. In this context, large scale analytics is needed to extract knowledge and value from the data produced by these sensors. The ability to perform analytics on these data, however, is highly limited by the difficulties of collecting labels. Indeed, the machine learning techniques used to perform analytics rely upon data labels to learn and to validate results. Historically, crowdsourcing platforms have been used to gather labels, yet they cannot be directly used in the IoT because of poor human readability of sensor data. To overcome these limitations, this paper proposes a framework for sensor data analytics which leverages the power of crowdsourcing through gamification to acquire sensor data labels. The framework uses gamification as a socially engaging vehicle and as a way to motivate users to participate in various labelling tasks. To demonstrate the framework proposed, a case study is also presented. Evaluation results show the framework can successfully translate gamification events into sensor data labels.
Amongst energy-related CO2 emissions, electricity is the largest single contributor, and with the proliferation of electric vehicles and other developments, energy use is expected to increase. Load forecasting is essential for combating these issues as it balances demand and production and contributes to energy management. Current state-of-the-art solutions such as recurrent neural networks (RNNs) and sequence-to-sequence algorithms (Seq2Seq) are highly accurate, but most studies examine them on a single data stream. On the other hand, in natural language processing (NLP), transformer architecture has become the dominant technique, outperforming RNN and Seq2Seq algorithms while also allowing parallelization. Consequently, this paper proposes a transformer-based architecture for load forecasting by modifying the NLP transformer workflow, adding N-space transformation, and designing a novel technique for handling contextual features. Moreover, in contrast to most load forecasting studies, we evaluate the proposed solution on different data streams under various forecasting horizons and input window lengths in order to ensure result reproducibility. Results show that the proposed approach successfully handles time series with contextual data and outperforms the state-of-the-art Seq2Seq models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.