Abstract-There is an increasingly pressing need, by several applications in diverse domains, for developing techniques able to index and mine very large collections of time series. Examples of such applications come from astronomy, biology, the web, and other domains. It is not unusual for these applications to involve numbers of time series in the order of hundreds of millions to billions. However, all relevant techniques that have been proposed in the literature so far have not considered any data collections much larger than onemillion time series. In this paper, we describe iSAX 2.0, a data structure designed for indexing and mining truly massive collections of time series. We show that the main bottleneck in mining such massive datasets is the time taken to build the index, and we thus introduce a novel bulk loading mechanism, the first of this kind specifically tailored to a time series index. We show how our method allows mining on datasets that would otherwise be completely untenable, including the first published experiments to index one billion time series, and experiments in mining massive data from domains as diverse as entomology, DNA and web-scale image collections.
Abstract-Data prediction is proposed in wireless sensor networks (WSNs) to extend the system lifetime by enabling the sink to determine the data sampled, within some accuracy bounds, with only minimal communication from source nodes. Several theoretical studies clearly demonstrate the tremendous potential of this approach, able to suppress the vast majority of data reports at the source nodes. Nevertheless, the techniques employed are relatively complex, and their feasibility on resource-scarce WSN devices is often not ascertained. More generally, the literature lacks reports from real-world deployments, quantifying the overall system-wide lifetime improvements determined by the interplay of data prediction with the underlying network. These two aspects, feasibility and system-wide gains, are key in determining the practical usefulness of data prediction in real-world WSN applications. In this paper, we describe Derivative-Based Prediction (DBP), a novel data prediction technique much simpler than those found in the literature. Evaluation with real data sets from diverse WSN deployments shows that DBP often performs better than the competition, with data suppression rates up to 99% and good prediction accuracy. However, experiments with a real WSN in a road tunnel show that, when the network stack is taken into consideration, DBP only triples lifetime-a remarkable result per se, but a far cry from the data suppression rates above. To fully achieve the energy savings enabled by data prediction, the data and network layers must be jointly optimized. In our testbed experiments, a simple tuning of the MAC and routing stack, taking into account the operation of DBP, yields a remarkable seven-fold lifetime improvement w.r.t. the mainstream periodic reporting.
Abstract-Model-driven data acquisition techniques aim at reducing the amount of data reported, and therefore the energy consumed, in wireless sensor networks (WSNs). At each node, a model predicts the sampled data; when the latter deviate from the current model, a new model is generated and sent to the data sink. However, experiences in real-world deployments have not been reported in the literature. Evaluation typically focuses solely on the quantity of data reports suppressed at source nodes: the interplay between data modeling and the underlying network protocols is not analyzed.In contrast, this paper investigates in practice whether i) model-driven data acquisition works in a real application; ii) the energy savings it enables in theory are still worthwhile once the network stack is taken into account. We do so in the concrete setting of a WSN-based system for adaptive lighting in road tunnels. Our novel modeling technique, Derivative-Based Prediction (DBP), suppresses up to 99% of the data reports, while meeting the error tolerance of our application. DBP is considerably simpler than competing techniques, yet performs better in our real setting. Experiments in both an indoor testbed and an operational road tunnel show also that, once the network stack is taken into consideration, DBP triples the WSN lifetime-a remarkable result per se, but a far cry from the aforementioned 99% data suppression. This suggests that, to fully exploit the energy savings enabled by data modeling techniques, a coordinated operation of the data and network layers is necessary.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.