Advances in pervasive computing and sensor technologies have paved the way for the explosive growth and ubiquity of geophysical data streams. Managing the massive, unbounded streams of sensor data poses several challenges, including the real-time application of summarization techniques that allow this volume of georeferenced and timestamped data to be stored and queried on a server with limited memory. To address this issue, we have designed a summarization technique, called SUMATRA, which segments the stream into windows, computes summaries window by window and stores these summaries in a database. Trend clusters are discovered as summaries of each window. They are clusters of georeferenced data which vary according to a similar trend along the window time horizon. Several compression techniques are also investigated to derive a compact but accurate representation of these trends for storage in the database. A learning strategy to automatically choose the best trend compression technique is designed. Finally, an in-network modality for tree-based trend cluster discovery is investigated in order to achieve an effective aggregation schema which drastically reduces the number of bytes transmitted across the network and prolongs the network lifespan. This schema is mapped onto the routing structure of a tree-based WSN topology. Experiments performed with several data streams from real sensor networks assess the summarization capability, accuracy and efficiency of the proposed summarization schema.
This article documents the addition of 153 microsatellite marker loci to the Molecular Ecology Resources Database. Loci were developed for the following species: Brassica oleracea, Brycon amazonicus, Dimorphandra wilsonii, Eupallasella percnurus, Helleborus foetidus, Ipomoea purpurea, Phrynops geoffroanus, Prochilodus argenteus, Pyura sp., Sylvia atricapilla, Teratosphaeria suttonii, Trialeurodes vaporariorum and Trypanosoma brucei. These loci were cross‐tested on the following species: Dimorphandra coccicinea, Dimorphandra cuprea, Dimorphandra gardneriana, Dimorphandra jorgei, Dimorphandra macrostachya, Dimorphandra mollis, Dimorphandra parviflora and Dimorphandra pennigera.
Emerging real-life applications, such as environmental compliance, ecological studies and meteorology, are characterized by real-time data acquisition through remote sensor networks. The most important aspect of the sensor readings is that they comprise a space dimension and a time dimension, both of which are information bearing. Additionally, they usually arrive at a rapid rate in a continuous, unbounded stream. Streaming prevents us from storing all readings and performing multiple scans of the entire data set. The drift of the data distribution poses the additional problem of mining patterns which may change over time. We address these challenges for trend cluster discovery, that is, the discovery of clusters of spatially close sensors which transmit readings whose temporal variation, called the trend polyline, is similar along the time horizon of a window. We present a stream framework which segments the stream into equally sized windows, computes online intra-window trend clusters and stores these trend clusters in a database. Trend clusters are queried offline at any time to determine trend clusters along larger windows (i.e. windows of windows). Experiments with several streams demonstrate the effectiveness of the proposed framework in discovering trend clusters that are accurate and relevant to human users.
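The definition of a trend cluster given above (spatially close sensors whose readings vary similarly along a window) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `trend_clusters`, the `radius` and `delta` thresholds, the use of connected components to form clusters, and the per-time-point mean as the trend polyline are all assumptions made for illustration.

```python
import math
from statistics import mean


def trend_clusters(positions, window, radius, delta):
    """Group sensors of one window into illustrative trend clusters.

    positions: {sensor_id: (x, y)}
    window:    {sensor_id: [reading at t1, ..., reading at tw]}
    radius:    hypothetical maximum distance between linked sensors
    delta:     hypothetical maximum per-time-point difference in readings
    Returns a list of (sorted member ids, trend polyline) pairs.
    """
    ids = sorted(window)

    def linked(a, b):
        # Two sensors are linked when they are close in space and
        # their readings stay within delta at every time point.
        close = math.dist(positions[a], positions[b]) <= radius
        similar = all(abs(u - v) <= delta
                      for u, v in zip(window[a], window[b]))
        return close and similar

    unvisited = set(ids)
    clusters = []
    for seed in ids:
        if seed not in unvisited:
            continue
        unvisited.remove(seed)
        members, frontier = [seed], [seed]
        # Grow the cluster as a connected component of the link graph.
        while frontier:
            current = frontier.pop()
            for other in sorted(unvisited):
                if linked(current, other):
                    unvisited.remove(other)
                    members.append(other)
                    frontier.append(other)
        # Assumed representation: the mean reading at each time point.
        polyline = [mean(vals)
                    for vals in zip(*(window[m] for m in members))]
        clusters.append((sorted(members), polyline))
    return clusters


positions = {"a": (0, 0), "b": (1, 0), "c": (10, 10)}
window = {"a": [1.0, 2.0, 3.0], "b": [1.5, 2.5, 3.5], "c": [1.0, 2.0, 3.0]}
result = trend_clusters(positions, window, radius=2.0, delta=0.5)
```

Because clusters are grown by chaining links, two members of the same cluster may differ by more than `delta`; a stricter variant would compare each candidate against the cluster's current polyline instead.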
Ubiquitous sensor stations continuously measure several geophysical fields over large zones and long (potentially unbounded) periods of time. However, observations can never cover every location or every time. In addition, because of its huge volume, the data produced cannot be entirely recorded for future analysis. In this scenario, interpolation, i.e., the estimation of unknown data at each location or time of interest, can be used to supplement station records. Although in GIScience there has been a tendency to treat space and time separately, integrating them could yield better results when interpolating geophysical fields. According to this idea, a spatiotemporal interpolation process, which accounts for both space and time, is described here. It operates in two phases. First, the exploration phase addresses the problem of interaction. This phase is performed online using data recorded from a network throughout a time window. The trend cluster discovery process determines prominent data trends and geographically aware station interactions in the window. The result of this process is given before a new data window is recorded. Second, the estimation phase uses the inverse distance weighting approach both to approximate observed data and to estimate missing data. The proposed technique has been evaluated using two large real climate sensor networks. The experiments empirically demonstrate that, in spite of a notable reduction in the volume of data, the technique guarantees accurate estimation of missing data.
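The inverse distance weighting step mentioned in the estimation phase can be sketched as follows. This is the textbook, purely spatial form of the estimator, not the spatiotemporal variant the abstract describes; the function name `idw_estimate` and the `power` parameter are assumptions for illustration.

```python
import math


def idw_estimate(known, target, power=2.0):
    """Estimate a value at `target` by inverse distance weighting.

    known:  list of ((x, y), value) pairs for stations with readings
    target: (x, y) location with no reading
    power:  assumed distance exponent (2 is a common default)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in known:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: return its reading.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one known station is required")
    return weighted_sum / weight_total


stations = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint = idw_estimate(stations, (1.0, 0.0))
```

Nearer stations dominate the estimate, and a larger `power` makes that dominance sharper.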
Two challenges in frequent pattern mining from data streams are the drift of the data distribution and computational efficiency. In this work an additional challenge is considered: data streams that describe complex objects modeled by multiple database relations. A multi-relational data mining algorithm is proposed to efficiently discover approximate relational frequent patterns over a sliding time window of a complex data stream. The effectiveness of the method is demonstrated in an application to an Internet packet stream.
We consider distributed computing environments where georeferenced sensors feed a single central server with numeric, one-dimensional data streams. Knowledge discovery from these geographically distributed data streams poses several challenges, including the need to summarize the data so that the streamed data can be stored on a central server with limited memory. We propose an enhanced segmentation algorithm which groups data sources in the same spatial cluster if they stream data which evolve along a similar trajectory over time. A trajectory is constructed by tracking only the data points which represent a change of trend in the associated spatial cluster. Clusters of trajectories are discovered on the fly and stored in the database. Experiments demonstrate the effectiveness and accuracy of our approach.
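The idea of building a trajectory from only the points where the trend changes can be illustrated with a minimal sketch. The abstract does not say how a change of trend is detected, so the criterion used here (a change in the sign of the difference between consecutive values) and the function name `trend_change_points` are assumptions.

```python
def trend_change_points(values):
    """Keep the first point, the last point and every point where the
    series changes direction (rising, falling or flat)."""
    if len(values) <= 2:
        return list(enumerate(values))

    def sign(x):
        return (x > 0) - (x < 0)

    kept = [(0, values[0])]
    previous = sign(values[1] - values[0])
    for i in range(1, len(values) - 1):
        current = sign(values[i + 1] - values[i])
        if current != previous:
            # The direction after point i differs from the one before it.
            kept.append((i, values[i]))
            previous = current
    kept.append((len(values) - 1, values[-1]))
    return kept


series = [1, 2, 3, 2, 1, 1, 4]
trajectory = trend_change_points(series)
```

In this example seven readings are reduced to five (index, value) pairs, and linear interpolation between the kept points reproduces the original series exactly because each dropped point lies on a straight run.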