Anna Ciampi scite author profile

Summarizing numeric spatial data streams by trend cluster discovery

Advances in pervasive computing and sensor technologies have made geophysical data streams ubiquitous. Managing the massive, unbounded streams of sensor data they produce poses several challenges, including the real-time application of summarization techniques that allow this volume of georeferenced and timestamped data to be stored and queried on a server with limited memory. To address this issue, we have designed a summarization technique, called SUMATRA, which segments the stream into windows, computes summaries window by window and stores these summaries in a database. Trend clusters are discovered as summaries of each window. They are clusters of georeferenced data which vary according to a similar trend along the window time horizon. Several compression techniques are also investigated to derive a compact but accurate representation of these trends for storage in the database. A learning strategy is designed to automatically choose the best trend compression technique. Finally, an in-network modality for tree-based trend cluster discovery is investigated in order to achieve an effective aggregation schema which drastically reduces the number of bytes transmitted across the network and extends the network lifespan. This schema is mapped onto the routing structure of a tree-based WSN topology. Experiments performed with several data streams from real sensor networks assess the summarization capability, accuracy and efficiency of the proposed summarization schema.

Permanent Genetic Resources added to Molecular Ecology Resources Database 1 October 2012–30 November 2012

Aksoy, Almeida-Val, Azevedo et al. (2013). Molecular Ecology Resources

This article documents the addition of 153 microsatellite marker loci to the Molecular Ecology Resources Database. Loci were developed for the following species: Brassica oleracea, Brycon amazonicus, Dimorphandra wilsonii, Eupallasella percnurus, Helleborus foetidus, Ipomoea purpurea, Phrynops geoffroanus, Prochilodus argenteus, Pyura sp., Sylvia atricapilla, Teratosphaeria suttonii, Trialeurodes vaporariorum and Trypanosoma brucei. These loci were cross‐tested on the following species: Dimorphandra coccicinea, Dimorphandra cuprea, Dimorphandra gardneriana, Dimorphandra jorgei, Dimorphandra macrostachya, Dimorphandra mollis, Dimorphandra parviflora and Dimorphandra pennigera.

Dealing with temporal and spatial correlations to classify outliers in geophysical data streams

Appice, Guccione, Malerba et al. (2014). Information Sciences

Online and Offline Trend Cluster Discovery in Spatially Distributed Data Streams

Ciampi, Appice, Malerba (2011)

Emerging real-life applications, such as environmental compliance, ecological studies and meteorology, are characterized by real-time data acquisition through remote sensor networks. The most important aspect of the sensor readings is that they comprise a space dimension and a time dimension, both of which are information bearing. Additionally, they usually arrive at a rapid rate in a continuous, unbounded stream. Streaming prevents us from storing all readings and performing multiple scans of the entire data set. The drift of data distribution poses the additional problem of mining patterns which may change over time. We address these challenges for trend cluster discovery, that is, the discovery of clusters of spatially close sensors which transmit readings whose temporal variation, called a trend polyline, is similar along the time horizon of a window. We present a stream framework which segments the stream into equally sized windows, computes intra-window trend clusters online and stores these trend clusters in a database. Trend clusters are queried offline at any time to determine trend clusters along larger windows (i.e. windows of windows). Experiments with several streams demonstrate the effectiveness of the proposed framework in discovering accurate and human-relevant trend clusters.

Using trend clusters for spatiotemporal interpolation of missing data in a sensor network

Appice, Ciampi, Malerba et al. (2013). JOSIS

Ubiquitous sensor stations continuously measure several geophysical fields over large zones and long (potentially unbounded) periods of time. However, observations can never cover every location or every time. In addition, due to its huge volume, the data produced cannot be entirely recorded for future analysis. In this scenario, interpolation, i.e., the estimation of unknown data at each location or time of interest, can be used to supplement station records. Although GIScience has tended to treat space and time separately, integrating them could yield better results when interpolating geophysical fields. Following this idea, a spatiotemporal interpolation process, which accounts for both space and time, is described here. It operates in two phases. First, the exploration phase addresses the problem of interaction. This phase is performed online using data recorded from a network throughout a time window. The trend cluster discovery process determines prominent data trends and geographically aware station interactions in the window. The result of this process is given before a new data window is recorded. Second, the estimation phase uses the inverse distance weighting approach both to approximate observed data and to estimate missing data. The proposed technique has been evaluated using two large real climate sensor networks. The experiments empirically demonstrate that, in spite of a notable reduction in the volume of data, the technique guarantees accurate estimation of missing data.
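The estimation phase mentioned in this abstract relies on inverse distance weighting (IDW). As a rough illustration only, the sketch below shows plain spatial IDW in Python; it is not the paper's spatiotemporal procedure and does not involve trend clusters. The function name `idw_estimate`, the `(x, y, value)` station layout and the `power` parameter are assumptions made for this example.

```python
import math

def idw_estimate(target, stations, power=2.0):
    """Estimate a value at `target` (x, y) from (x, y, value) station readings.

    Hypothetical helper: each station contributes with weight 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(target[0] - x, target[1] - y)
        if dist == 0.0:
            # The target coincides with a station, so its reading is used directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total

# Two made-up stations; the midpoint is equally distant from both.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw_estimate((1.0, 0.0), stations)
```

With equal distances the two weights are equal, so the estimate at the midpoint is the plain average of the two readings.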

A Sliding Window Algorithm for Relational Frequent Patterns Mining from Data Streams

Fumarola, Ciampi, Appice et al. (2009)

Two challenges in frequent pattern mining from data streams are the drift of data distribution and computational efficiency. This work considers an additional challenge: data streams that describe complex objects modeled by multiple database relations. A multi-relational data mining algorithm is proposed to efficiently discover approximate relational frequent patterns over a sliding time window of a complex data stream. The effectiveness of the method is demonstrated by applying it to an Internet packet stream.
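To make the sliding-window idea concrete, here is a deliberately simplified Python sketch. It counts single items exactly over the most recent transactions, whereas the paper targets approximate relational patterns over multiple relations, so this is not the authors' algorithm. The class name `SlidingWindowCounter`, its parameters and the packet attributes are all hypothetical.

```python
from collections import Counter, deque

class SlidingWindowCounter:
    """Exact per-item support over the last `window_size` transactions (illustrative only)."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size
        self.min_support = min_support  # fraction of the window, e.g. 0.5
        self.window = deque()
        self.counts = Counter()

    def add(self, transaction):
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.window_size:
            # Slide the window: forget the oldest transaction.
            expired = self.window.popleft()
            self.counts.subtract(expired)
            for item in expired:
                if self.counts[item] <= 0:
                    del self.counts[item]

    def frequent_items(self):
        threshold = self.min_support * len(self.window)
        return {item for item, n in self.counts.items() if n >= threshold}

# Made-up packet descriptions, each reduced to a set of attribute labels.
counter = SlidingWindowCounter(window_size=3, min_support=0.5)
for packet in [{"tcp", "port80"}, {"tcp", "port443"}, {"udp", "port53"}, {"tcp", "port80"}]:
    counter.add(packet)
frequent = counter.frequent_items()
```

After the fourth packet the first one has expired, so only `tcp` (2 of 3 transactions) meets the 50% threshold.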

Data Mining Techniques in Sensor Networks

Appice, Ciampi, Fumarola et al. (2014)

Summarization for Geographically Distributed Data Streams

Ciampi, Appice, Malerba (2010)

We consider distributed computing environments where geo-referenced sensors feed a single central server with numeric, uni-dimensional data streams. Knowledge discovery from these geographically distributed data streams poses several challenges, including the need for data summarization in order to store the streamed data in a central server with limited memory. We propose an enhanced segmentation algorithm which groups data sources in the same spatial cluster if they stream data which evolve along a close trajectory over time. A trajectory is constructed by tracking only data points which represent a change of trend in the associated spatial cluster. Clusters of trajectories are discovered on the fly and stored in the database. Experiments demonstrate the effectiveness and accuracy of our approach.
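The abstract does not define what counts as a "change of trend". One plausible reading, assumed here purely for illustration, is a point where a numeric series switches between rising and falling. The hypothetical function `trend_change_points` below keeps only those turning points plus the two endpoints of a single series; it is a sketch of the general idea, not the paper's segmentation algorithm.

```python
def trend_change_points(values):
    """Return (index, value) pairs where the series changes direction, plus both endpoints."""
    if len(values) <= 2:
        return list(enumerate(values))
    kept = [(0, values[0])]
    prev_sign = 0
    for i in range(1, len(values)):
        delta = values[i] - values[i - 1]
        sign = (delta > 0) - (delta < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                # The direction flipped, so the previous point is a turning point.
                kept.append((i - 1, values[i - 1]))
            prev_sign = sign
    kept.append((len(values) - 1, values[-1]))
    return kept

# Made-up readings: rise to 3, fall to 1, then rise again.
readings = [1, 2, 3, 2, 1, 4]
trajectory = trend_change_points(readings)
```

Here six readings are reduced to four points, at indices 0, 2, 4 and 5, which is the kind of reduction a summary needs when memory is limited.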

