2016 International Conference on Digital Economy (ICDEc) 2016
DOI: 10.1109/icdec.2016.7563142

Sampling algorithms in data stream environments

Abstract: Data streams are large data sets generated continuously and at a fast tempo. Their arrival rate is large compared to the treatment and storage capacities. Thus, these streams cannot be entirely stored. That is why we need to treat them in a single pass, without storing them exhaustively. However, for a particular stream, it is not always possible to predict in advance all of the processing to be performed. It is therefore necessary to save some of this data for future treatments.…

Cited by 16 publications (7 citation statements)
References 23 publications
“…• Windowing model: sampling techniques use windowing models and divide the traffic into successive windows to limit the number of packets to be analyzed. There are two main windowing models: fixed and sliding [23]. Using a fixed window, the window boundaries are absolute.…”
Section: Taxonomy Of Packets Sampling Policies (mentioning)
confidence: 99%
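
The distinction between the two windowing models in the statement above can be illustrated with a minimal Python sketch. It assumes time-based windows over `(timestamp, payload)` pairs; the function names, the `width` parameter, and the sample packets are hypothetical and do not come from the cited papers. The sliding-window behaviour shown (the window follows the most recent packet) is a common reading of the term, not a definition taken from the quoted text.

```python
from collections import deque

def fixed_windows(packets, width):
    """Group payloads into windows with absolute boundaries [k*width, (k+1)*width)."""
    windows = {}
    for ts, payload in packets:
        windows.setdefault(int(ts // width), []).append(payload)
    return windows

def sliding_windows(packets, width):
    """After each packet, yield the payloads with timestamp in (ts - width, ts]."""
    buf = deque()
    for ts, payload in packets:
        buf.append((ts, payload))
        # Drop packets that have fallen out of the window ending at ts.
        while buf[0][0] <= ts - width:
            buf.popleft()
        yield [p for _, p in buf]

# Hypothetical packet trace: (arrival time in seconds, payload label).
packets = [(0.5, "a"), (1.2, "b"), (1.9, "c"), (3.1, "d")]
fixed = fixed_windows(packets, 2)
sliding = list(sliding_windows(packets, 2))
```

With a fixed window the boundaries do not depend on the data, so a packet's window is determined by its timestamp alone; with the sliding variant the window contents are recomputed relative to each new arrival.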
“…For deterministic methods, there is no randomness in the composition of the sample: for example, selecting all the elements having even indexes. The choice of the appropriate sampling method depends, of course, on the application and the purpose of the sampling (See [18] for more details about sampling algorithms.). The effectiveness of a summary is measured in terms of the accuracy of the provided response, the memory space to store it, and the time to update it [31, 32].…”
Section: Related Work (mentioning)
confidence: 99%
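
As a small illustration of the deterministic example mentioned in the statement above (keeping every element with an even index), the following sketch works in a single pass over an iterable. The function name and the sample stream are hypothetical; this is only one possible deterministic rule, not the method of any cited paper.

```python
def even_index_sample(stream):
    """Keep the elements at even positions (0, 2, 4, ...) in one pass."""
    for index, item in enumerate(stream):
        if index % 2 == 0:
            yield item

# Hypothetical stream of readings.
sample = list(even_index_sample(["r0", "r1", "r2", "r3", "r4"]))
```

Because the rule depends only on the position of an element, running it twice on the same stream always produces the same sample.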
“…The idea is to use a variation change detection method to check, during a specific period denoted as a jumping window, the variation of the sensed data. A jumping window is a variation of the sliding window, where the offset between two successive windows is equal to the window size [18]. Thus, the SR will be adjusted according to the detected variation.…”
Section: Introduction (mentioning)
confidence: 99%
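
The jumping window described above (offset between successive windows equal to the window size) can be sketched as a special case of a count-based window generator. The `windows` helper, its `size` and `step` parameters, and the sample readings are assumptions made for illustration; incomplete trailing windows are simply dropped in this sketch.

```python
def windows(items, size, step):
    """Yield full windows of `size` items, starting a new window every `step` items."""
    for start in range(0, len(items) - size + 1, step):
        yield items[start:start + size]

# Hypothetical sensed values.
readings = [1, 2, 3, 4, 5, 6]
sliding = list(windows(readings, size=3, step=1))  # consecutive windows overlap
jumping = list(windows(readings, size=3, step=3))  # offset equals window size
```

With `step` equal to `size`, successive windows share no elements, which matches the jumping-window definition quoted in the statement.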
“…The initial sampling proportion depends on the number of items seen in the current sliding window and will be adjusted periodically. Reference [37] described the stratified method can generate more representative samples with more accurate results. For online stream processing, [37] proposed two challenges: choosing the size of samples inside each stratum and the number of strata is difficult since the knowledge of data is unknown.…”
Section: Related Work (mentioning)
confidence: 99%