2020
DOI: 10.1109/access.2020.2988120

Sampling for Big Data Profiling: A Survey

Abstract: Due to the development of internet technology and computer science, data is exploding at an exponential rate. Big data brings us new opportunities and challenges. On the one hand, we can analyze and mine big data to discover hidden information and get more potential value. On the other hand, the 5V characteristic of big data, especially Volume which means large amount of data, brings challenges to storage and processing. For some traditional data mining algorithms, machine learning algorithms and data profilin…

Cited by 26 publications (12 citation statements)
References 103 publications
“…From this, we conclude that the number of machines and their properties are not sufficient to achieve better execution time, but rather it requires more efficient methods and strategies. This is what (Liu and Zhang, 2020) confirmed in his proposed work. From results obtained by Pandey and Shukla (2019), our proposed work remains better in terms of increase in precision, as well as in execution time, although in their experiments they used small dataset.…”
Section: Experiments Results and Discussion
citation type: supporting
confidence: 84%
“…Among the works presented in this survey (Approximate cluster computing for big data analysis) are somewhat similar to our proposed work. Thanks to a survey realized by Liu and Zhang (2020) they confirmed us again that sampling methods reduce the volume of big data more effectively and help to speed up its processing. So, it plays an important role in the era of big data, now and in the future.…”
Section: The Final Version Of Our Proposed Work
citation type: mentioning
confidence: 89%
“…Sampling techniques other than random sampling provide view point to focus on different characteristics of data without compromising quality in data exploration. Liu and Zhang (2020) presented a survey based on sampling and profiling over big data. Authors advocated that sampling technologies play an important role in the era of big data and it is indispensable step in big data processing in future.…”
Section: A Review
citation type: mentioning
confidence: 99%
“…Data mining is the process of extracting information and data from large databases or information repositories. The work became very interesting, attracted too many researchers and developers, and made good progress for several years [1]. Since the 1990s, the concept of data mining has been generally seen as a "mining" process.…”
Section: Introduction
citation type: mentioning
confidence: 99%