2019
DOI: 10.1186/s40537-019-0236-x

Big data clustering with varied density based on MapReduce

Abstract: With the recent growth and advancement of information technology, data is being produced at a very high rate in a variety of fields and is presented to users in structured, semi-structured, and unstructured forms [1]. New technologies are needed for storing this volume of data (big data) and extracting useful information from it, because discovering and extracting useful information and knowledge from such a volume is difficult; hence, traditional relational databases cannot meet the needs of us…
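Judging from the title and the citing work quoted further below, the paper's method (MR-VDBSCAN) applies density-based clustering in a MapReduce setting so that different regions of the data can use different density thresholds. The following is a minimal, self-contained sketch of that general idea only, not the paper's algorithm: the partitioning scheme, the per-partition eps heuristic, and the merge rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def local_eps(points, k=3):
    # Assumed heuristic: median of each point's distance to its k-th
    # nearest neighbour within the partition.
    kth = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if len(dists) >= k:
            kth.append(dists[k - 1])
    kth.sort()
    return kth[len(kth) // 2] if kth else float("inf")

def dbscan(points, eps, min_pts=3):
    """Textbook DBSCAN; returns a cluster id per point, -1 for noise."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster_id += 1
        labels[i] = cluster_id
        frontier = [j for j in neighbours if j != i]
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:        # noise reached from a core point: border
                labels[j] = cluster_id
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # j is a core point: keep expanding
                frontier.extend(j_neighbours)
    return labels

def map_phase(partition_id, points):
    # Map step: cluster one partition with its own locally estimated eps,
    # so partitions of different density use different thresholds
    # (the "varied density" idea).
    eps = local_eps(points)
    labels = dbscan(points, eps)
    return [((partition_id, labels[i]), pt) for i, pt in enumerate(points)]

def reduce_phase(mapped):
    # Reduce step (illustrative): group points by (partition, local cluster id).
    # A full implementation would also merge clusters that touch across
    # partition borders.
    clusters = defaultdict(list)
    for key, pt in mapped:
        clusters[key].append(pt)
    return dict(clusters)

if __name__ == "__main__":
    # Two toy partitions with very different densities.
    dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
    sparse = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0), (30.0, 30.0)]
    mapped = map_phase(0, dense) + map_phase(1, sparse)
    for key, pts in sorted(reduce_phase(mapped).items()):
        print(key, pts)
```

The reason for estimating eps per partition is that a single global threshold either over-merges dense regions or fragments sparse ones; letting each mapper pick its own threshold is what makes varied density tractable.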

Cited by 34 publications (14 citation statements) · References 27 publications
“…It aims to analyze pre-processed stored data in order to find correlations, identify patterns, and create actionable insights. There are mainly four categories through which Big Data analysis can be designed and conducted: prescriptive, predictive, diagnostic, and descriptive [32, 36, 52–56]. Next, we describe each of these categories:…”
Section: Discussion
confidence: 99%
“…To estimate the runtime of a job in Hadoop MapReduce, we first investigated the anatomy of a Hadoop job and the precise stages of running a job [1–5]. Since Hadoop repeatedly runs the same applications on the same data type [17], we use the profiling method, meaning that there is a separate table in the database for each application.…”
Section: Methods
confidence: 99%
“…Therefore, the time of each stage must be calculated and summed. Since Hadoop runs on a distributed system, many factors and parameters affect T_map and T_reduce [1–5, 22]. So, we investigate the parameters with the most impact on runtime.…”
Section: Estimating Runtime For The First Run
confidence: 99%
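The two statements above outline the citing paper's runtime-estimation approach: keep a profiling table of observed stage times per application, then estimate a job's total runtime as the sum of its map and reduce stage times (T_map + T_reduce). Below is a minimal sketch of that bookkeeping; the table contents, application name, and function name are hypothetical.

```python
from statistics import mean

# Assumed profile store: application name -> list of (t_map, t_reduce)
# stage times, in seconds, observed in earlier runs on the same data type.
profiles = {
    "wordcount": [(42.0, 11.0), (40.5, 12.2), (43.1, 10.8)],
}

def estimate_runtime(app: str) -> float:
    """Estimate T_total = T_map + T_reduce from the app's profiled history."""
    runs = profiles[app]
    t_map = mean(t for t, _ in runs)        # average observed map-stage time
    t_reduce = mean(t for _, t in runs)     # average observed reduce-stage time
    return t_map + t_reduce

print(f"estimated runtime: {estimate_runtime('wordcount'):.1f} s")
```

A real estimator would also condition on the parameters the quote mentions (input size, cluster configuration, and so on) rather than a plain average, but the profile-then-sum structure is the same.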
“…These models can be applied to evaluate several methods in MapReduce. Heidari et al. [26] discussed clustering with variable density based on huge data. In this method, they presented MR-VDBSCAN.…”
Section: Related Work
confidence: 99%