2016
DOI: 10.3991/ijes.v4i1.5350

Big Data Mining: Tools & Algorithms

Abstract: We are now in the Big Data era, and there is a growing demand for tools that can process and analyze big data. Big data analytics deals with extracting valuable information from complex data that traditional data mining tools cannot handle. This paper surveys the available tools that can handle large volumes of data as well as evolving data streams. It also summarizes the data mining tools and algorithms that can handle big data, and one of the tools has been used for mining of large datasets using…

Cited by 8 publications
(4 citation statements)
References 0 publications
“…The challenges related to big data provide a chance to understand the data patterns and helps in prediction of events and results. Hence, there is a growing demand for tools which can process and analyze big data [7]. In this regards, twitter produces humungous amount of data in a daily basis.…”
Section: Discussion (mentioning)
confidence: 99%
“…It is an open-source software library written in JAVA programming language allows distributed processing of large amount of data and parallel processing of large datasets on cluster of nodes. Hadoop includes four main modules: (i) Hadoop Common which contains utilities used by other modules; (ii) Hadoop Distributed File System (HDFS) which provides storage capabilities by breaking large files into blocks and storing them in different nodes across a cluster; (iii) Hadoop Map-Reduce to process the large dataset in parallel by each map task work on a part of data input (the final output is processed further in the reduce phase); and (iv) Hadoop YARN which is a resource negotiator for scheduling cluster resources [6], [7].…”
Section: Data Ingestion Phase (mentioning)
confidence: 99%
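The map and reduce phases described in the statement above can be illustrated with a small single-process sketch. This is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input splits are hypothetical, chosen only to show how each map task works on one part of the input and how the reduce phase combines the intermediate results.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map task: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group intermediate values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce task: combine all values emitted for one key."""
    return key, sum(values)


def word_count(chunks):
    """Run map over every split, group by key, then reduce each group."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input splits; in a real cluster these would be blocks of a large file.
splits = ["big data tools", "data mining tools", "big data streams"]
counts = word_count(splits)
```

In a real deployment the splits would be processed by separate map tasks on different nodes, and the grouping step would be handled by the framework rather than by an in-memory dictionary.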
“…Data mining is the analysis of large databases to discover patterns and produce new information [13]. Data mining models include the unsupervised model and the supervised model [14]. The techniques used in data mining include clustering, classification, and regression [15].…”
Section: Introduction (unclassified)
“…After its emergence in the early 1990s [2], business intelligence was the subject of a series of researches that dealt with various issues related to: data warehousing, the integration of heterogeneous data [3], data mining techniques [4], On-line Analytical Processing (OLAP), etc. Moreover, particular attention was paid to the issue of data warehouse design and the automation of this step given its complexity and importance for all other steps of the decision chain.…”
Section: Introduction (mentioning)
confidence: 99%