2014
DOI: 10.5120/18960-0288
Big Data Analytics using Hadoop

Abstract: This paper is an effort to present a basic understanding of what BIG DATA is and its usefulness to an organization from the performance perspective. Along with the introduction of BIG DATA, the important parameters and attributes that make this emerging concept attractive to organizations are highlighted. The paper also evaluates the differences in the challenges faced by a small organization as compared to a medium- or large-scale operation, and therefore the differences in their approach and treatment of BIG…

Cited by 13 publications (5 citation statements)
References 19 publications
“…Hadoop – high-availability distributed object-oriented platform – is a group of classes written in Java, open-sourced by the Apache Foundation to meet the needs of big data. It contains four basic components (Alam and Ahmed, 2014; Dhyani and Barthwal, 2014). The first, the Hadoop Distributed File System (HDFS), is a distributed, expandable and portable file system inspired by the Google File System (GFS).…”
Section: Related Work
confidence: 99%
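The citation above names Hadoop's core components without showing how its processing model works. Below is a minimal, single-process sketch of the MapReduce pattern that Hadoop distributes across a cluster; Hadoop itself runs mappers and reducers as Java classes over HDFS blocks, whereas here both phases are plain Python functions so the control flow is visible. The function names are illustrative, not Hadoop's API.

```python
from collections import defaultdict

def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word, as Hadoop's Mapper does."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Reducer: sum the counts for each key, as happens after the shuffle/sort step."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data needs big storage", "hadoop stores big data"]
print(reduce_phase(map_phase(lines)))  # word counts over both lines
```

In a real Hadoop job, the input lines would be read from HDFS blocks and the map and reduce phases would run in parallel on different nodes, with the framework handling the shuffle between them.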
“…With the rapid development and application of the Internet of Things, cloud computing and big data technology, the access control scheme of a cloud computing platform in a big data environment must be highly scalable, flexible and efficient. However, the access control currently adopted by traditional big data platforms such as Hadoop [1][2][3] is based on static policies specified per user/user group; it cannot grant permissions according to multiple attribute tags of users, let alone change permissions dynamically as users' attributes change, which makes it suitable only for rights management of a small number of users. Data in a big data environment is large and dynamic, so the access control list is large and difficult to maintain, over-authorization and under-authorization become increasingly serious, and rights management is complex and difficult [4].…”
Section: Introduction
confidence: 99%
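The contrast the citing authors draw can be made concrete with a hypothetical sketch: a static user/group ACL fixes permissions up front, while an attribute-based check is re-evaluated against the user's current attributes, so access changes when the attributes do. All names here (`User`, `static_can_read`, the policy rule) are illustrative, not any real platform's API.

```python
from dataclasses import dataclass, field

# Static model: permissions enumerated per user/user group in advance.
STATIC_ACL = {"dataset1": {"alice", "analysts"}}

def static_can_read(user, groups, resource):
    allowed = STATIC_ACL.get(resource, set())
    return user in allowed or bool(allowed & groups)

# Attribute-based model: a policy evaluated over current user attributes.
@dataclass
class User:
    name: str
    attributes: dict = field(default_factory=dict)

def abac_can_read(user, resource):
    # Illustrative policy: clearance >= 2 AND department == "research".
    a = user.attributes
    return a.get("clearance", 0) >= 2 and a.get("department") == "research"

u = User("bob", {"clearance": 3, "department": "research"})
print(abac_can_read(u, "dataset1"))   # access follows current attributes
u.attributes["clearance"] = 1         # an attribute change revokes access
print(abac_can_read(u, "dataset1"))   # no ACL edit was needed
```

This is the maintenance difference the passage describes: in the static model every change requires editing the ACL itself, while the attribute-based model re-derives permissions from attributes at check time.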
“…These tools run interactively or via a batch job. Similar to Hadoop, all of this software is available through the module environments package [54,61]. Since the investigation was for user interactivity and usability, the following was elaborated on:…”
Section: Implementation of Apache Spark Technology System
confidence: 99%
“…Its speed comes largely from its judicious use of memory to cache data: Spark's main transformation operators, and the actions/filters derived from them, are applied to an immutable RDD [32][33]. Each transformation produces an RDD that can be cached in memory and/or persisted to disk, depending on the user's choice [32,61]. In Spark, a transformation is a lazy operator; a directed acyclic graph (DAG) of the RDD transformations is built and optimized, and only executed when an action is applied [32].…”
Section: Implementation of Apache Spark Technology System
confidence: 99%
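The lazy-evaluation behaviour described in the quote can be sketched without Spark itself: each transformation merely records a step in a plan (a linear stand-in for Spark's DAG), and nothing runs until an action such as `collect()` is called. The class and method names are illustrative, not Spark's API, though `map`, `filter` and `collect` mirror its RDD operations.

```python
class LazyDataset:
    """A toy stand-in for an RDD: transformations are recorded, not run."""

    def __init__(self, data, plan=None):
        self._data = data
        self._plan = plan or []  # the recorded, not-yet-executed steps

    def map(self, fn):           # transformation: lazy, returns a new dataset
        return LazyDataset(self._data, self._plan + [("map", fn)])

    def filter(self, pred):      # transformation: lazy, returns a new dataset
        return LazyDataset(self._data, self._plan + [("filter", pred)])

    def collect(self):           # action: executes the whole recorded plan
        out = list(self._data)
        for kind, fn in self._plan:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

ds = LazyDataset([1, 2, 3, 4]).map(lambda x: x * x).filter(lambda x: x > 4)
print(ds.collect())  # the plan runs only now, when the action is applied
```

Because each transformation returns a new immutable dataset and only the action triggers execution, an engine like Spark is free to inspect and optimize the whole plan (e.g. fuse the map and filter into one pass) before running it.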