2018
DOI: 10.1016/j.inffus.2017.10.001
Big Data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce

Cited by 136 publications (62 citation statements)
References 40 publications
“…Then, a Reduce node (or several Reduce nodes, depending on the application) combines the outputs produced by each Map task. Therefore, Big Data fusion can be conceived as a means to distribute the complexity of learning a ML model over a pool of Worker nodes, wherein the strategy to design how information/models are fused together between the Map and the Reduce tasks is what defines the quality of the finally generated outcome [413].…”
Section: Emerging Data Fusion Approaches
confidence: 99%
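The fusion scheme described above can be sketched in a few lines: each Map task learns a partial model on its own data partition, and a Reduce step fuses the partial models. This is a minimal illustration only; the function names, the trivial 1-D linear model, and the unweighted-average fusion strategy are assumptions, not the paper's actual method.

```python
# Sketch of MapReduce-style model fusion (hypothetical helper names):
# each Map task fits a partial model on its data chunk, and a single
# Reduce step fuses the partial models by averaging their coefficients.

def map_task(chunk):
    """Fit a trivial 1-D linear model y = w*x on one data partition."""
    num = sum(x * y for x, y in chunk)
    den = sum(x * x for x, _ in chunk)
    return num / den  # partial coefficient learned from this chunk

def reduce_task(partial_ws):
    """Fuse the partial models; here, a simple unweighted average."""
    return sum(partial_ws) / len(partial_ws)

# Three Worker nodes, each holding its own partition of (x, y) pairs.
partitions = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
    [(5.0, 10.0)],
]
partials = [map_task(p) for p in partitions]
fused_w = reduce_task(partials)
print(fused_w)  # each chunk learns w = 2.0, so the fused model is 2.0
```

As the quoted statement notes, it is precisely the choice of `reduce_task` (averaging, voting, stacking, ...) that determines the quality of the final model.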
“…3. Machine learning module: Scalability is a requirement for the machine learning module. This requirement is met using the machine learning library Mahout [103], thus harnessing the cluster's high computational power to achieve optimized results. It is worth noting that Mahout is built on top of Hadoop, and its core classification and clustering algorithms run as MapReduce jobs.…”
Section: Peer-to-peer Botnet Detection
confidence: 99%
“…Nevertheless, the development of more efficient approaches is necessary because of the huge amounts of data generated every day. Nowadays, the most popular paradigm for dealing with huge amounts of data is MapReduce [4,28,29]. It is based on the divide-and-conquer programming paradigm and allows easy parallel execution across several machines.…”
Section: Big Data In Emerging Pattern Mining
confidence: 99%