2018
DOI: 10.1016/j.inffus.2017.10.001
Big Data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce

Cited by 136 publications (62 citation statements)
References 40 publications
“…Then, a Reduce node (or several Reduce nodes, depending on the application) combines the outputs produced by each Map task. Therefore, Big Data fusion can be conceived as a means to distribute the complexity of learning a ML model over a pool of Worker nodes, wherein the strategy to design how information/models are fused together between the Map and the Reduce tasks is what defines the quality of the finally generated outcome [413].…”
Section: Emerging Data Fusion Approaches
confidence: 99%
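The fusion scheme described above can be sketched in a few lines: each Map task learns a partial model on its own data partition, and a Reduce step fuses the partial models. This is a minimal illustration only; the function names, the trivial 1-D linear model, and the unweighted-average fusion strategy are assumptions, not the paper's actual method.

```python
# Sketch of MapReduce-style model fusion (hypothetical helper names):
# each Map task fits a partial model on its data chunk, and a single
# Reduce step fuses the partial models by averaging their coefficients.

def map_task(chunk):
    """Fit a trivial 1-D linear model y = w*x on one data partition."""
    num = sum(x * y for x, y in chunk)
    den = sum(x * x for x, _ in chunk)
    return num / den  # partial coefficient learned from this chunk

def reduce_task(partial_ws):
    """Fuse the partial models; here, a simple unweighted average."""
    return sum(partial_ws) / len(partial_ws)

# Three Worker nodes, each holding its own partition of (x, y) pairs.
partitions = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
    [(5.0, 10.0)],
]
partials = [map_task(p) for p in partitions]
fused_w = reduce_task(partials)
print(fused_w)  # each chunk learns w = 2.0, so the fused model is 2.0
```

As the quoted statement notes, it is precisely the choice of `reduce_task` (averaging, voting, stacking, ...) that determines the quality of the final model.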
“…3. Machine learning module: Scalability is a requirement for the machine learning module. This requirement is met using the machine learning library Mahout [103], thus harnessing the cluster's high computational power to achieve optimized results. It is worth noting that Mahout is built on top of Hadoop, and its core classification and clustering algorithms run as MapReduce jobs.…”
Section: Peer-to-peer Botnet Detection
confidence: 99%
“…Nevertheless, the development of more efficient approaches is necessary because of the huge amounts of data generated every day. Nowadays, the most popular paradigm for dealing with huge amounts of data is MapReduce [4,28,29]. It is based on the divide-and-conquer programming paradigm and allows easy parallel execution across several machines.…”
Section: Big Data In Emerging Pattern Mining
confidence: 99%