2017 IEEE International Conference on Big Data (Big Data) 2017
DOI: 10.1109/bigdata.2017.8257930

Low-latency multi-threaded ensemble learning for dynamic big data streams

Abstract: Real-time mining of evolving data streams involves new challenges when targeting today's application domains such as the Internet of Things: increasing volume, velocity and volatility require data to be processed on-the-fly with fast reaction and adaptation to changes. This paper presents a high-performance, scalable design for decision trees and ensemble combinations that makes use of the vector SIMD and multicore capabilities available in modern processors to provide the required throughput and a…

Cited by 10 publications (8 citation statements)
References 12 publications
“…Another approach for distributed systems is the Streaming Parallel Decision Tree algorithm (SPDT) [2]. Marrón et al [33] propose a hardware approach to improve Hoeffding trees, by parallelizing the execution of random forests of Hoeffding trees and creating specific hardware configurations. Another streaming algorithm that was improved in terms of energy efficiency was the KNN version with self-adjusting memory [30].…”
Section: Related Work (mentioning)
confidence: 99%
“…Among the works that explored multi-core parallelism, distributed or not, we can further subdivide it into batch [9,18,21,22,25,44] or data stream [20,30,31,37] methods. Many works with various ensemble methods used the Message Passing Interface (MPI) standard, such as for ensembles of improved and faster Support Vector Machine (SVM) [18], bagging decision rule ensembles [20] and regression ensembles [31].…”
Section: Related Work (mentioning)
confidence: 99%
“…In [37], an ensemble of J48 is parallelized for grid platforms using Java. In [30], a low-latency Hoeffding Tree (HT) is implemented in C++ and used in RFs. In general, the related works mentioned so far differ from the present work in two main aspects: focusing on the implementation and performance aspects of specific ensemble methods or batch approaches (i.e., they do not focus on stream processing).…”
Section: Related Work (mentioning)
confidence: 99%
“…There have been extensions to these algorithms for distributed systems, such as the Vertical Hoeffding Tree [20], where the authors parallelize the induction of Hoeffding Trees; and the Streaming Parallel Decision Tree algorithm (SPDT). More focused on hardware approaches to improve Hoeffding trees is the work proposed by [21], where they parallelize the execution of random forest of Hoeffding trees, together with a specific hardware configuration to improve induction of Hoeffding trees. Other work has been done where the authors present the energy hotspots of the VFDT [3].…”
Section: B. Related Work (mentioning)
confidence: 99%