2014 IEEE Symposium on Computational Intelligence in Big Data (CIBD) 2014
DOI: 10.1109/cibd.2014.7011537

A scalable machine learning online service for big data real-time analysis

Abstract: This work describes a proposal for developing and testing a scalable machine learning architecture able to provide real-time predictions or analytics as a service over domain-independent big data, working on top of the Hadoop ecosystem and providing real-time analytics as a service through a RESTful API. Systems implementing this architecture could provide companies with on-demand tools facilitating the tasks of storing, analyzing, understanding and reacting to their data, either in batch or stream fa…

Cited by 44 publications (44 citation statements)
References 47 publications
“…Representative streaming processing systems include Borealis [98], S4 [99], Kafka [100], and many other recent architectures proposed to provide real-time analytics over big data [101,102]. A scalable machine learning online service with the power of streaming processing for big data real-time analysis is introduced in [103]. In addition, the professor G. B. Giannakis have paid more attention to the real-time processing of streaming data by using machine learning techniques in recent studies; more details can be referred to in [87,104].…”
Section: Possible Remedies (mentioning)
confidence: 99%
“…Finally, while it is outside the scope of this evaluation, the time required by the system to provide a prediction is shown in previous work from Baldominos et al [4]. The experiments were performed in a single-node cluster with 8 Intel Xeon processing cores and 16GB of RAM virtualized over VMWare ESXi 5.0, and Hortonworks HDP 2.1 as the Hadoop distribution, which includes Hadoop 2.4 and HBase 0.98; and JBoss AS 7 as application server.…”
Section: Discussion (mentioning)
confidence: 99%
“…In particular, the Hadoop ecosystem is used following the architecture for big data real-time analysis described in Baldominos et al [4]. The general framework supporting online scalable machine learning over Big Data is shown in Fig.…”
Section: B Big Data Technology (mentioning)
confidence: 99%
“…Because of the Big Data volume and fast speed, we have used a Big Data architecture based on the one proposed in Baldominos et al [1], but updating the tools to use Apache Spark for the sake of efficiency. Also, a pilot has been conducted to evaluate the performance of the proposed system.…”
Section: State Of the Art (mentioning)
confidence: 99%