2020
DOI: 10.1002/cpe.6164

Big data and machine learning framework for clouds and its usage for text classification

Abstract: Reference architectures for big data and machine learning include not only interconnected building blocks but important considerations (among others) for scalability, manageability and usability issues as well. Leveraging on such reference architectures, the automated deployment of distributed toolsets and frameworks on various clouds is still challenging due to the diversity of technologies and protocols. The paper focuses particularly on the widespread Apache Spark cluster with Jupyter as the particularly ad…

Citations: cited by 11 publications (10 citation statements)
References: 23 publications
“…Zhao et al [14] suggested that big data has great potential in improving policy description and strengthening policy prediction ability. Pintye et al [15] studied the relevant policy framework of big data and proposed that the whole social governance, policies, and aspects closely related to big data should be considered as a complete system, involving user privacy, data accuracy, data collection methods, and social equity.…”
Section: Related Work (mentioning)
confidence: 99%
“…Area under the curve (AUC), Matthews correlation coefficient, F1‐score, and precision‐recall curve are used for performance evaluation. AI methods are categorized into machine learning (ML) 13‐16,25,26 and deep learning algorithms 4,10,11,17,18 . ML methods are impressive, though, most of the methods involve manual FE due to the limited ability to manage a large set of features.…”
Section: Related Work (mentioning)
confidence: 99%
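
Two of the metrics named in the statement above, the F1-score and the Matthews correlation coefficient, can be computed directly from confusion-matrix counts. The following is a minimal sketch with made-up binary labels, not code from the cited or citing paper; the function names `confusion_counts`, `f1_score`, and `mcc` are hypothetical and chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Illustrative labels only (assumed data, not taken from any paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f1_score(tp, fp, fn))  # 0.75
print(mcc(tp, tn, fp, fn))   # 0.5
```

Unlike the F1-score, the Matthews correlation coefficient also uses the true-negative count, which is one reason it is often reported alongside F1 on imbalanced data.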
“…The configuration of Spark is adjusted to control the level of parallelism applied to the data. This text analysis application was successfully handled by using the Spark-based reference architecture deployed on the ELKH Cloud, and the scientific findings have been already publicly released [31].…”
Section: Validation By the Hungarian Comparative Agendas Project (mentioning)
confidence: 99%