2023
DOI: 10.1007/s11042-023-17330-5

Analyzing distributed Spark MLlib regression algorithms for accuracy, execution efficiency and scalability using best subset selection approach

Piyush Sewal, Hari Singh

Cited by 2 publications (1 citation statement)
References 45 publications
“…The Apache Spark framework having the in-memory computational capability, overcame this limitation with a special type of data structure known as Resilient Distributed Datasets (RDDs) that supports reusability and is capable to store intermediate results in the physical memory of the system. A critical analysis of Hadoop and Spark [6], along with the high accuracy, scalability, and execution efficiency of distributed Spark MLib regression algorithms [7], as well as the performance prediction of Spark workloads using I/O parameters [8], covered in prior studies, sheds light on distributed processing frameworks.…”
Citation type: mentioning (confidence: 99%)