2021
DOI: 10.14778/3476249.3476262

Doing more with less

Abstract: Automated machine learning (AutoML) promises to democratize machine learning by automatically generating machine learning pipelines with little to no user intervention. Typically, a search procedure is used to repeatedly generate and validate candidate pipelines, maximizing a predictive performance metric, subject to a limited execution time budget. While this approach to generating candidates works well for small tabular datasets, the same procedure does not directly scale to larger tabular datasets with 100,…
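The search loop described in the abstract (generate a candidate, validate it, keep the best, stop when the time budget runs out) can be sketched in a few lines. This is only an illustrative toy, not the paper's method: the abstract is truncated, so the sketch borrows the idea of evaluating on a random sample from the citation statements listed further down. The names `downsample`, `budgeted_search`, and `accuracy`, the threshold "pipelines", and the synthetic data are all assumptions made for the example.

```python
import random
import time


def downsample(rows, fraction, seed=0):
    """Return a uniform random subset holding about `fraction` of the rows."""
    rng = random.Random(seed)
    k = max(1, int(len(rows) * fraction))
    return rng.sample(rows, k)


def budgeted_search(candidates, evaluate, budget_seconds):
    """Score candidates in order until the time budget is spent.

    Returns (best_score, best_candidate), or None if nothing was evaluated.
    """
    deadline = time.monotonic() + budget_seconds
    best = None
    for candidate in candidates:
        if time.monotonic() >= deadline:
            break  # budget exhausted: keep the best result found so far
        score = evaluate(candidate)
        if best is None or score > best[0]:
            best = (score, candidate)
    return best


def accuracy(threshold, rows):
    """Accuracy of the toy 'pipeline' that predicts True when x > threshold."""
    return sum((x > threshold) == label for x, label in rows) / len(rows)


# Hypothetical data: one feature x in [0, 1), label is True when x > 0.6.
rng = random.Random(42)
data = [(x, x > 0.6) for x in (rng.random() for _ in range(10_000))]

# Validate each candidate on a 10% random sample instead of the full data.
sample = downsample(data, fraction=0.1)
thresholds = [i / 10 for i in range(11)]  # hypothetical candidate "pipelines"

best_score, best_threshold = budgeted_search(
    thresholds, lambda t: accuracy(t, sample), budget_seconds=5.0
)
print(best_score, best_threshold)
```

Scoring on the sample makes each evaluation cheaper, so more candidates fit in the same budget. The trade-off is that a score measured on a sample is a noisier estimate of performance on the full data.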

Cited by 8 publications (4 citation statements)
References 56 publications
“…We have indicated in Section 6 that the performance bottleneck of most cases is "Train" and "Prep". To reduce the evaluation cost and explore more pipelines, simple approximation with a random sample [85] instead of full data can be leveraged. However, the influence function [43] and SHAP [54] [46,50,66,78].…”
Section: Research Opportunities (mentioning)
confidence: 99%
“…OpenML [27] is a popular AutoML benchmark which utilizes 39 public datasets to evaluate classification performance with different time slots and different metrics. Zogaj et al [85] conduct an extensive empirical study to investigate the impact of downsampling on AutoML results. Several benchmarks are published in the NAS area called NAS-Bench 101 [81], 201 [22] and 301 [68].…”
Section: Research Opportunities (mentioning)
confidence: 99%
“… 15,16 Another driving force for Auto ML has been the rise of data democratization, the ongoing process of enabling all individuals, irrespective of technical expertise, to work confidently and comfortably with data and to use it more efficiently and productively. 17,18 …”
Section: Introduction (mentioning)
confidence: 99%
“…This significantly reduces the required expertise and effort to build ML models. 17,18 Automation also has the potential to reduce human error and bias, reinforce the replicability of the analyses, and promote collaboration between clinicians and data scientists. Auto ML has been previously used to develop ML models in medical imaging, disease diagnosis, and EHR data analysis.…”
Section: Introduction (mentioning)
confidence: 99%