2023
DOI: 10.5705/ss.202020.0303

Model Checking in Large-Scale Dataset via Structure-Adaptive-Sampling

Abstract: Lack-of-fit testing is often essential in many applications of statistical/machine learning. Despite the availability of large-scale datasets, the challenges associated with model checking when some resource budgets are limited are not yet well addressed. In this paper, we propose a design-adaptive testing procedure to check a general model when only a limited number of data observations are available. We derive an optimal sampling strategy to select a small subset from a large pool of data, Structure-Adaptive…

citations
Cited by 3 publications
(1 citation statement)
references
References 40 publications
“…For example, the statistical leveraging framework (Drineas et al., 2012; Ma et al., 2015b, 2022; Li and Meng, 2020) has achieved great success in large-scale ordinary least squares regression. More recently, optimal subsampling procedures have been also established for various statistical models, including logistic regression (Wang et al., 2018), generalized linear models (Ai et al., 2018; Yu et al., 2022), quantile regression (Wang and Ma, 2021), nonparametric regression (Ma et al., 2015a; Meng et al., 2020, 2021), and designed for testing problems (Ren et al., 2022; Han et al., 2023). However, none of the existing can be directly applied to SVM due to its distinguishing geometric feature.…”
Section: Introduction (mentioning)
confidence: 99%