2018
DOI: 10.1080/01621459.2017.1408468

Information-Based Optimal Subdata Selection for Big Data Linear Regression

Abstract: Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable to extraordinarily large data sets due to computational limitations. A critical step in big data analysis is data reduction. Existing investigations in the context of linear regression focus on subsampling-based methods. However, not only is this approach prone to sampling errors, it also leads to a covariance matrix of the estimators that is typically bounded from below by a term t…

Cited by 163 publications (155 citation statements)
References 23 publications
“…Theorem 1 is aligned with Theorem 3 of Wang et al (2019) for the original IBOSS algorithm, which shows that for the subdata D_S obtained from Algorithm 1, the following inequality holds.…”
Section: Theoretical Properties (mentioning)
confidence: 66%
“…To extract some useful information from the data in time with limited computing resources, a subset of the full data can be selected and thoroughly analyzed. For this purpose, Wang et al (2019) proposed the IBOSS method. We summarize the motivation and procedure in the following.…”
Section: IBOSS Framework and Detailed Algorithm (mentioning)
confidence: 99%