2016
DOI: 10.1093/biostatistics/kxw018

Study design in high-dimensional classification analysis

Abstract: Advances in high-throughput technology have accelerated the use of hundreds to millions of biomarkers to construct classifiers that partition patients into different clinical conditions. Prior to classifier development in actual studies, a critical need is to determine the sample size required to reach a specified classification precision. We develop a systematic approach for sample size determination in high-dimensional (large p small n) classification analysis. Our method utilizes the probability of c…

Cited by 12 publications (5 citation statements)
References 22 publications
“…The NxSubsampling and NxCross Validation schemes are limited by high variance at small subsample sizes [38] and low test sample sizes [24], respectively. Attempts to overcome these challenges [19,45,49] remain key to widespread implementation. Finally, validity of the linear curve-fitting approach is highly dependent on the algorithm, number of features, and their distributions [30,42].…”
Section: Limitations To Ssdms (mentioning)
confidence: 99%
“…[1,2], full of challenges from a Mathematical Optimization perspective, e.g. [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17].…”
Section: Introduction (mentioning)
confidence: 99%
“…) has been proposed to predict either individual curves and/or a scalar outcome. For multivariate data, Dobbin & Simon and Sánchez et al. considered the design for high‐dimensional multivariate data classification and focused on the sample size calculation. In contrast, the design for functional data, such as in our case, is to determine the sampling time points where data will be collected.…”
Section: Introduction (mentioning)
confidence: 99%