2006
DOI: 10.1093/biostatistics/kxj036

Sample size planning for developing classifiers using high-dimensional DNA microarray data

Abstract: Many gene expression studies attempt to develop a predictor of pre-defined diagnostic or prognostic classes. If the classes are similar biologically, then the number of genes that are differentially expressed between the classes is likely to be small compared to the total number of genes measured. This motivates a two-step process for predictor development: a subset of differentially expressed genes is selected for use in the predictor, and then the predictor is constructed from these. Both these steps will introd…

Cited by 119 publications (103 citation statements)
References 12 publications
“…In that way, one can identify the differentially expressed genes and determine how to combine and weight expression levels for the component genes and to establish a cutoff point that optimizes predictive accuracy of the classifier. Dobbin et al (23,24) have developed methods for planning the number of cases needed to effectively develop such a classifier. Larger phase II studies may be required to have sufficient responders in the phase II database for this approach (25).…”
Section: Prognostic and Predictive Classifiers (citation type: mentioning)
confidence: 99%
“…See Appendix A for additional details. Dobbin and Simon [21] proposed another calculation for the size of the training sample based on the difference between the estimated class prediction function and a class prediction function if the sample size were infinite. Their method requires an estimate of the largest effect size of any gene.…”
Section: Designing Microarray Studies (citation type: mentioning)
confidence: 99%
“…For the size of the test sample we follow similar calculations in Baker et al [22] and Dobbin and Simon [21]. We chose a size to yield a given standard error for a target performance level, with computations based on a binomial distribution for false and true positives.…”
Section: Designing Microarray Studies (citation type: mentioning)
confidence: 99%
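
The test-sample calculation described in the statement above can be illustrated with a minimal sketch. It assumes the standard binomial standard error, sqrt(p(1 - p)/n), for an estimated rate such as sensitivity or specificity; the function name `required_test_size` and its parameters are hypothetical and do not come from the cited papers, whose exact procedure may differ.

```python
import math


def required_test_size(target_rate: float, max_std_error: float) -> int:
    """Smallest n such that sqrt(p * (1 - p) / n) <= max_std_error.

    target_rate   -- assumed true rate p (e.g. a target sensitivity), 0 < p < 1
    max_std_error -- largest acceptable standard error of the estimated rate
    """
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must lie strictly between 0 and 1")
    if max_std_error <= 0.0:
        raise ValueError("max_std_error must be positive")
    # Solve p * (1 - p) / n <= se^2 for n and round up to a whole sample.
    return math.ceil(target_rate * (1.0 - target_rate) / max_std_error ** 2)


# Example: a target sensitivity of 0.85 estimated with standard error <= 0.03.
print(required_test_size(0.85, 0.03))  # 142
```

Because sensitivity is estimated from true cases only and specificity from true controls only, a count obtained this way applies to the relevant class, not to the whole test set.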
“…Hwang and others, 2002), few sample size methods for building classifiers are available. One ground breaking method for classification analysis was proposed by Dobbin and Simon (2007) (hereafter DS2007), which is based on optimizing the probability of correct classification (PCC, Mukherjee and others, 2003). The classifier's PCC (or sensitivity or specificity) is a more appropriate target for sample size determination for classification studies, rather than the classical concepts of Type I and Type II errors for testing differences across groups.…”
Section: Introduction (citation type: mentioning)
confidence: 99%