2018
DOI: 10.1080/03610918.2018.1521974

Monte Carlo studies of bootstrap variability in ROC analysis with data dependency

Abstract: ROC analysis involving two large datasets is an important method for analyzing statistics of interest for decision making of a classifier in many disciplines. Data dependency due to multiple use of the same subjects exists ubiquitously, because limited resources make reusing subjects the practical way to generate more samples. Hence, a two-layer data structure is constructed and the nonparametric two-sample two-layer bootstrap is employed to estimate standard errors of statistics of interest derived from two sets of data, such as a wei…
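The abstract names a nonparametric two-sample two-layer bootstrap but the visible text does not spell out the resampling steps, so the following is only a minimal sketch under stated assumptions: each subject contributes several scores, layer one resamples subjects with replacement within each of the two score sets, layer two resamples scores within each drawn subject, and the statistic is the AUC computed as a Mann-Whitney probability. The function names (`auc_mann_whitney`, `resample_two_layer`, `two_layer_bootstrap_se`) and the default of 200 replications are hypothetical choices for illustration, not the paper's algorithm or settings.

```python
import numpy as np


def auc_mann_whitney(pos, neg):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg), via sorted ranks."""
    pos = np.asarray(pos, dtype=float)
    neg = np.sort(np.asarray(neg, dtype=float))
    less = np.searchsorted(neg, pos, side="left")      # negatives strictly below
    less_eq = np.searchsorted(neg, pos, side="right")  # negatives at or below
    ties = less_eq - less
    return (less.sum() + 0.5 * ties.sum()) / (pos.size * neg.size)


def resample_two_layer(groups, rng):
    """Assumed scheme: draw subjects with replacement (layer 1),
    then draw scores with replacement inside each drawn subject (layer 2)."""
    drawn = rng.integers(0, len(groups), size=len(groups))
    parts = [
        rng.choice(groups[i], size=len(groups[i]), replace=True) for i in drawn
    ]
    return np.concatenate(parts)


def two_layer_bootstrap_se(pos_groups, neg_groups, n_boot=200, seed=0):
    """Bootstrap standard error of the AUC under the assumed two-layer scheme.

    pos_groups / neg_groups: lists of 1-D arrays, one array of scores per subject.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        pos_star = resample_two_layer(pos_groups, rng)
        neg_star = resample_two_layer(neg_groups, rng)
        stats[b] = auc_mann_whitney(pos_star, neg_star)
    return stats.std(ddof=1)
```

Resampling whole subjects first keeps scores from the same subject together, which is what lets the replicate-to-replicate spread reflect the dependency; resampling individual scores as if they were i.i.d. would ignore it.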

Cited by 5 publications (10 citation statements)
References 16 publications
“…Indeed, to reduce the bootstrap variance and ensure the computation accuracy, our prior rigorous statistical research was carried out, such as the bootstrap variability studies that took months of CPU time to determine the appropriate number of bootstrap replications, the validation study by comparing the SEs of AUC estimated using the bootstrap algorithm on large i.i.d. datasets against those computed using the well-established analytical Mann-Whitney statistic method, using the multinomial probabilities to determine which bootstrap approach in the two-layer data structure should be used, and so on [4, 10, 11, 17-22].…”
Section: Conclusion and Discussion (mentioning)
confidence: 99%
“…In our ROC analysis for decision making of classifiers, data samples of scores are over tens of thousands and have no parametric model to fit, the statistics of interest are mostly probabilities or a weighted sum of probabilities, and data dependency may be involved. Thus, to reduce the bootstrap variance and ensure the computation accuracy, the bootstrap variabilities were re-studied, which took months of CPU time, and the appropriate number of bootstrap replications B under the above circumstances was determined to be 2,000 [11, 18-22].…”
Section: The Number of Bootstrap Replications (mentioning)
confidence: 99%
“…Here, a bootstrap confidence interval (1000 resamples; a random number seed of 978) was employed to generate a 95% CI of AUC. A tolerance 0.02 with 1000 replications was considered appropriate [27].…”
Section: Statistical Model Development (mentioning)
confidence: 99%
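
The statement above reports a 95% bootstrap CI of AUC from 1000 resamples with random seed 978, but it does not say which interval type or software was used. As a rough illustration only, the sketch below assumes a simple percentile interval on i.i.d. scores, resampling the two score sets independently; the function names `auc` and `bootstrap_auc_ci` are hypothetical, and the "tolerance 0.02" mentioned in the statement is not modeled here because its definition is not given.

```python
import numpy as np


def auc(pos, neg):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg)."""
    pos = np.asarray(pos, dtype=float)
    neg = np.sort(np.asarray(neg, dtype=float))
    less = np.searchsorted(neg, pos, side="left")
    less_eq = np.searchsorted(neg, pos, side="right")
    return (less.sum() + 0.5 * (less_eq - less).sum()) / (pos.size * neg.size)


def bootstrap_auc_ci(pos, neg, n_boot=1000, seed=978, level=0.95):
    """Percentile bootstrap CI for the AUC (assumed interval type).

    Returns (point_estimate, lower, upper).
    """
    rng = np.random.default_rng(seed)
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        pos_star = rng.choice(pos, size=pos.size, replace=True)
        neg_star = rng.choice(neg, size=neg.size, replace=True)
        stats[b] = auc(pos_star, neg_star)
    alpha = 1.0 - level
    lower, upper = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
    return auc(pos, neg), float(lower), float(upper)
```

Fixing the seed makes the interval reproducible from run to run, which is presumably why the citing authors report it; the seed value itself carries no statistical meaning.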