2014
DOI: 10.1002/gepi.21799
Novel Statistical Tools for Management of Public Databases Facilitate Community-Wide Replicability and Control of False Discovery

Abstract: Issues of publication bias, lack of replicability and false discovery have long plagued the genetics community. Proper utilization of public and shared data resources presents an opportunity to ameliorate these problems. We present an approach to public database management that we term Quality Preserving Database (QPD). It enables perpetual use of the database for testing statistical hypotheses while controlling false discovery and avoiding publication bias on the one hand, and maintaining testing power on the…

Cited by 5 publications (7 citation statements)
References 10 publications
“…Statistical databases. Concerned with the above issues and the importance of data sharing in the genetics community, [RAN14] proposed an approach to public database management, called Quality Preserving Database (QPD). A QPD makes a shared data resource amenable to perpetual use for hypothesis testing while controlling FWER and maintaining statistical power of the tests.…”
Section: Further Related Work
confidence: 99%
“…There are some initiatives in this direction [22]. Rosset et al [25], very recently, proposed a control via access to a large database. Note that this proposal requires the collection of new samples so that the database is continually enriched and FDR control remains possible.…”
Section: Discussion
confidence: 99%
“…Thus, there is an extremely large number of hypotheses that have been tested already and are yet-to-be tested by the community of researchers. 7 With hypotheses of this nature, community-wide multiplicity issues arise, which if unaddressed may lead to a high proportion of false discoveries. 7 Even though researchers do in fact account for within-study multiple testing, 6 multiplicity resulting from the number of hypotheses tested within the same database but in independent analyses is rarely accounted for.…”
Section: Inference After Data Collection
confidence: 99%
“…7 With hypotheses of this nature, community-wide multiplicity issues arise, which if unaddressed may lead to a high proportion of false discoveries. 7 Even though researchers do in fact account for within-study multiple testing, 6 multiplicity resulting from the number of hypotheses tested within the same database but in independent analyses is rarely accounted for. The totality of tests performed by other researchers using the same database is typically ignored and each hypothesis is examined in isolation from previous hypotheses that are not part of the particular analysis.…”
Section: Inference After Data Collection
confidence: 99%
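The multiplicity concern quoted above is usually addressed within a single study by a false-discovery-rate procedure such as Benjamini–Hochberg. A minimal sketch of that standard within-study adjustment follows; note this illustrates the baseline FDR control the citing papers contrast against, not the QPD management scheme itself, which additionally governs sample collection and database access.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected at FDR level alpha
    using the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Sort p-values ascending, remembering their original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

rejected = benjamini_hochberg(
    [0.001, 0.008, 0.039, 0.041, 0.042,
     0.06, 0.074, 0.205, 0.212, 0.216],
    alpha=0.05,
)
# → [0, 1]: only the two smallest p-values clear their step-up thresholds
```

A QPD-style scheme must go further: because tests arrive sequentially from many independent analysts, the error budget has to be managed across the whole community rather than recomputed per study.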