1999
DOI: 10.1080/00031305.1999.10474456

Bayesian Data Mining in Large Frequency Tables, with an Application to the FDA Spontaneous Reporting System

Abstract: A common data mining task is the search for associations in large databases. Here we consider the search for "interestingly large" counts in a large frequency table, having millions of cells, most of which have an observed frequency of 0 or 1. We first construct a baseline or null hypothesis expected frequency for each cell, and then suggest and compare screening criteria for ranking the cell deviations of observed from expected count. A criterion based on the results of fitting an empirical Bayes model to the…

Citations: cited by 438 publications (530 citation statements)
References: 14 publications
“…MGPS is an innovative data-mining algorithm developed by William DuMouchel which introduces the concept of refining the relative reporting ratio calculation by using Bayesian statistics. [19][20][21] The basic assumption behind MGPS is that each observed count (N; drug/vaccine-event combination) is taken from a Poisson distribution with unknown mean (μ), with an interest center on the ratio λ which equals μ divided by E (the expected count estimated by assuming that the count of all reports for the specific drug/vaccine and the count of all reports for the specific event are independent; Table 1). The essential Bayesian contribution supposes that each λ is drawn from a common prior assumed to be a mixture of two gamma distributions.…”
Section: Methods (mentioning)
confidence: 99%
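The model in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published MGPS implementation: the function names, the counts, and the hyperparameter values (`p`, `shape1`, `rate1`, `shape2`, `rate2`) are hypothetical, and the hyperparameters are simply fixed here instead of being estimated from the data. The sketch computes E under independence, the raw ratio N / E, and the posterior mean of λ when N given λ is Poisson(λE) and λ has a two-component gamma mixture prior.

```python
from math import exp, lgamma, log


def expected_count(n_drug: int, n_event: int, n_total: int) -> float:
    """Expected count E if drug and event reports were independent."""
    return n_drug * n_event / n_total


def log_marginal(n: int, e: float, shape: float, rate: float) -> float:
    """Log probability of N = n when N | lam ~ Poisson(lam * e) and
    lam ~ Gamma(shape, rate); this marginal is negative binomial."""
    return (
        lgamma(shape + n) - lgamma(shape) - lgamma(n + 1)
        + shape * log(rate / (rate + e))
        + n * log(e / (rate + e))
    )


def posterior_mean_ratio(n, e, p, shape1, rate1, shape2, rate2):
    """Posterior mean of lam = mu / E under a two-component gamma mixture prior."""
    # Posterior weight of component 1, computed on the log scale for stability.
    w1 = log(p) + log_marginal(n, e, shape1, rate1)
    w2 = log(1.0 - p) + log_marginal(n, e, shape2, rate2)
    m = max(w1, w2)
    q = exp(w1 - m) / (exp(w1 - m) + exp(w2 - m))
    # Each component updates conjugately to Gamma(shape + n, rate + e).
    return q * (shape1 + n) / (rate1 + e) + (1.0 - q) * (shape2 + n) / (rate2 + e)


# Hypothetical counts and hyperparameters, for illustration only.
e = expected_count(n_drug=100, n_event=50, n_total=10_000)
raw_ratio = 4 / e  # observed count N = 4
shrunk = posterior_mean_ratio(4, e, p=0.1, shape1=0.2, rate1=0.1, shape2=2.0, rate2=4.0)
print(e, raw_ratio, shrunk)
```

With these made-up inputs the posterior mean falls below the raw ratio, which is the shrinkage effect the Bayesian step is meant to provide for cells with small expected counts.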
“…Among the former, the Reporting Odds Ratio (ROR) is applied by the Netherlands Pharmacovigilance Centre Lareb, whereas the Proportional Reporting Ratio (PRR) was first used by Evans et al [35]. Bayesian methods such as Multi-item Gamma Poisson Shrinker (MGPS) [36] and Bayesian Confidence Propagation Neural network (BCPN) [37] are based on Bayes' law to estimate the probability (posterior probability) that the suspected event occurs given the use of suspect drug.…”
Section: The Use Of Dmas (mentioning)
confidence: 99%
“…However, the Probability Ratio is very volatile when the expected value is small, which makes it favor the rare combinations rather than significant trends in the data. In order to solve the problem, people use shrinkage [1], [3], [10] to regularize and reduce the volatility of a measure by trading a bias to no correlation for decreased variance. Specifically, we add a continuity correction number to both nominator and denominator.…”
Section: Two-by-two Contingency Table (mentioning)
confidence: 99%
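The continuity-correction idea in the statement above can be illustrated as follows. This is a minimal sketch that assumes the ratio in question is observed count over expected count (the quoted text is truncated and does not give the formula); the function name `shrunk_ratio` and the correction value `c = 0.5` are hypothetical choices, not taken from the citing paper.

```python
def shrunk_ratio(observed: float, expected: float, c: float = 0.5) -> float:
    """Observed/expected ratio with a constant c added to the numerator and
    the denominator. The result is pulled toward 1 (no association), and the
    pull is strongest when both counts are small."""
    return (observed + c) / (expected + c)


# Hypothetical cells: a rare combination and a well-populated one.
rare_raw = 1 / 0.01                      # raw ratio of 100 from a single report
rare_shrunk = shrunk_ratio(1, 0.01)      # much closer to 1
common_raw = 200 / 100
common_shrunk = shrunk_ratio(200, 100)   # barely changed
print(rare_raw, rare_shrunk, common_raw, common_shrunk)
```

The rare cell loses most of its apparent signal while the well-populated cell is almost unaffected, which is the trade of a small bias toward no association for lower variance that the statement describes.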