A weakly informative default prior distribution for logistic and other regression models

Gelman, Andrew; Jakulin, Aleks; Pittau, Maria Grazia; Su, Yu-Sung

doi:10.1214/08-aoas191

Cited by 1,687 publications

(1,734 citation statements)

References 46 publications

“…Owing to small numbers, the last two groups were merged in the case-control studies. Regularised logistic regression (Gelman et al, 2008) was used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) by education level for cervical cancer in the case-control studies and for HPV infection among control women only in case-control studies and among the general female population in prevalence surveys. The reference category for education was set to the most common category (i.e., 1-5 years in case-control studies and 6-10 years in prevalence surveys).…”
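The step from a fitted logistic regression coefficient to a reported odds ratio can be sketched as follows. This is a minimal illustration, assuming a Wald-type interval on the log-odds scale; the citing study may instead have used posterior intervals, and the function name and inputs are hypothetical.

```python
import math

def odds_ratio_with_interval(log_odds, std_error, z=1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error.

    Assumes an approximately normal estimate on the log-odds scale
    (a Wald-type interval); z=1.96 gives a nominal 95% interval.
    """
    return (
        math.exp(log_odds),
        math.exp(log_odds - z * std_error),
        math.exp(log_odds + z * std_error),
    )

# Hypothetical coefficient and standard error, for illustration only.
or_hat, lower, upper = odds_ratio_with_interval(0.4, 0.1)
print(f"OR = {or_hat:.2f}, 95% interval: {lower:.2f} to {upper:.2f}")
```

The interval is symmetric on the log scale, so it is asymmetric around the odds ratio itself.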

Section: Results (mentioning)

confidence: 99%

Differences in the risk of cervical cancer and human papillomavirus infection by education level

et al. 2009

BACKGROUND: Cervical cancer risk is associated with low education even in an unscreened population, but it is not clear whether human papillomavirus (HPV) infection follows the same pattern. METHODS: Two large multicentric studies (case-control studies of cervical cancer and HPV prevalence survey) including nearly 20 000 women. GP5+/GP6+ PCR was used to detect HPV. RESULTS: Education level was consistently associated with cervical cancer risk (odds ratio (OR) for 0 and >5 years vs 1-5 years = 1.50, 95% confidence interval (CI): 1.25-1.80 and 0.69, 95% CI: 0.57-0.82, respectively, P for trend <0.0001). In contrast, no association emerged between education level and HPV infection in either of the two IARC studies. A majority of the women studied had never had a Pap smear. The association between low education level and cervical cancer was most strongly attenuated by adjustment for age at first sexual intercourse and first pregnancy. Parity and screening history (but not lifetime number of sexual partners, husband's extramarital sexual relationships, and smoking) also seemed to be important confounding factors. CONCLUSION: The excess of cervical cancer found in women with a low socio-economic status seems, therefore, not to be explained by a concomitant excess of HPV prevalence, but rather by early events in a woman's sexually active life that may modify the cancer-causing potential of HPV infection.

“…Since it is possible for a particular $\theta_c$ to be associated with an empty cluster, these parameters must be assigned a proper prior. Therefore, we assign to each $\theta_c$ a proper t density function with 7 degrees of freedom and scale 2.5 as a prior, as discussed by Gelman et al. (ref. 29), which corresponds to the baseline case of one-half of a success and one-half of a failure for a single binomial trial with probability $p = \mathrm{logit}^{-1}(\theta_c)$. Our response model, which links the clusters with poverty counts, $y_i$ for CT $i$, is simply $y_i \sim \mathrm{Bin}(n_i, p_i)$ with…”
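The prior described in this statement can be written down in a few lines. The sketch below is only an illustration under stated assumptions: it uses a zero-centred Student-t density with 7 degrees of freedom and scale 2.5 on the logit scale and a binomial likelihood for a single cluster. The function names are hypothetical and are not taken from the citing paper.

```python
import math

def scaled_t_pdf(theta, df=7.0, scale=2.5):
    """Density of a zero-centred Student-t with `df` degrees of freedom and the given scale."""
    z = theta / scale
    log_norm = (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
    )
    return math.exp(log_norm - (df + 1.0) / 2.0 * math.log1p(z * z / df))

def inv_logit(theta):
    """Numerically stable inverse logit, p = logit^{-1}(theta)."""
    if theta >= 0.0:
        return 1.0 / (1.0 + math.exp(-theta))
    e = math.exp(theta)
    return e / (1.0 + e)

def log_posterior(theta, successes, trials, df=7.0, scale=2.5):
    """Unnormalised log posterior for one cluster: t prior plus binomial log-likelihood."""
    p = inv_logit(theta)
    log_lik = successes * math.log(p) + (trials - successes) * math.log1p(-p)
    return math.log(scaled_t_pdf(theta, df, scale)) + log_lik

# For an empty cluster (no trials) the likelihood is flat, so the posterior
# equals the prior; this is why the prior has to be proper.
print(log_posterior(1.0, successes=0, trials=0))
print(math.log(scaled_t_pdf(1.0)))
```

With zero trials the two printed values coincide, which mirrors the empty-cluster argument in the quoted passage.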

Section: Methods (mentioning)

confidence: 99%

Identifying Vulnerable Populations through an Examination of the Association Between Multipollutant Profiles and Poverty

Molitor et al. 2011, Environ. Sci. Technol.

“…Model comparison is a powerful tool for evaluating alternative models given the data [12,17]. However, given so many predictors, the best model is often one of a large set of models with very similar AIC values (e.g.…”

mentioning

confidence: 99%

Problems modelling behavioural variation across Western North American Indian societies

2016

Mathew & Perreault [1] analyse cross-cultural data from the Western North American Indian (WNAI) dataset [2] in order to compare 'the relative effect of environment and cultural history' on behavioural variation across 172 societies. This endeavour is inspired by many other evolutionary studies of human cultural variation [3][4][5][6][7]. Mathew and Perreault conclude that 'social learning operating over multiple generations [is] the main mode by which humans acquire their behaviour' (p. 5). Our own investigation of cultural macroevolution in the WNAI [8] motivated us to attempt to reconstruct their analyses. We found their paper to be undermined by questionable analytical choices, and computational and data-handling problems. We draw this conclusion having used the information in the Methods and electronic supplementary material S1, S3, S4 and S6 in [1] to recreate those parts of their study that we were able to. In this commentary, we present the results of our examination and detail the serious methodological flaws that lead us to conclude that a complete re-analysis is required by Mathew and Perreault. We also comment briefly on their conceptual schema, which, in trying to find 'the main mode of human adaptation' [1], appears to set cultural transmission (i.e. social learning) in opposition to environmental adaptation.

Mathew and Perreault use logistic regression to model 457 present/absent behavioural traits as a function of three dimensions: E (local ecological conditions), P (phylogenetic or linguistic distance to other societies) and S (spatial or geographical distance to other societies). To judge the relative importance of the E, P and S classes of predictors, Mathew and Perreault compare sums of absolute values of regression coefficients across classes, for the best model of each behavioural trait. The 'summed absolute values' metric is used for various purposes in model and feature selection [9,10]. The metric is problematic here, however, because it compounds statistical signal with different sizes of the E, P and S classes. To demonstrate, consider a null case in which none of the predictors in E, P or S are related to a trait and the regression coefficients resemble stochastic noise. The analyst must understand how a statistical metric would behave in such a case and choose an inference procedure that reliably distinguishes null from non-null cases. For concreteness, assume that the coefficients share a common Gaussian distribution with mean zero and variance $\sigma^2$. The absolute value of a coefficient $\beta$ then has expectation $(2\sigma^2/\pi)^{1/2}$, and for a class containing $M$ predictors the summed absolute values have expectation $E\left[\sum_{i=1}^{M} |\beta_i|\right] = M(2\sigma^2/\pi)^{1/2}$. The null expectation therefore scales linearly with class size $M$, and larger classes of predictors will appear to have greater relative importance based on representation alone. Although model selection criteria such as AIC (discussed below) include a penalty for the number of predictors in a model, this does not mitigate the confounding effects...
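The null-case argument in this abstract can be checked with a short simulation. The following is a rough sketch, assuming independent mean-zero Gaussian coefficients as in the abstract; the class sizes, the value of sigma, and the function names are illustrative assumptions, not values from the WNAI analysis.

```python
import math
import random

def expected_sum_abs(m, sigma):
    """Closed-form null expectation of the summed absolute values: M * sqrt(2 * sigma^2 / pi)."""
    return m * math.sqrt(2.0 * sigma ** 2 / math.pi)

def simulated_sum_abs(m, sigma, n_draws=20000, seed=0):
    """Monte Carlo average of sum(|b_i|) over m independent N(0, sigma^2) coefficients."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += sum(abs(rng.gauss(0.0, sigma)) for _ in range(m))
    return total / n_draws

# Illustrative class sizes only: pure noise coefficients, yet the larger class
# has the larger summed absolute value.
for m in (5, 10, 20):
    print(m, round(simulated_sum_abs(m, sigma=1.0), 3), round(expected_sum_abs(m, sigma=1.0), 3))
```

Under these assumptions, doubling the number of predictors in a class doubles its expected score even though no predictor carries signal, which is the size dependence the abstract describes.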
