2011
DOI: 10.1016/j.jclinepi.2011.06.013

Performance of logistic regression modeling: beyond the number of events per variable, the role of data structure


Cited by 53 publications (64 citation statements)
References 5 publications
“…First, we selected the most strongly associated predictors from each subgroup of variables, using multivariable logistic regression with backward stepwise selection based on a likelihood-ratio test with a P value of 0.1. 17 Second, the final model was selected from this set of variables and validated with backward stepwise selection in multivariable logistic regression. One thousand bootstrap samples were drawn from the original sample, estimating the overfitting-corrected regression coefficients from the final model and the overfitting-corrected measures of the model performance.…”
Section: Discussion (mentioning)
confidence: 99%
“…In this study, even in the multivariable model, the EPV was ≥10, indicating acceptable accuracy and precision of regression coefficients (Peduzzi et al, 1996). However, it should be mentioned that there is no confirmed consensus on number of EPVs and some authors have criticized the issue (Courvoisier et al, 2011).…”
Section: Discussion (mentioning)
confidence: 99%
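
For readers unfamiliar with the quantity discussed in the statement above, here is a minimal sketch of how events per variable (EPV) is commonly computed. It assumes a binary outcome coded 0/1 and takes "events" to be the less frequent outcome class, divided by the number of candidate predictors. The function name `events_per_variable` and the example counts are hypothetical and do not come from any of the cited studies.

```python
def events_per_variable(outcomes, n_candidate_predictors):
    """Return EPV for a binary outcome coded 0/1.

    Assumption: "events" means the less frequent of the two outcome
    classes, and the denominator counts candidate predictors
    (model degrees of freedom), not only those retained in the final model.
    """
    if n_candidate_predictors <= 0:
        raise ValueError("n_candidate_predictors must be positive")
    n_events = sum(1 for y in outcomes if y == 1)
    n_nonevents = len(outcomes) - n_events
    # The limiting sample size is the smaller outcome class.
    return min(n_events, n_nonevents) / n_candidate_predictors


# Hypothetical data: 30 events among 100 subjects, 3 candidate predictors.
example_outcomes = [1] * 30 + [0] * 70
example_epv = events_per_variable(example_outcomes, 3)
print(example_epv)
```

An EPV of 10 or more is the threshold usually attributed to Peduzzi et al (1996). As the statement above notes, there is no consensus on that threshold.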
“…In the latter case, controlling too many variables by conventional means can lead to or aggravate two closely related problems: (a) data sparsity, in which full control results in too few subjects at crucial combinations of the variables, with consequent inflation of estimates (59,76,116,134), and (b) multicollinearity, by which we mean high multiple correlation (or more generally, high association) of the controlled variables with study exposures (116). In particular, if we include covariates that together are highly predictive of an exposure but are not all necessary to control confounding, the resulting effect estimate may be inflated or have unnecessarily wide confidence intervals (15,26,116). These problems increase as the ratio of number of covariates to sample size increases, motivating strategies to reduce the number of modeled covariates (116; S. Greenland & N. Pearce, unpublished manuscript, "Modeling Strategies for Observational Epidemiology").…”
Section: Why Not Adjust For Every Available Covariate? (mentioning)
confidence: 99%