2015
DOI: 10.1177/0962280215601873

A new approach to categorising continuous variables in prediction models: Proposal and validation

Abstract: When developing prediction models for application in clinical practice, health practitioners usually categorise clinical variables that are continuous in nature. Although categorisation is not regarded as advisable from a statistical point of view, due to loss of information and power, it is a common practice in medical research. Consequently, providing researchers with a useful and valid categorisation method could be a relevant issue when developing prediction models. Without recommending categorisation of c…

Cited by 65 publications (69 citation statements)
References 33 publications (55 reference statements)
“…The binary outcome variable was the presence/absence of SEs (regardless of their actual number). In order to make the logistic regression estimates more easily interpretable, continuous variable such as age and year of inception were discretized into binary variables, using an optimal cut-point search algorithm [11]. The algorithm used determines the number and the location of the cut-points using the area under the curve (AUC) of the logistic model, suitably correcting the AUC obtained, which may be biased upward when the same data-set is used both to fit the logistic regression model (involved in the cut-point selection process) and compute the AUC [11].…”
Section: Discussion (mentioning)
confidence: 99%
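
The cut-point search described in the statement above can be illustrated with a minimal sketch. It is not the cited algorithm: it assumes a single cut-point, an exhaustive search over the observed values, and a Harrell-style bootstrap optimism correction standing in for the AUC correction the statement mentions. All function names are hypothetical. With one binary predictor, the fitted logistic probabilities are simply the two group event rates, so the AUC is computed from counts without fitting a model.

```python
import random


def signed_auc(x, y, cut):
    """AUC when 'x > cut' is scored as the higher-risk group (may be < 0.5)."""
    pos_hi = sum(1 for xi, yi in zip(x, y) if yi == 1 and xi > cut)
    pos_lo = sum(1 for xi, yi in zip(x, y) if yi == 1 and xi <= cut)
    neg_hi = sum(1 for xi, yi in zip(x, y) if yi == 0 and xi > cut)
    neg_lo = sum(1 for xi, yi in zip(x, y) if yi == 0 and xi <= cut)
    n_pos, n_neg = pos_hi + pos_lo, neg_hi + neg_lo
    if n_pos == 0 or n_neg == 0:
        return 0.5  # AUC is undefined with one class; treat as uninformative
    # P(score_pos > score_neg) + 0.5 * P(tie)
    concordant = pos_hi * neg_lo
    ties = pos_hi * neg_hi + pos_lo * neg_lo
    return (concordant + 0.5 * ties) / (n_pos * n_neg)


def best_cut(x, y, candidates):
    """Return (cut, direction, apparent_auc) maximising the apparent AUC."""
    best = None
    for c in candidates:
        a = signed_auc(x, y, c)
        direction = 1 if a >= 0.5 else -1  # which side the model calls high risk
        apparent = a if direction == 1 else 1.0 - a
        if best is None or apparent > best[2]:
            best = (c, direction, apparent)
    return best


def optimism_corrected_cut(x, y, n_boot=200, seed=0):
    """Pick one cut-point by AUC and subtract the bootstrap optimism estimate."""
    candidates = sorted(set(x))[:-1]  # cutting at the maximum empties a group
    if not candidates:
        raise ValueError("x needs at least two distinct values")
    rng = random.Random(seed)
    cut, _, apparent = best_cut(x, y, candidates)
    n = len(x)
    optimism = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        # Repeat the whole selection inside the bootstrap sample ...
        cb, db, auc_boot = best_cut(xb, yb, candidates)
        # ... then evaluate that choice on the original data.
        auc_orig = signed_auc(x, y, cb)
        if db == -1:
            auc_orig = 1.0 - auc_orig
        optimism += auc_boot - auc_orig
    return cut, apparent, apparent - optimism / n_boot


# Toy data (made up for illustration): outcome switches from 0 to 1 above x = 9.
x = list(range(20))
y = [0] * 10 + [1] * 10
cut, apparent, corrected = optimism_corrected_cut(x, y, n_boot=100, seed=1)
print(cut, apparent, round(corrected, 3))
```

Repeating the cut-point selection inside every bootstrap sample is the point of the correction: the apparent AUC is inflated precisely because the same data chose the cut-point and scored it.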
“…This process was carried out separately for each macroregion. We categorised the data into quintiles rather than apply polynomial terms in order to aid interpretation and therefore help guide policymakers [22]. Furthermore, this approach also allowed us to generate the marginal estimates, which were plotted.…”
Section: Methods (mentioning)
confidence: 99%
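
Quintile categorisation of the kind this statement mentions can be sketched in a few lines. The example below is an assumption-laden illustration, not the citing authors' code: the helper name `quintile_labels` is hypothetical, and the cut-points come from the standard library's `statistics.quantiles` with the `inclusive` method, which is one of several reasonable quantile definitions.

```python
import statistics


def quintile_labels(values):
    """Assign each value a quintile label 1..5 from sample quantile cut-points."""
    # Four cut-points split the data into five groups.
    cuts = statistics.quantiles(values, n=5, method="inclusive")
    # A value's label is one plus the number of cut-points it exceeds.
    return [1 + sum(v > c for c in cuts) for v in values]


# Toy data (made up for illustration): the integers 1 to 100.
values = list(range(1, 101))
labels = quintile_labels(values)
counts = [labels.count(k) for k in range(1, 6)]
print(counts)
```

Unlike an outcome-driven cut-point search, quintile boundaries depend only on the distribution of the predictor, so they carry no selection optimism but also make no attempt to maximise discrimination.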
“…The continuous variable was categorized if the visual evaluation shown a nonlinear trend. We used the optimization methodology of the R package CatPredi, as described (BARRIO et al, 2017). Secondly, regarding the categorical variables, we used a chi-square or Fisher's exact test on the results, and then we included into the model the various predictor variables, and all the variables with a value of p < 0.20.…”
Section: Seroprevalence and Statistical Analysis (mentioning)
confidence: 99%