Correcting for misclassification for a monotone disease process with an application in dental research

García-Zattera, María José; Mutsvari, Timothy; Jara, Alejandro; Declerck, Dominique; Lesaffre, Emmanuel

doi:10.1002/sim.3906

Cited by 20 publications

(29 citation statements)

References 30 publications

“…It is important to stress that in a longitudinal setting, unlike cross sectional studies, the model parameters might be estimated without the use of external information about the misclassification parameters. For instance, García-Zattera et al (2010) showed that under simple restrictions on the parameter space, the model parameters associated with an inhomogeneous HMM for monotone responses are identified by the available data. They also proposed a univariate model to account for predictors allowing for irregularly spaced time intervals and different classifiers.…”

Section: Introduction (mentioning)

confidence: 99%
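
The setting this statement describes can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the model of García-Zattera et al. (2010): the function names, the per-visit onset probabilities, and the sensitivity and specificity values are all hypothetical. It generates a latent monotone binary path (once the condition is present it stays present) and then scores it with an imperfect classifier, so observed sequences may show reversals that the latent process cannot.

```python
import random


def simulate_latent(n_visits, onset_probs, rng):
    """Latent monotone path: 0 until onset, then 1 at every later visit.

    onset_probs[t] is the (hypothetical) probability of moving 0 -> 1 at
    visit t; letting it vary with t makes the chain inhomogeneous.
    """
    state, path = 0, []
    for t in range(n_visits):
        if state == 0 and rng.random() < onset_probs[t]:
            state = 1  # absorbing state: never returns to 0
        path.append(state)
    return path


def misclassify(path, sensitivity, specificity, rng):
    """Score each latent state with an imperfect classifier."""
    observed = []
    for y in path:
        if y == 1:
            observed.append(1 if rng.random() < sensitivity else 0)
        else:
            observed.append(0 if rng.random() < specificity else 1)
    return observed


def has_reversal(seq):
    """True if the sequence ever drops from 1 back to 0."""
    return any(a == 1 and b == 0 for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    rng = random.Random(2010)
    onset = [0.05, 0.10, 0.15, 0.20, 0.20, 0.20]  # assumed values
    reversals = 0
    for _ in range(1000):
        latent = simulate_latent(len(onset), onset, rng)
        scored = misclassify(latent, sensitivity=0.85, specificity=0.95, rng=rng)
        reversals += has_reversal(scored)
    print("observed sequences with an impossible 1 -> 0 reversal:", reversals)
```

Reversals in the scored sequences carry information about the error rates, which gives some intuition for why longitudinal monotone data may allow estimation without external validation data.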

“…Hidden Markov models (HMM) for the analysis of misclassified alternating longitudinal responses has been considered in the literature by Cook, Ng, and Meade (2000), Rosychuk and Thompson (2001), Rosychuk and Thompson (2003), Nagelkerke, Chunge, and Kinot (1990), and Rosychuk and Islam (2009), whereas Espeland, Murphy, and Leverett (1988), Espeland, Platt, and Gallagher (1989), Schmid, Segal, and Rosner (1994), Singh and Rao (1995), Albert, Hunsberger, and Biro (1997), and García-Zattera et al (2010) addressed the problem of misclassified monotone longitudinal responses. It is important to stress that in a longitudinal setting, unlike cross sectional studies, the model parameters might be estimated without the use of external information about the misclassification parameters.…”

Section: Introduction (mentioning)

confidence: 99%

“…In the context of longitudinal univariate categorical data, generalized linear mixed models (see, e.g., Neuhaus 2002), generalized estimating equation (GEE)-based approaches (see, e.g., Neuhaus 2002), and transition models (see, e.g., García-Zattera et al 2010) have been proposed for correcting for misclassification. Due to the monotone nature of our motivating problem and because the main scientific objective here is the incidence estimation, we restrict ourselves to the latter class of models, where the parameters have a direct interpretation in terms of the conditional probabilities of developing CE in a given time interval.…”

Section: Introduction (mentioning)

confidence: 99%

Modeling of Multivariate Monotone Disease Processes in the Presence of Misclassification

García-Zattera, Jara, Lesaffre, et al. (2012). Journal of the American Statistical Association.

Motivated by a longitudinal oral health study, the Signal-Tandmobiel® study, we propose a multivariate binary inhomogeneous Markov model in which unobserved correlated response variables are subject to an unconstrained misclassification process and have a monotone behavior. The multivariate baseline distributions and Markov transition matrices of the unobserved processes are defined as a function of covariates through the specification of compatible full conditional distributions. Distinct misclassification models are discussed. In all cases, the possibility that different examiners were involved in the scoring of the responses of a given subject across time is taken into account. A full Bayesian implementation of the model is described and its performance is evaluated using simulated data. We provide theoretical and empirical evidence that the parameters can be estimated without any external information about the misclassification parameters. Finally, the analyses of the motivating study are presented. Appendices 1-7 are available in the online supplementary materials.

“…Label uncertainty has commonly been found in clinical judgments due to expert subjectivity and inadequate information [104]. Often, it is handled as noise, so the task has been to detect and correct such mislabeling [87,107,45]. However, in the case of multiple, non-exclusive medical conditions [82], such as comorbidity, it makes more sense to treat labels with degrees of certainty rather than forcing them to belong to one "true" class, because there is no such thing as a single true class in this kind of scenario.…”

Section: Label Characteristics (mentioning)

confidence: 99%

“…Such noise is usually regarded as mislabeling to be detected and corrected [87,107,45]. For example, García-Zattera et al employed binary Markov models to estimate misclassification parameters for dental research [45].…”

Section: Classification With Label Uncertainty (mentioning)

confidence: 99%

Healthcare data mining from multi-source data

Chen

The "big data" challenge is changing the way we acquire, store, analyse, and draw conclusions from data. How we effectively and efficiently "mine" the data from possibly multiple sources and extract useful information is a critical question. Increasing research attention has been drawn to healthcare data mining, with an ultimate goal to improve the quality of care. The human body is complex and so too the data collected in treating it. Data noise that is often introduced via the collection process makes building Data Mining models a challenging task.

This thesis focuses on the classification tasks of mining healthcare data, with the goal of improving the effectiveness of health risk prediction. In particular, we developed algorithms to address issues identified from real healthcare data, such as feature extraction, heterogeneity, label uncertainty, and large unlabeled data.

The three main contributions of this research are as follows. First, we developed a new health index called Personal Health Index (PHI) that scores a person's health status based on the examination records of a given population. Second, we identified the key characteristics of the real datasets and issues that were associated with the data. Third, we developed classification algorithms to cope with those issues, particularly, the label uncertainty and large unlabeled data issues.

This research takes one step forward towards scoring personal health based on mining increasingly large health records. Particularly, it pioneers exploring the mining of GHE data and tackles the associated challenges. It is our anticipation that in the near future, more robust data-mining-based health scoring systems will be available for healthcare professionals to understand people's health status and thus improve the quality of care.

References

2012

Bayesian Biostatistics
