Objectives: The aim of this study was to identify, with soft clustering methods, multimorbidity patterns in the electronic health records of a population ≥65 years, and to analyse such patterns in accordance with the different prevalence cut-off points applied. Fuzzy cluster analysis allows individuals to be linked simultaneously to multiple clusters and is more consistent with clinical experience than other approaches frequently found in the literature. Design: A cross-sectional study was conducted based on data from electronic health records. Setting: 284 primary healthcare centres in Catalonia, Spain (2012). Participants: 916 619 eligible individuals were included (women: 57.7%). Primary and secondary outcome measures: We extracted data on demographics, International Classification of Diseases version 10 chronic diagnoses, prescribed drugs and socioeconomic status for patients aged ≥65. Following principal component analysis of categorical and continuous variables for dimensionality reduction, machine learning techniques were applied for the identification of disease clusters in a fuzzy c-means analysis. Sensitivity analyses, with different prevalence cut-off points for chronic diseases, were also conducted. Solutions were evaluated from clinical consistency and significance criteria. Results: Multimorbidity was present in 93.1%. Eight clusters were identified with a varying number of disease values: nervous and digestive; respiratory, circulatory and nervous; circulatory and digestive; mental, nervous and digestive, female dominant; mental, digestive and blood, female oldest-old dominant; nervous, musculoskeletal and circulatory, female dominant; genitourinary, mental and musculoskeletal, male dominant; and non-specified, youngest-old dominant. Nuclear diseases were identified for each cluster independently of the prevalence cut-off point considered. Conclusions: Multimorbidity patterns were obtained using fuzzy c-means cluster analysis. They are clinically meaningful clusters which support the development of tailored approaches to multimorbidity management and further research.
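To illustrate what "linked simultaneously to multiple clusters" means in a fuzzy c-means analysis, the following is a minimal sketch in plain Python. It is not the study's pipeline: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count and the toy input are all assumptions chosen for illustration, and the principal component step described in the abstract is omitted.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means sketch (hypothetical helper, not the study's code).

    points: list of equal-length numeric tuples.
    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Centre update: mean of all points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update: closer centres receive higher membership.
        for i in range(n):
            dist = [
                max(1e-12, sum((points[i][d] - centers[j][d]) ** 2
                               for d in range(dim)) ** 0.5)
                for j in range(c)
            ]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (dist[j] / dist[k]) ** exponent for k in range(c)
                )
    return centers, u
```

Unlike hard clustering, a point lying between two groups keeps a substantial membership in both, which is the property the abstract describes as being closer to clinical experience.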
This study aimed to analyse the trajectories and mortality of multimorbidity patterns in patients aged 65 to 99 years in Catalonia (Spain). Five-year (2012–2016) data of 916,619 participants from a primary care, population-based electronic health record database (Information System for Research in Primary Care, SIDIAP) were included in this retrospective cohort study. Individual longitudinal trajectories were modelled with a Hidden Markov Model across multimorbidity patterns. We computed the mortality hazard using Cox regression models to estimate survival in multimorbidity patterns. Ten multimorbidity patterns were originally identified and two more states (death and drop-outs) were subsequently added. At baseline, the most frequent cluster was the Non-Specific Pattern (42%), and the least frequent the Multisystem Pattern (1.6%). Most participants stayed in the same cluster over the 5-year follow-up period, from 92.1% in the Nervous, Musculoskeletal pattern to 59.2% in the Cardio-Circulatory and Renal pattern. The highest mortality rates were observed for patterns that included cardio-circulatory diseases: Cardio-Circulatory and Renal (37.1%); Nervous, Digestive and Circulatory (31.8%); and Cardio-Circulatory, Mental, Respiratory and Genitourinary (28.8%). This study demonstrates the feasibility of characterizing multimorbidity patterns over time. Multimorbidity trajectories were generally stable, although changes in specific multimorbidity patterns were observed. The Hidden Markov Model is useful for modelling transitions across multimorbidity patterns and mortality risk. Our findings suggest that health interventions targeting specific multimorbidity patterns may reduce mortality in patients with multimorbidity.
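The role of the added death state can be sketched with the transition part of a Markov model alone. The example below does not fit a Hidden Markov Model and does not use the study's data: the state names in `STATES`, every probability in `TRANS` and the starting distribution `start` are hypothetical values chosen only to show how an absorbing state accumulates probability mass over yearly steps.

```python
def step(dist, trans):
    """Apply one transition: new[j] = sum over i of dist[i] * trans[i][j]."""
    n = len(dist)
    return [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]


def propagate(dist, trans, steps):
    """Apply the same transition matrix `steps` times."""
    for _ in range(steps):
        dist = step(dist, trans)
    return dist


# Hypothetical states and yearly transition probabilities (each row sums to 1).
STATES = ["pattern_A", "pattern_B", "death"]
TRANS = [
    [0.90, 0.05, 0.05],  # from pattern_A
    [0.10, 0.80, 0.10],  # from pattern_B
    [0.00, 0.00, 1.00],  # death is absorbing: nobody leaves it
]

start = [0.6, 0.4, 0.0]  # hypothetical baseline distribution
after_five = propagate(start, TRANS, 5)
```

In a full Hidden Markov Model the pattern memberships would be latent and the transition probabilities estimated from the observed diagnoses; here they are simply given.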
Crowdsourced data in science might be severely error-prone due to the inexperience of annotators participating in the project. In this work, we present a procedure to detect specific structures in an image given tags provided by multiple annotators and collected through a crowdsourcing methodology. The procedure consists of two stages based on the Expectation-Maximization (EM) algorithm, one for clustering and the other one for detection, and it gracefully combines data coming from annotators with unknown reliability in an unsupervised manner. An online implementation of the approach is also presented that is well suited to crowdsourced streaming data. Comprehensive experimental results with real data from the MalariaSpot project are also included.
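The idea of combining annotators of unknown reliability without supervision can be illustrated with a much simpler model than the two-stage procedure described above. The sketch below is an assumption-laden stand-in, not the paper's method: it uses a "one-coin" model in which each annotator is correct with a single unknown probability, works on binary labels rather than image tags, and the function name `em_binary_labels` and its arguments are hypothetical.

```python
def em_binary_labels(votes, n_annotators, iters=50):
    """EM for binary labels under a one-coin annotator model (illustrative).

    votes: one dict per item mapping annotator index -> 0 or 1.
    Returns (q, rel): q[i] is the posterior probability that item i is
    positive, rel[a] is the estimated probability that annotator a is correct.
    """
    eps = 1e-6
    # Initialise posteriors with the fraction of positive votes per item.
    q = [sum(v.values()) / len(v) for v in votes]
    rel = [0.5] * n_annotators
    for _ in range(iters):
        # M-step: expected accuracy of each annotator, and the class prior.
        for a in range(n_annotators):
            agree, count = 0.0, 0
            for qi, v in zip(q, votes):
                if a in v:
                    agree += qi if v[a] == 1 else 1.0 - qi
                    count += 1
            if count:
                rel[a] = min(max(agree / count, eps), 1.0 - eps)
        prior = min(max(sum(q) / len(q), eps), 1.0 - eps)
        # E-step: posterior of each item given the current reliabilities.
        for i, v in enumerate(votes):
            p1, p0 = prior, 1.0 - prior
            for a, label in v.items():
                p1 *= rel[a] if label == 1 else 1.0 - rel[a]
                p0 *= rel[a] if label == 0 else 1.0 - rel[a]
            q[i] = p1 / (p1 + p0)
    return q, rel
```

Starting from the vote fractions means the first iteration behaves like majority voting; later iterations down-weight annotators who disagree with the emerging consensus.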