One way to understand the Parkinson’s disease (PD) population is to investigate the similarities and differences among patients through cluster analysis, which may lead to defined patient subgroups for diagnosis, progression tracking, and treatment planning. This paper provides a systematic review of PD patient clustering research, evaluating the variables included in clustering, the clustering methods applied, the resulting patient subgroups, and the evaluation metrics. A search of the PubMed database was conducted for 1999 to 2021, using search terms including Parkinson’s disease, cluster, and analysis. The majority of studies clustered on a variety of clinical scale scores, many of which provide a numerical but ordinal, categorical value. Even though the scale scores are ordinal, they were treated as numerical values, and numerical and continuous values were the focus of the clustering. Limited attention was given to categorical variables, such as gender and family history, which may also provide useful insights into disease diagnosis, progression, and treatment. The results pointed to two to five patient clusters, with similarities in age of onset and disease duration. The studies lacked the use of existing clustering evaluation metrics, which points to a need for a thorough analysis framework and for consensus on the appropriate variables to include in cluster analysis. Accurate cluster analysis may assist with determining whether PD patients’ symptoms can be treated based on a subgroup of features, whether personalized care is required, or whether a mix of individualized and group-based care is the best approach.
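As an illustration of the kind of existing clustering evaluation metric the review refers to, the sketch below computes the mean silhouette coefficient in plain Python. The abstract does not name a specific metric, so the choice of silhouette, the Euclidean distance, and the toy points are all assumptions made for illustration. Euclidean distance also treats every feature as numeric, which is exactly the treatment the review questions for ordinal scale scores.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def mean_silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two distinct labels."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        same = [euclidean(p, q)
                for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in clusters if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical two-feature points, not patient data.
POINTS = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = mean_silhouette(POINTS, [0, 0, 1, 1])  # well-separated grouping
bad = mean_silhouette(POINTS, [0, 1, 0, 1])   # grouping that mixes the two sides
```

A value near 1 suggests compact, well-separated clusters, while values near or below 0 suggest that points sit closer to another cluster than to their own.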
Parkinson’s disease (PD) is the second most common neurodegenerative disorder. It is a chronic, disabling, and progressive disease, and no treatment stops its progression. Rating scales are utilized to quantify PD progression and severity. The most conventional scale is the Unified Parkinson’s Disease Rating Scale (UPDRS) and its modified version, the Movement Disorder Society (MDS) UPDRS. An analytical investigation into the use and meaning of these clinical scale scores was conducted to determine whether gaps exist in quantifying disease progression and severity. A series of discrepancies was identified, including confusion among patients regarding the meaning of the scores and misuse of the scores among clinicians and researchers to define disease progression. The scales are of an ordinal type, and hence the resulting scores are ordinal: they provide neither a quantifiable progression nor a severity level, but a categorical value and survey total. That the scores are ordinal and the scales subjective is mentioned in very few publications, and there only as a brief introductory remark rather than as the focus of the paper; a thoroughly researched analytical investigation into the scales and scores has not been found. Therefore, the continued misunderstanding and misuse of these scales and their resulting scores warrant a comprehensive assessment and evaluation to identify the gaps.
Data mining is a technique for analyzing large amounts of data in various formats, often called Big Data, in order to gain knowledge from them. The healthcare industry is the next Big Data area of interest because the large variability in its patients, their health status, and their records, which can include image scans, graphical test results, and handwritten physician notes, remains untapped for analysis. In addition to data mining, there is a newer analysis method called process mining. Process mining is similar to data mining in that large data files are reviewed and analyzed, but in this case the files are event logs specific to a particular process or series of processes. Process mining allows one to understand the initial baseline, determine any bottlenecks or resource constraints, and evaluate a recently implemented change. Process mining was conducted on a hospital event log of patients entering the emergency room with sepsis, to better understand this newer analysis method, to highlight the information discovered, and to determine its role alongside data mining. Not only did the analysis of the event logs provide process mapping and process analysis, but it also highlighted areas of the clinical operations in need of further investigation, including a possible relationship between patient re-admission and release method. In addition, the data mining method of creating a histogram was applied to the process data, showing that data mining and process mining can be used in a complementary manner.
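To make the idea of analyzing an event log concrete, the sketch below counts which activity directly follows which within each case and averages the waiting time for each such pair, a basic step behind process mapping and bottleneck detection. The log rows, activity names, and the function `directly_follows` are hypothetical and are not taken from the sepsis event log or from the tooling used in the study.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log rows: (case id, activity, timestamp).
EVENT_LOG = [
    ("p1", "ER Registration", "2024-01-01 08:00"),
    ("p1", "ER Triage", "2024-01-01 08:10"),
    ("p1", "IV Antibiotics", "2024-01-01 09:10"),
    ("p1", "Release", "2024-01-03 12:00"),
    ("p2", "ER Registration", "2024-01-01 09:00"),
    ("p2", "ER Triage", "2024-01-01 09:20"),
    ("p2", "Release", "2024-01-01 15:00"),
]


def directly_follows(log):
    """Return (pair counts, mean wait in minutes) for consecutive activities."""
    traces = defaultdict(list)
    for case, activity, stamp in log:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        traces[case].append((when, activity))

    counts = Counter()
    waits = defaultdict(list)
    for events in traces.values():
        events.sort()  # order each case's events by timestamp
        for (t0, a), (t1, b) in zip(events, events[1:]):
            counts[(a, b)] += 1
            waits[(a, b)].append((t1 - t0).total_seconds() / 60)

    mean_wait = {pair: sum(v) / len(v) for pair, v in waits.items()}
    return counts, mean_wait


counts, mean_wait = directly_follows(EVENT_LOG)
# The pair with the longest mean wait is a candidate bottleneck to inspect.
slowest = max(mean_wait, key=mean_wait.get)
```

The waiting times collected per pair are also the kind of process data that a histogram, as mentioned above, could summarize.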
Parkinson’s disease (PD) is a chronic disease. No treatment stops its progression, and it presents symptoms in multiple areas. One way to understand the PD population is to investigate the clustering of patients by demographic and clinical similarities. Previous PD cluster studies included scores from clinical surveys, which provide a numerical but ordinal, non-linear value. In addition, these studies did not include categorical variables, as the clustering method utilized was not applicable to categorical variables. It was discovered that the numerical values of patient age and disease duration were similar among past cluster results, pointing to the need to exclude these values. This paper proposes a novel and automatic discovery method to cluster PD patients by incorporating categorical variables. No estimate of the number of clusters is required as input, whereas previous clustering methods require the end user to supply a guess before the method can start. A patient dataset from the Parkinson’s Progression Markers Initiative (PPMI) website was used to demonstrate the new clustering technique, and the results showed that the method provided an accurate separation of the patients. In addition, the method provides an explainable process and an easy way to interpret clusters and describe patient subtypes.
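The abstract does not describe how the proposed method handles categorical variables, so the following is not that method. It is a minimal sketch of one common alternative, a Gower-style dissimilarity, which shows how categorical and numeric variables can share a single distance scale. The field names, values, and ranges are hypothetical.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower-style dissimilarity between two records with the same keys.

    Numeric fields (those listed in numeric_ranges) contribute their
    absolute difference scaled by the field's range; all other fields are
    treated as categorical and contribute 0 on a match, 1 on a mismatch.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            span = numeric_ranges[key]
            total += abs(a[key] - b[key]) / span if span else 0.0
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


# Hypothetical patient records, not PPMI data.
patient_1 = {"sex": "F", "family_history": "yes", "tremor_score": 2}
patient_2 = {"sex": "M", "family_history": "yes", "tremor_score": 6}
RANGES = {"tremor_score": 8}  # assumed max minus min of the numeric field

d = gower_distance(patient_1, patient_2, RANGES)
```

A pairwise matrix of such dissimilarities could then be passed to any clustering method that accepts precomputed distances. Note that scaling an ordinal score by its range still treats it as numeric, which is the practice the abstracts above caution against.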