“…Then, an adopted Daugman Iris Recognition algorithm is implemented and complex Gabor response is obtained [7]. Zhang et al first used traditional supervised learning method to construct a bipartite graph [8]. The relationship between test image and training images is used to construct the ranking score and contribution score, and the final classification result is gained.…”
This paper proposes a novel lung nodule classification method for low-dose CT images. The method has two stages. First, a Local Difference Pattern (LDP) descriptor is proposed to encode the feature representation; it is extracted by comparing intensity differences along circular regions centered at the lung nodule. Then, a single-center classifier is trained on the LDP features. Because of the diversity of the feature distributions of the different classes, the training images are further clustered into multiple cores and a multicenter classifier is constructed. The two classifiers are combined to make the final decision. Experimental results on a public dataset show the superior performance of LDP and the combined classifier.
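The abstract does not give the exact LDP formulation, so the following is only a rough sketch of the general idea of comparing intensities along circles centered at a nodule. The function names (`ring_means`, `local_difference_descriptor`), the choice of radii, the number of sample points, and the use of ring means are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def ring_means(patch, center, radii, n_points=16):
    """Mean intensity sampled on circles of the given radii around `center`.

    Hypothetical helper: nearest-pixel sampling, clipped to the patch bounds.
    """
    cy, cx = center
    angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    means = []
    for r in radii:
        ys = np.rint(cy + r * np.sin(angles)).astype(int)
        xs = np.rint(cx + r * np.cos(angles)).astype(int)
        ys = np.clip(ys, 0, patch.shape[0] - 1)
        xs = np.clip(xs, 0, patch.shape[1] - 1)
        means.append(patch[ys, xs].mean())
    return np.array(means)


def local_difference_descriptor(patch, center, radii=(1, 2, 3, 4), n_points=16):
    """Differences between successive intensity levels, moving outward.

    Assumed simplification: level 0 is the center pixel, level k is the mean
    intensity on the k-th circle; the descriptor is the vector of differences
    between consecutive levels.
    """
    levels = np.concatenate(([float(patch[center])],
                             ring_means(patch, center, radii, n_points)))
    return np.diff(levels)


# Toy patches standing in for a nodule-centered CT crop (synthetic data).
flat_patch = np.full((11, 11), 5.0)
yy, xx = np.mgrid[0:11, 0:11]
radial_patch = np.hypot(yy - 5, xx - 5)  # intensity grows with distance

flat_desc = local_difference_descriptor(flat_patch, (5, 5))
radial_desc = local_difference_descriptor(radial_patch, (5, 5))
```

On the uniform patch every difference is zero, while on the radially increasing patch every difference is positive, which is the kind of contrast between the nodule center and its surroundings that such a descriptor is meant to capture.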
“…To capture the time aspect of the data, Liu et al However, different from our case, none of these methods considered an "unknown" class and they all have predefined instances for all classes, either by experts [60,46,124,73,138,84] or via other mechanisms [97]. In addition, unlike our approach, all the mentioned graph-based SSL methods used homogeneous graphs.…”
“…Semi-Supervised Learning [144] (SSL) that learns from both labeled and unlabeled data has attracted increasing attention in healthcare applications based on EHRs [73,60,97,46,124,61,84,138]. PU learning can be seen as a special case of SSL.…”
“…Their method has better classification accuracy than ANN, SVM, and standard SSL classifiers on most of the datasets. Zhang et al [138] proposed a ranking-based approach for lung nodule image classification. Their algorithm first constructed a bipartite graph that captures the relationship between the labeled and unlabeled instances.…”
“…They either have expert-defined low-risk or control classes [47,121,90,125,75,96] or simply treat non-positive cases as negative [113,22,102]. Methods that consider unlabeled data [106,73,60,97,46,124,61,84,138] are generally based on Semi-Supervised Learning (SSL) [144] that learns from both labeled and unlabeled data. Amongst these SSL methods, only a few [97,46] large unlabeled data, SHG-Health features a semi-supervised learning method that utilizes both labeled and unlabeled instances.…”
The "big data" challenge is changing the way we acquire, store, analyse, and draw conclusions from data. How we effectively and efficiently "mine" the data from possibly multiple sources and extract useful information is a critical question. Increasing research attention has been drawn to healthcare data mining, with an ultimate goal to improve the quality of care. The human body is complex and so too the data collected in treating it. Data noise that is often introduced via the collection process makes building data mining models a challenging task. This thesis focuses on the classification tasks of mining healthcare data, with the goal of improving the effectiveness of health risk prediction. In particular, we developed algorithms to address issues identified from real healthcare data, such as feature extraction, heterogeneity, label uncertainty, and large unlabeled data. The three main contributions of this research are as follows. First, we developed a new health index called Personal Health Index (PHI) that scores a person's health status based on the examination records of a given population. Second, we identified the key characteristics of the real datasets and issues that were associated with the data. Third, we developed classification algorithms to cope with those issues, particularly, the label uncertainty and large unlabeled data issues. This research takes one step forward towards scoring personal health based on mining increasingly large health records. Particularly, it pioneers exploring the mining of GHE data and tackles the associated challenges. It is our anticipation that in the near future, more robust data-mining-based health scoring systems will be available for healthcare professionals to understand people's health status and thus improve the quality of care.