Many multi-label learning algorithms have been developed in recent years to handle problems in which each instance is associated with multiple labels. Some approaches select label-specific features so that discriminative features can be used for multi-label classification. Although label correlation has been considered when learning label-specific features, the equally important correlation among instances has received less attention. In this paper, we propose a new approach called multi-label learning with label-specific features using correlation information (LSF-CI), which learns label-specific features for each label while considering correlation information in both the label space and the feature space. In LSF-CI, instance correlation in the feature space is computed by a probabilistic neighborhood graph model, and label correlation in the label space is computed by cosine similarity. For multi-label data, LSF-CI can select label-specific features for each label and classify an unseen instance into a set of relevant labels. To validate the effectiveness of LSF-CI, we conducted comprehensive experiments on eight multi-label datasets. The experimental results demonstrate that LSF-CI selects compact label-specific features and achieves competitive performance in comparison with existing multi-label learning approaches.
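The label-correlation step named in the abstract above (cosine similarity in label space) can be illustrated with a minimal sketch. It assumes a binary instance-by-label matrix `Y`. The function name `label_cosine_similarity` and the toy data are hypothetical and not taken from the paper. The probabilistic neighborhood graph used for instance correlation is not sketched here, because the abstract does not describe it.

```python
import numpy as np

def label_cosine_similarity(Y):
    """Cosine similarity between the columns (labels) of an n x q label matrix."""
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0          # labels that never occur: avoid division by zero
    Yn = Y / norms                   # each column scaled to unit length
    return Yn.T @ Yn                 # q x q matrix of pairwise cosines

# Toy data (assumed): 4 instances, 3 labels
Y = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [1, 1, 0]])
C = label_cosine_similarity(Y)
print(np.round(C, 3))
```

Entry `C[j, k]` is large when labels `j` and `k` tend to be assigned to the same instances, which is the kind of signal a label-correlation term could draw on.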
As medical informatics develops, health data is being collected continually. However, owing to the diversity of its categories and sources, medical data in many hospitals has become so complicated that a clinical decision support (CDS) system is needed to manage it. To utilize the accumulating health data effectively, we propose a CDS framework that integrates heterogeneous health data from different sources, such as laboratory test results, basic patient information, and health records, into a consolidated feature representation for all patients. Using the electronic health data so created, multilabel classification is employed to recommend a list of diseases and thus help physicians diagnose or treat their patients' health issues more efficiently. Once the physician diagnoses a patient's disease, the next step is to consider the likely complications of that disease, which can lead to further diseases. Previous studies reveal that correlations exist among some diseases. To account for these correlations, the k-nearest neighbors algorithm is improved for multilabel learning by using correlations among labels (CML-kNN). The CML-kNN algorithm first exploits the dependence between every pair of labels to update the original label matrix and then performs multilabel learning to estimate label probabilities from the integrated features. Finally, it recommends the top N diseases to the physician. Experimental results on real health data establish the effectiveness and practicability of the proposed CDS framework.
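The three stages described above (update the label matrix from pairwise label dependence, estimate label probabilities with k-nearest neighbors, recommend the top N labels) can be sketched roughly as follows. The abstract does not give the actual update rule or probability estimate, so everything specific here is an assumption: dependence is taken as the conditional co-occurrence frequency, the update is a simple `alpha`-weighted softening, and scores are plain neighbor averages. All function names and the toy data are hypothetical.

```python
import numpy as np

def label_dependence(Y):
    """P[i, j]: fraction of instances carrying label i that also carry label j (assumed measure)."""
    Y = np.asarray(Y, dtype=float)
    co = Y.T @ Y                          # pairwise co-occurrence counts
    counts = np.diag(co).copy()
    counts[counts == 0] = 1.0             # unused labels: avoid division by zero
    return co / counts[:, None]

def soften_labels(Y, P, alpha=0.5):
    """Assumed update: an absent label j gets alpha * (strongest P[i, j] over present labels i)."""
    Y = np.asarray(Y, dtype=float)
    spread = (Y[:, :, None] * P[None, :, :]).max(axis=1)
    return np.maximum(Y, alpha * spread)

def knn_label_scores(X_train, Y_soft, x, k=3):
    """Average the softened label rows of the k nearest training instances (Euclidean distance)."""
    d = np.linalg.norm(np.asarray(X_train, dtype=float) - x, axis=1)
    nearest = np.argsort(d, kind="stable")[:k]
    return Y_soft[nearest].mean(axis=0)

def top_n(scores, n):
    """Indices of the n highest-scoring labels, best first."""
    return np.argsort(-scores, kind="stable")[:n]

# Toy data (assumed): 4 patients, 2 features, 3 disease labels
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
Y = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]])

Y_soft = soften_labels(Y, label_dependence(Y), alpha=0.5)
scores = knn_label_scores(X, Y_soft, np.array([0.0, 0.4]), k=2)
recommended = top_n(scores, 2)
print(scores, recommended)
```

In this sketch a label that is absent for a training instance still receives partial credit when a strongly dependent label is present, so correlated diseases can surface in the top-N list even when they were not recorded for the nearest neighbors.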
Social tag information has been used by recommender systems to handle the problem of data sparsity. Most recent tag-induced recommendation methods consider the relationships between users or items and tags. However, sparse tag information remains challenging for most existing methods. In this paper, we propose an Extended-Tag-Induced Matrix Factorization technique for recommender systems, which exploits correlations among tags, derived from tag co-occurrence, to improve recommendation performance even when tag information is sparse. The proposed method uses the coupled similarity between tags, calculated from the co-occurrences of tags on the same items, to extend each item's tags. Item similarity based on the extended tags is then used as an item relationship regularization term to constrain the matrix factorization process. The MovieLens and Book-Crossing datasets are used to evaluate the performance of the proposed algorithm. The experimental results show that the proposed method can alleviate the impact of tag sparsity and improve the performance of recommender systems.
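The pipeline described above (tag similarity from co-occurrence, tag extension, item similarity, regularization term) can be sketched as follows. The abstract does not define the paper's coupled similarity, so this sketch swaps in a plain Jaccard co-occurrence similarity. The extension rule, the graph-style penalty, all function names, and the toy data are likewise assumptions for illustration only.

```python
import numpy as np

def tag_similarity(T):
    """Jaccard similarity between tags, from co-occurrence on the same items (stand-in measure)."""
    T = np.asarray(T, dtype=float)
    co = T.T @ T                                   # items shared by each tag pair
    counts = np.diag(co)
    union = counts[:, None] + counts[None, :] - co
    return np.divide(co, union, out=np.zeros_like(co), where=union > 0)

def extend_tags(T, S):
    """Assumed rule: weight of tag j on an item = strongest similarity to any tag the item has."""
    T = np.asarray(T, dtype=float)
    return (T[:, :, None] * S[None, :, :]).max(axis=1)

def item_similarity(E):
    """Cosine similarity between items over their extended tag vectors."""
    norms = np.linalg.norm(E, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                        # untagged items: avoid division by zero
    En = E / norms
    return En @ En.T

def item_regularizer(Q, sim):
    """Assumed penalty: sum over item pairs of sim[i, j] * ||Q[i] - Q[j]||^2."""
    diff = Q[:, None, :] - Q[None, :, :]
    return float((sim * (diff ** 2).sum(axis=2)).sum())

# Toy data (assumed): 3 items, 3 tags; items 1 and 2 share no tag directly
T = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 0]])
S = tag_similarity(T)
E = extend_tags(T, S)
sim = item_similarity(E)

Q = np.random.default_rng(0).normal(size=(3, 2))   # placeholder item latent factors
penalty = item_regularizer(Q, sim)
print(np.round(sim, 3), penalty)
```

Items 1 and 2 have no tag in common, yet their extended tag vectors overlap, so they receive a nonzero similarity. Adding a penalty of this form to a matrix factorization loss would pull the latent factors of such items toward each other.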