The heart is an essential organ in the human body. It contains various types of cells, such as cardiomyocytes, mesothelial cells, endothelial cells, and fibroblasts. The interactions between these cells determine the vital functions of the heart. Therefore, identifying the different cell types and revealing the expression rules in these cell types are crucial. In this study, multiple machine learning methods were used to analyze the heart single-cell profiles with 11 different heart cell types. The single-cell profiles were first analyzed via light gradient boosting machine method to evaluate the importance of gene features on the profiling dataset, and a ranking feature list was produced. This feature list was then brought into the incremental feature selection method to identify the best features and build the optimal classifiers. The results suggested that the best decision tree (DT) and random forest classification models achieved the highest weighted F1 scores of 0.957 and 0.981, respectively. The selected features, such as NPPA, LAMA2, DLC1, and the classification rules extracted from the optimal DT classifier played a crucial role in cardiac structure and function in recent research and enrichment analysis. In particular, some lncRNAs (LINC02019, NEAT1) were found to be quite important for the recognition of different cardiac cell types. In summary, these findings provide a solid academic foundation for the development of molecular diagnostics and biomarker discovery for cardiac diseases.
Atopic dermatitis and psoriasis are members of a family of inflammatory skin disorders. Cellular immune responses in skin tissues contribute to the development of these diseases. However, their underlying immune mechanisms remain to be fully elucidated. We developed a computational pipeline for analyzing the single-cell RNA-sequencing profiles of the Human Cell Atlas skin dataset to investigate the pathological mechanisms of skin diseases. First, we applied the maximum relevance criterion and the Boruta feature selection method to exclude irrelevant gene features from the single-cell gene expression profiles of inflammatory skin disease samples and healthy controls. The retained gene features were ranked by using the Monte Carlo feature selection method on the basis of their importance, and a feature list was compiled. This list was then introduced into the incremental feature selection method that combined the decision tree and random forest algorithms to extract important cell markers and thus build excellent classifiers and decision rules. These cell markers and their expression patterns have been analyzed and validated in recent studies and are potential therapeutic and diagnostic targets for skin diseases because their expression affects the pathogenesis of inflammatory skin diseases.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.