Diabetic nephropathy (DN) is the main cause of end-stage renal disease (ESRD), and glomerular damage is one of its primary pathological changes. To reveal the gene expression alterations in the glomerulus involved in DN development, we screened the Gene Expression Omnibus (GEO) database up to December 2020. Eleven datasets covering gene expression in human DN glomeruli and their controls were downloaded for bioinformatics analysis. All expression data were extracted in R and cross-platform normalized with Shambhala. Differentially expressed genes (DEGs) were identified by Student's t-test with false discovery rate (FDR) correction (P < 0.05) and a fold change (FC) ≥ 1.5. DEGs were then analyzed with the Database for Annotation, Visualization, and Integrated Discovery (DAVID) for enrichment of Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. We constructed a protein-protein interaction (PPI) network of the DEGs to identify core genes, and used the digital cytometry software CIBERSORTx to analyze immune cell infiltration in DN. A total of 578 genes were identified as DEGs. Thirteen were identified as core genes, of which LYZ, LUM, and THBS2 have seldom been linked with DN. Based on the GO and KEGG enrichment results and the CIBERSORTx immune cell infiltration analysis, we hypothesize that a positive feedback loop may form among the glomerulus, platelets, and immune cells. This vicious cycle may damage the glomerulus persistently even after the initial damage from high glucose is removed. Studying the genes and pathways reported in this study may shed new light on DN pathogenesis.
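The DEG filter described above (FDR-corrected P < 0.05 together with FC ≥ 1.5) can be illustrated with a minimal Python sketch. It is not the authors' R pipeline. It assumes the per-gene t-test p-values and log2 fold changes are already computed, assumes the Benjamini-Hochberg procedure as the FDR correction (the abstract does not name one), and applies the fold-change cutoff in both directions. The function names and gene labels are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used as the FDR method; the abstract does not say which.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, p_values, log2_fc, alpha=0.05, min_fc=1.5):
    """Keep genes with adjusted p < alpha and |log2 FC| >= log2(min_fc)."""
    adjusted = benjamini_hochberg(p_values)
    threshold = math.log2(min_fc)
    return [
        gene
        for gene, q, lfc in zip(genes, adjusted, log2_fc)
        if q < alpha and abs(lfc) >= threshold
    ]


# Hypothetical input values, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
p_values = [0.001, 0.01, 0.02, 0.2, 0.5]
log2_fc = [1.0, 0.2, -2.0, 1.5, 3.0]
print(select_degs(genes, p_values, log2_fc))
```

Working on the log2 scale makes up- and down-regulation symmetric, so a gene halved in expression is treated the same way as a gene doubled.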
An attribute category clustering method based on hierarchical clustering is proposed as a contribution to intelligent analysis and processing technology for big data. The proposed model merges attribute categories with similar fault type distributions, reduces the data dimension, and binarizes the data. To address the large number of missing values in continuous data, a data completion method based on the attribute distribution function is adopted. From the perspective of selecting and estimating project unit prices in construction enterprises, this paper summarizes a data mining process suited to the characteristics of project cost data and puts forward a method for analyzing and processing project cost data based on a clustering algorithm. Finally, the processed data sets are subjected to bottom-up hierarchical clustering analysis to obtain the final analysis results. The experimental results show that the proposed preprocessing method based on attribute clustering can effectively merge attributes, reduce the dimension after binary transformation, and reduce the amount of data while preserving the information in the data. Summary: Hierarchical clustering is used to perform intelligent analysis of big data.
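The bottom-up (agglomerative) clustering step mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: records are assumed to be already binarized, and Hamming distance and single linkage are assumed because the abstract specifies neither. The function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def agglomerative(points, n_clusters):
    """Bottom-up clustering; returns clusters as lists of point indices.

    Assumptions: single linkage (closest pair of members) and Hamming
    distance on binarized records.
    """
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    hamming(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical binarized records, for illustration only.
records = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (0, 0, 0, 1)]
print(agglomerative(records, 2))
```

This naive version recomputes every pairwise distance on each merge, which is fine for a small example but too slow for large data sets. A library implementation that caches the distance matrix would be the practical choice there.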