* Background: In the search for therapeutic peptides for disease treatments, many efforts have been made to identify various functional peptides from large numbers of peptide sequence databases. In this paper, we propose an effective computational model that uses deep learning and word2vec to predict therapeutic peptides (PTPD).
* Results: Representation vectors of all k-mers were obtained through word2vec based on k-mer co-existence information. The original peptide sequences were then divided into k-mers using the windowing method. The peptide sequences were mapped to the input layer by the embedding vector obtained by word2vec. Three types of filters in the convolutional layers, as well as dropout and max-pooling operations, were applied to construct feature maps. These feature maps were concatenated into a fully connected dense layer, and rectified linear units (ReLU) and dropout operations were included to avoid over-fitting of PTPD. The classification probabilities were generated by a sigmoid function. PTPD was then validated using two datasets: an independent anticancer peptide dataset and a virulent protein dataset, on which it achieved accuracies of 96% and 94%, respectively.
* Conclusions: PTPD identified novel therapeutic peptides efficiently, and it is suitable for application as a useful tool in therapeutic peptide design.
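The windowing step described in the Results can be illustrated with a minimal sketch. The abstract does not give the value of k, the stride, or any function names, so `split_into_kmers`, `build_vocabulary`, `encode`, the default `k=3`, and the use of index 0 for padding and unseen k-mers are all assumptions made for illustration, not the authors' implementation. In the full model, the integer indices would be replaced by word2vec embedding vectors before the convolutional layers.

```python
from typing import Dict, Iterable, List, Optional


def split_into_kmers(sequence: str, k: int = 3, stride: int = 1) -> List[str]:
    """Slide a window of width k over the sequence (k and stride are assumed values)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def build_vocabulary(sequences: Iterable[str], k: int = 3) -> Dict[str, int]:
    """Assign each distinct k-mer an integer index, starting at 1."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for kmer in split_into_kmers(seq, k):
            if kmer not in vocab:
                vocab[kmer] = len(vocab) + 1  # index 0 is reserved (assumption)
    return vocab


def encode(sequence: str, vocab: Dict[str, int], k: int = 3,
           max_len: Optional[int] = None) -> List[int]:
    """Map a peptide to k-mer indices; unseen k-mers and padding both use 0."""
    ids = [vocab.get(kmer, 0) for kmer in split_into_kmers(sequence, k)]
    if max_len is not None:
        ids = ids[:max_len] + [0] * max(0, max_len - len(ids))
    return ids


if __name__ == "__main__":
    peptides = ["ACDEFG", "CDEFGH"]  # toy sequences, not from the paper
    vocab = build_vocabulary(peptides, k=3)
    print(split_into_kmers("ACDEFG", k=3))
    print(encode("ACDE", vocab, k=3, max_len=4))
```

A fixed `max_len` is used here because convolutional layers typically expect inputs of equal length; whether PTPD pads or truncates this way is not stated in the abstract.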
Cancer is a highly heterogeneous disease caused by dysregulation in different cell types and tissues. However, different cancers may share common mechanisms. It is critical to identify decisive genes involved in the development and progression of cancer, and joint analysis of multiple cancers may help to discover overlapping mechanisms among different cancers. In this study, we proposed a fusion feature selection framework based on an ensemble method, named Fisher score and Gradient Boosting Decision Tree (FS–GBDT), to select robust and decisive feature genes in high-dimensional gene expression datasets. Joint analysis of 11 human cancer types was conducted to explore the key feature gene subset of cancer. To verify the efficacy of FS–GBDT, we compared it with four other common feature selection algorithms using a Support Vector Machine (SVM) classifier. FS–GBDT achieved the highest values on the evaluation indicators, outperforming the other four methods. In addition, we performed gene ontology analysis and literature validation of the key gene subset, and the subset was classified into several functional modules. Functional modules can be used as disease markers in place of single genes, which are difficult to reproduce across gene chip applications, and to study the core mechanisms of cancer.
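As a rough illustration of the Fisher score half of FS–GBDT, the sketch below computes the commonly used Fisher score for each feature: the class-size-weighted squared distance between class means and the overall mean, divided by the class-size-weighted within-class variance. The abstract does not state which variant of the score is used or how it is fused with GBDT importances, so the formula, the function names `fisher_scores` and `top_features`, and the toy data are assumptions; the GBDT stage is not shown.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def fisher_scores(X: Sequence[Sequence[float]], y: Sequence[Hashable]) -> List[float]:
    """Fisher score per feature (column of X) given class labels y."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # weighted spread of class means around the overall mean
        within = 0.0   # weighted within-class variance
        for rows in by_class.values():
            values = [r[j] for r in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_features(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


if __name__ == "__main__":
    # Toy expression matrix: 4 samples x 2 genes, two classes (not real data).
    X = [[1.0, 5.0], [2.0, 4.0], [9.0, 5.0], [10.0, 4.0]]
    y = [0, 0, 1, 1]
    scores = fisher_scores(X, y)
    print(scores, top_features(scores, k=1))
```

In a filter-then-model pipeline of this kind, a score like this would typically prune the gene set before a tree ensemble ranks the survivors, but the exact fusion rule would need to be taken from the paper itself.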
There is emerging evidence of an association between epigenetic modifications, glycemic control and atherosclerosis risk. In this study, we mapped genome-wide epigenetic changes in patients with type 2 diabetes (T2D) and advanced atherosclerotic disease. We performed chromatin immunoprecipitation sequencing (ChIP-seq) using a histone 3 lysine 9 acetylation (H3K9ac) mark in peripheral blood mononuclear cells from patients with atherosclerosis with T2D (n = 8) or without T2D (ND, n = 10). We mapped epigenome changes and identified 23,394 and 13,133 peaks in ND and T2D individuals, respectively. Out of all the peaks, 753 domains near the transcription start site (TSS) were unique to T2D. We found that T2D in atherosclerosis leads to an H3K9ac increase in 118 genomic regions and a loss in 63. Furthermore, we discovered an association between the genomic locations of significant H3K9ac changes and genetic variants identified in previous T2D GWAS. The transcription factor 7-like 2 (TCF7L2) rs7903146 variant, together with several human leukocyte antigen (HLA) variants, was among the domains with the most dramatic changes in H3K9ac enrichment. Pathway analysis revealed multiple activated pathways involved in immunity, including type 1 diabetes. Our results present novel evidence on the interaction between genetics and epigenetics, as well as epigenetic changes related to immunity in patients with T2D and advanced atherosclerotic disease.