Lung cancer, also called a lung tumor, is the uncontrolled growth of cells in lung tissue. It is curable in its early stage, but identifying it at that stage is very difficult. In recent decades, researchers have shown great interest in gene-level lung cancer identification using the shortest paths between lung cancer-related genes. Much research has been done on identifying the shortest path between genes, but conventional methods take considerable time to process the data. In this research, a Protein-Protein Interaction (PPI) network is constructed from the weighted proteins in the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. To identify the shortest path between genes in the PPI network, an enhanced Floyd-Warshall algorithm is proposed. The Floyd-Warshall algorithm is effective at finding the shortest path between genes and solves the all-pairs shortest path problem. Its major drawback is that it runs slower than other conventional algorithms designed for the same task. To improve the performance of the traditional Floyd-Warshall algorithm, an iterative matrix is used to eliminate invalid paths. The proposed method is then compared with existing systems in the experimental results, which show that the proposed approach reduced processing time by up to 2-3 sec compared with the existing methods: Dijkstra's algorithm and the Floyd-Warshall algorithm.
Keywords: Dijkstra's algorithm, Enhanced Floyd-Warshall algorithm, Protein-protein interaction, Search tool for the retrieval of interacting genes/proteins.
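To make the all-pairs shortest path step concrete, the following is a minimal sketch of the standard Floyd-Warshall algorithm on a small undirected weighted graph. It is not the enhanced variant from the abstract: the iterative-matrix pruning is not described in enough detail to reproduce, so the only shortcut shown is skipping rows with no path to the intermediate node. The node names, the edge costs, and the idea that an interaction score has already been converted into a cost (lower means closer) are all assumptions made for illustration.

```python
import math

def floyd_warshall(nodes, edges):
    """Return (dist, nxt) for all node pairs of an undirected weighted graph."""
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for u, v, cost in edges:
        if cost < dist[u][v]:  # keep the cheapest edge if duplicates exist
            dist[u][v] = dist[v][u] = cost
            nxt[u][v], nxt[v][u] = v, u
    for k in nodes:  # allow k as an intermediate node
        for i in nodes:
            if dist[i][k] == math.inf:
                continue  # no path i -> k, so k cannot improve any i -> j
            for j in nodes:
                candidate = dist[i][k] + dist[k][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
                    nxt[i][j] = nxt[i][k]
    return dist, nxt

def shortest_path(nxt, src, dst):
    """Rebuild the node sequence from src to dst, or [] if unreachable."""
    if src == dst:
        return [src]
    if nxt[src][dst] is None:
        return []
    route = [src]
    while src != dst:
        src = nxt[src][dst]
        route.append(src)
    return route

# Hypothetical toy network: placeholder node names and made-up costs.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 1), ("B", "C", 2), ("A", "C", 5), ("C", "D", 1)]
dist, nxt = floyd_warshall(nodes, edges)
route = shortest_path(nxt, "A", "D")
```

The triple loop makes the running time cubic in the number of nodes, which is the cost the abstract's enhancement aims to reduce.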
Aim and Background:
In recent years, micro-array data analysis using soft computing and machine learning techniques has gained interest among researchers as a way to detect prostate cancer. Because micro-array data combine a small sample size with a large number of attributes, traditional machine learning techniques have difficulty detecting prostate cancer.
Methodology:
Selecting relevant genes exploits the useful information in micro-array data, which enhances detection accuracy. In this research, the samples are acquired from the Gene Expression Omnibus (GEO) database, specifically the prostate cancer datasets with GEO IDs GSE 21034, GSE 15484 and GSE 3325/GSE 3998. In addition, an ensemble feature optimization technique and a Bidirectional Long Short Term Memory (Bi-LSTM) network are employed to detect prostate cancer from the gene expression micro-array data.
Results:
The ensemble feature optimization technique includes 4 metaheuristic optimizers that select the top 2000 genes relevant to prostate cancer from each GEO dataset. Next, the selected genes are given to the Bi-LSTM network to classify subjects as normal or prostate cancer.
Conclusion:
The simulation analysis revealed that the ensemble-based Bi-LSTM network obtained accuracies of 99.13%, 98.97%, and 94.12% on the GEO datasets GSE 3325/GSE 3998, GSE 21034, and GSE 15484, respectively.
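The abstract does not say how the outputs of the 4 optimizers are combined, so the following is only a sketch of one plausible scheme: each selector contributes its top-ranked genes, and a gene is kept if enough selectors agree on it. The function name `ensemble_select`, the `min_votes` threshold, and the placeholder gene labels are hypothetical and not taken from the source.

```python
from collections import Counter

def ensemble_select(rankings, top_k, min_votes=1):
    """Combine several ranked gene lists by voting.

    rankings  : list of gene lists, each ordered best-first by one selector
    top_k     : how many genes each selector contributes
    min_votes : how many selectors must agree before a gene is kept (assumed rule)
    """
    votes = Counter()
    for ranked_genes in rankings:
        # Each selector votes once for every gene in its own top_k.
        votes.update(ranked_genes[:top_k])
    return sorted(gene for gene, count in votes.items() if count >= min_votes)

# Placeholder rankings from four hypothetical selectors.
rankings = [
    ["g1", "g2", "g3"],
    ["g2", "g3", "g4"],
    ["g3", "g1", "g5"],
    ["g2", "g5", "g1"],
]
selected = ensemble_select(rankings, top_k=2, min_votes=2)
```

With `min_votes=1` the result is the union of all top-k lists. Raising it keeps only genes that several selectors agree on, which gives the downstream classifier a smaller input.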
Prostate Cancer (PC) is a leading cause of mortality among males, so an effective system is required for identifying sensitive bio-markers for early recognition. The objective of this research is to find potential bio-markers for characterizing the different types of PC. In this article, the PC-related genes are acquired from the Gene Expression Omnibus (GEO) database. Gene selection is then performed using an enhanced Particle Swarm Optimization (PSO) algorithm to select the active genes related to PC. In the enhanced PSO algorithm, the interval Newton approach is included to keep the search space adaptive by varying the swarm diversity, which significantly improves the local search. The selected active genes are fed to a random forest classifier, which classifies PC as high-risk or low-risk. In the experimental investigation, the proposed model achieved an overall classification accuracy of 96.71%, which is better than traditional models such as naïve Bayes, support vector machine and neural network.