Abstract: Background: Uncovering the cellular roles of a protein is a task of tremendous importance and complexity that requires dedicated experimental work and, often, sophisticated data mining and processing tools. A protein's functions, often referred to as its annotations, are believed to manifest themselves through the topology of networks of inter-protein interactions. In particular, there is a growing body of evidence that proteins performing the same function are more likely to interact with each other than with p…
“…All of them were created by manual curation and therefore have limited scope and usability. The results of information extraction by manual curation and by NLP technology were compared previously [3,4]. In brief, they concluded that NLP provides significantly higher recovery rate and consistency of extracted data while manually curated dataset usually provides higher accuracy.…”
Integrated literature-derived protein and chemical knowledge bases can rationalize many aspects of the drug development process, including drug repositioning and biomarker design.
“…Therefore, biomedical researchers always wish to explore the pathogenesis of abnormal expressing genes, such as those genes confirmed by cDNA microarray analysis, and furthermore, literature mining is an important method for completing them [7]. At present, there have been a variety of mining tools and methods to extract keywords from literatures or gene networks from functional genes and to make them visualized [8][9][10][11][12][13][14][15]. However, based on specific keywords, these tools cannot automatically and simply establish gene networks that are helpful to explore disease-related signal transduction pathways.…”
Tumor metastasis is the leading cause of death in gastric cancer and the main reason clinical treatment fails. To find metastasis-related genes and abnormal signal transduction pathways in highly invasive gastric cancer, samples of gastric cancer with liver metastasis were collected for microarray detection, and genes that were up-regulated or down-regulated in all three cases were screened out. From these preliminarily screened genes, molecular pathways that may affect liver metastasis from gastric cancer were then investigated with the Gene Cluster with Literature Profiles (GenCLiP) analysis software. Many biological effects, including apoptosis, have been validated. Functional analysis of the differentially expressed genes revealed that a variety of biological pathways, such as blood circulation and gas exchange, regulation of vasodilation and vasoconstriction, and immune defense, could be significantly activated. In addition, gene sequences, specific keywords, and gene regulatory networks were searched with GenCLiP. We conclude that data mining allows rapid identification of a series of specific signal transduction pathways involving abnormally expressed genes.
“…Option " " is the value to conduct the inflation and expansion in the Markov model; thus, it is possible to see the changes of elements inside the clustering depending on the value of . Another function prediction method is the context-free method; it utilizes the context of several DBs of Uniprot or PubMed [15].…”
Section: Mathematical Problems In Engineering (mentioning)
Recently, there has been growing interest in sequence analysis. In particular, the next generation sequencing (NGS) technique fragments the base sequence and analyzes its functions. Its essential role is to arrange pieces of the base sequence together based on sequencing and to define their functions. Organizing unarranged sequence fragments is an active research area, and automatic definition of gene function is a popular research topic. Previous studies of automatic gene function definition have mainly defined protein functions by using base-sequence similarity, publicly available databases, protein interactions, or context-free methods. This study aims to predict the category of proteins whose function has not been defined, after learning automatically with GO by extracting the characteristics of the proteins inside each cluster. Clustering uses protein interactions generated from base-sequence similarity, under the assumption that proteins inside the same cluster have similar functions. The proposed method finds the option value that gives the best GO prediction, where functions are classified using the IPR and keywords inside the same cluster as the unique features, and reports the optimized result for that option.
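The clustering step described above, together with the inflation and expansion mentioned in the citation statement, resembles Markov clustering on a protein interaction graph. The following is only a minimal pure-Python sketch under that assumption: the function names, the default `inflation` value, the convergence tolerance, and the toy graph are all hypothetical and are not taken from the paper, whose actual tool and option values are not given here.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix, i.e. take two random-walk steps."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def markov_cluster(adjacency, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix.

    `inflation` is the assumed tuning option: larger values tend to
    produce more, smaller clusters.
    """
    n = len(adjacency)
    # Add self-loops, a common preprocessing step for this algorithm.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row with non-negligible mass lists the members of one cluster.
    found = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            found.add(members)
    return sorted(sorted(c) for c in found)


# Toy interaction graph (hypothetical): two separate triangles of proteins.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = markov_cluster(two_triangles)
```

On this toy graph `clusters` holds the two triangles as separate groups. In a function prediction setting, the annotated proteins in each group could then supply candidate GO categories for the unannotated ones, and rerunning with different `inflation` values shows how cluster membership shifts with the option.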