The genomic sequencing of hundreds of organisms, including Homo sapiens, and the exponential growth in gene expression and proteomic data for many species have revolutionized research in biology. However, the computational analysis of these burgeoning datasets has been hampered by the scarcity of successful combinations of data sources, representations, and algorithms. Here we propose the application of symbolic toolsets from the formal methods community to problems of biological interest, particularly signaling pathways, and more specifically mammalian mitogenic and stress-responsive pathways. Formal symbolic analysis with extremely efficient representations of biological networks yields insights with potential biological impact. In particular, it can generate novel hypotheses that could lead to wet-lab validation of new signaling possibilities. We demonstrate the graphical representation of the results of formal pathway analysis, including navigational abilities, and describe the logical underpinnings of the approach. In summary, we propose and provide an initial description of an algebra and logic of signaling pathways, together with biologically plausible abstractions, that provide the foundation for applying high-powered tools such as model checkers to problems of biological interest.
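To make the idea of querying a pathway symbolically more concrete, the following is a minimal sketch of a reachability check over a rule-based network. It is not the authors' algebra, logic, or toolset: the rule set is a simplified textbook mitogenic cascade (EGF to ERK) chosen only for illustration, the names `RULES`, `reachable`, and `can_reach` are hypothetical, and the abstraction assumes that species are never consumed, which a real model checker would not need to assume.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule fires when all of its preconditions are present and adds its products.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def rule(pre: Iterable[str], post: Iterable[str]) -> Rule:
    return (frozenset(pre), frozenset(post))


# Hypothetical, highly simplified cascade used only for illustration.
RULES: List[Rule] = [
    rule({"EGF", "EGFR"}, {"EGFR_active"}),
    rule({"EGFR_active", "Ras"}, {"Ras_active"}),
    rule({"Ras_active", "Raf"}, {"Raf_active"}),
    rule({"Raf_active", "MEK"}, {"MEK_active"}),
    rule({"MEK_active", "ERK"}, {"ERK_active"}),
]


def reachable(initial: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Forward-chain the rules to a fixed point (assumes nothing is consumed)."""
    known: Set[str] = set(initial)
    changed = True
    while changed:
        changed = False
        for pre, post in rules:
            if pre <= known and not post <= known:
                known |= post
                changed = True
    return known


def can_reach(initial: Iterable[str], rules: List[Rule], goal: str) -> bool:
    """Answer the query: can `goal` ever appear starting from `initial`?"""
    return goal in reachable(initial, rules)


full_dish = {"EGF", "EGFR", "Ras", "Raf", "MEK", "ERK"}
raf_knockout = full_dish - {"Raf"}
erk_with_raf = can_reach(full_dish, RULES, "ERK_active")
erk_without_raf = can_reach(raf_knockout, RULES, "ERK_active")
```

Comparing the two queries shows how an in-silico knockout can be phrased as a reachability question: removing one component from the initial state changes whether the goal state is attainable.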
There are currently a large number of “orphan” G-protein-coupled receptors (GPCRs) whose endogenous ligands (peptide hormones) are unknown. Identification of these peptide hormones is a difficult and important problem. We describe a computational framework that models spatial structure along the genomic sequence simultaneously with the temporal evolutionary path structure across species, and show how such models can be used to discover new functional molecules, in particular peptide hormones, via cross-genomic sequence comparisons. The computational framework incorporates a priori high-level knowledge of structural and evolutionary constraints into a hierarchical grammar of evolutionary probabilistic models. This computational method was used to identify novel prohormones and their processed peptide sites by producing sequence alignments across many species at the functional-element level. An initial implementation of the algorithm was used to identify potential prohormones by comparing human and non-human proteins in the Swiss-Prot database of known annotated proteins. In this proof of concept, we identified 45 out of 54 prohormones with only 44 false positives. The comparison of known and hypothetical human and mouse proteins resulted in the identification of a novel putative prohormone with at least four potential neuropeptides. Finally, in order to validate the computational methodology, we present the basic molecular biological characterization of the novel putative peptide hormone, including its identification and regional localization in the brain. This cross-species, HMM-based computational approach succeeded in identifying a previously undiscovered neuropeptide from whole-genome protein sequences. This novel putative peptide hormone is found in discrete brain regions as well as other organs.
The success of this approach will have a great impact on our understanding of GPCRs and associated pathways and help to identify new targets for drug development.
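The proof-of-concept figures reported above (45 of 54 prohormones identified, with 44 false positives) can be restated as standard retrieval metrics. The sketch below assumes that the 54 are the known prohormones in the test set and that the 44 false positives are non-prohormone proteins flagged by the method; the function names are illustrative and not taken from the original work.

```python
def recall(true_positives: int, total_positives: int) -> float:
    """Fraction of known prohormones that the method recovered."""
    return true_positives / total_positives


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged proteins that are known prohormones."""
    return true_positives / (true_positives + false_positives)


# Counts quoted in the abstract.
TP, KNOWN, FP = 45, 54, 44

sensitivity = recall(TP, KNOWN)   # 45 / 54
ppv = precision(TP, FP)           # 45 / (45 + 44)
print(f"recall = {sensitivity:.3f}, precision = {ppv:.3f}")
```

Under these assumptions the recall is roughly 83% and the precision roughly 51%, that is, about one flagged protein in two is a known prohormone.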