BackgroundProteomics Signature Profiling (PSP) is a novel hit-rate based method that proved useful in resolving consistency and coverage issues in proteomics. As a follow-up study, several points need to be addressed: 1/ PSP’s generalisability to pathways, 2/ understanding the biological interplay between significant complexes and pathway subnets co-located on the same pathways on our liver cancer dataset, 3/ understanding PSP’s false positive rate and 4/ demonstrating that PSP works on other suitable proteomics datasets as well as expanding PSP’s analytical resolution via the use of specialised ontologies.Results1/ PSP performs well with Pathway-Derived Subnets (PDSs). Comparing the performance of PDSs derived from various pathway databases, we find that an integrative approach is best for optimising analytical resolution. Feature selection also confirms that significant PDSs are closely connected to the cancer phenotype.2/ In liver cancer, correlation studies of significant PSP complexes and PDSs co-localised on the same pathways revealed an interesting relationship between the purine metabolism pathway and two other complexes involved in DNA repair. Our work suggests progression to poor stage requires additional mutations that disrupt DNA repair enzymes.3/ False positive analysis reveals that PSP, applied on both complexes and PDSs, is powerful and precise.4/ Via an expert-curated lipid ontology, we uncovered several interesting lipid-associated complexes that could be associated with cancer progression. Of particular interest is the HMGB1-HMGB2-HSC70-ERP60-GAPDH complex which is also involved in DNA repair. We also demonstrated generalisability of PSP using a non-small-cell lung carcinoma data set.ConclusionsPSP is a powerful and precise technique, capable of identifying biologically coherent features. It works with biological complexes, network-predicted clusters as well as PDSs. Here, an instance of the interplay between significant PDSs and complexes, possibly significantly involved in liver cancer progression but not well understood as yet, is demonstrated. Also demonstrated is the enhancement of PSP’s analytical resolution using specialised ontologies.
BackgroundThe development of high-throughput experimentation has led to astronomical growth in biologically relevant lipids and lipid derivatives identified, screened, and deposited in numerous online databases. Unfortunately, efforts to annotate, classify, and analyze these chemical entities have largely remained in the hands of human curators using manual or semi-automated protocols, leaving many novel entities unclassified. Since chemical function is often closely linked to structure, accurate structure-based classification and annotation of chemical entities is imperative to understanding their functionality.ResultsAs part of an exploratory study, we have investigated the utility of semantic web technologies in automated chemical classification and annotation of lipids. Our prototype framework consists of two components: an ontology and a set of federated web services that operate upon it. The formal lipid ontology we use here extends a part of the LiPrO ontology and draws on the lipid hierarchy in the LIPID MAPS database, as well as literature-derived knowledge. The federated semantic web services that operate upon this ontology are deployed within the Semantic Annotation, Discovery, and Integration (SADI) framework. Structure-based lipid classification is enacted by two core services. Firstly, a structural annotation service detects and enumerates relevant functional groups for a specified chemical structure. A second service reasons over lipid ontology class descriptions using the attributes obtained from the annotation service and identifies the appropriate lipid classification. We extend the utility of these core services by combining them with additional SADI services that retrieve associations between lipids and proteins and identify publications related to specified lipid types. We analyze the performance of SADI-enabled eicosanoid classification relative to the LIPID MAPS classification and reflect on the contribution of our integrative methodology in the context of high-throughput lipidomics.ConclusionsOur prototype framework is capable of accurate automated classification of lipids and facile integration of lipid class information with additional data obtained with SADI web services. The potential of programming-free integration of external web services through the SADI framework offers an opportunity for development of powerful novel applications in lipidomics. We conclude that semantic web technologies can provide an accurate and versatile means of classification and annotation of lipids.
LipidGO is available at http://compbio.ddns.comp.nus.edu.sg/%7elipidgo/index.php.
Lipids can be systematically classified according to functional properties, structural features, biochemical origin or biological system. However Lipid nomenclature has yet to become a robust research tool since no rigorous definitions exist for membership of specific lipid classes. Lipids need to be defined in a manner that is systematic yet at the same time semantically explicit. We report on the reuse of existing lipid nomenclature, ontology describing chemical structure and the extension of the OWL-DL Lipid Ontology to support the classification of lipid molecules. We applied definitions, DL-axioms, to describe lipids classes and illustrate suitability of the ontology for the classification of Fatty Acyl lipids and Mycolic acids.
Motivation: Although semantic similarity in Gene Ontology (GO) and other approaches may be used to find similar GO terms, there is yet a method to systematically find a class of GO terms sharing a common property with high accuracy (e.g. involving human curation).Results: We have developed a methodology to address this issue and applied it to identify lipid-related GO terms, owing to the important and varied roles of lipids in many biological processes. Our methodology finds lipid-related GO terms in a semi-automated manner, requiring only moderate manual curation. We first obtain a list of lipid-related gold-standard GO terms by keyword search and manual curation. Then, based on the hypothesis that co-annotated GO terms share similar properties, we develop a machine learning method that expands the list of lipid-related terms from the gold standard. Those terms predicted most likely to be lipid related are examined by a human curator following specific curation rules to confirm the class labels. The structure of GO is also exploited to help reduce the curation effort. The prediction and curation cycle is repeated until no further lipid-related term is found. Our approach has covered a high proportion, if not all, of lipid-related terms with relatively high efficiency.Database URL: http://compbio.ddns.comp.nus.edu.sg/∼lipidgo
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.