SUMMARYLife sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico experiments undertaken by these research groups can be represented as workflows involving the co-ordinated use of analysis programs and information repositories that may be globally distributed. With regards to Grid computing, the requirements relate to the sharing of analysis and information resources rather than sharing computational power. The my Grid project has developed the Taverna workbench for the composition and execution of workflows for the life sciences community. This experience paper describes lessons learnt during the development of Taverna. A common theme is the importance of understanding how workflows fit into the scientists' experimental context. The lessons reflect an evolving understanding of life scientists' requirements on a workflow environment, which is relevant to other areas of data intensive and exploratory science.
The growing quantity and distribution of bioinformatics resources means that finding and utilizing them requires a great deal of expert knowledge, especially as many resources need to be tied together into a workflow to accomplish a useful goal. We want to formally capture at least some of this knowledge within a virtual workbench and middleware framework to assist a wider range of biologists in utilizing these resources. Different activities require different representations of knowledge. Finding or substituting a service within a workflow is often best supported by a classification. Marshalling and configuring services is best accomplished using a formal description. Both representations are highly interdependent and maintaining consistency between the two by hand is difficult. We report on a description logic approach using the web ontology language DAML+OIL that uses property based service descriptions. The ontology is founded on DAML-S to dynamically create service classifications. These classifications are then used to support semantic service matching and discovery in a large grid based middleware project my GRID. We describe the extensions necessary to DAML-S in order to support bioinformatics service description; the utility of DAML+OIL in creating dynamic classifications based on formal descriptions; and the implementation of a DAML+OIL ontology service to support partial user-driven service matching and composition.
The reasoning tasks that can be performed with semantic web service descriptions depend on the quality of the domain ontologies used to create these descriptions. However, building such domain ontologies is a time consuming and difficult task.We describe an automatic extraction method that learns domain ontologies for web service descriptions from textual documentations attached to web services. We conducted our experiments in the field of bioinformatics by learning an ontology from the documentation of the web services used in my Grid, a project that supports biology experiments on the Grid. Based on the evaluation of the extracted ontology in the context of the project, we conclude that the proposed extraction method is a helpful tool to support the process of building domain ontologies for web service descriptions.
The Gene Ontology Next Generation Project (GONG) is developing a staged methodology to evolve the current representation of the Gene Ontology into DAML+OIL in order to take advantage of the richer formal expressiveness and the reasoning capabilities of the underlying description logic. Each stage provides a step level increase in formal explicit semantic content with a view to supporting validation, extension and multiple classification of the Gene Ontology. The paper introduces DAML+OIL and demonstrates the activity within each stage of the methodology and the functionality gained.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.