Understanding the evolutionary history of living organisms is a central problem in biology. Until recently, the ability to infer evolutionary relationships was limited by the amount of DNA sequence data available, but new DNA sequencing technologies have largely removed this limitation. As a result, DNA sequence data are readily available or obtainable for a wide spectrum of organisms, creating an unprecedented opportunity to explore evolutionary relationships broadly and deeply across the Tree of Life. Unfortunately, the problems that must be solved to infer evolutionary relationships are NP-hard, so the dramatic increase in available DNA sequence data has created a commensurate increase in the need for access to powerful computational resources. Local laptop or desktop machines are no longer viable for analysis of the larger data sets available today, and progress in the field relies upon access to large, scalable high-performance computing resources. This paper describes development of the CIPRES Science Gateway, a web portal designed to provide researchers with transparent access to the fastest available community codes for inference of phylogenetic relationships, and implementation of these codes on scalable computational resources. Meeting the needs of the community has included developing infrastructure to provide access, working with the community to improve existing community codes, developing infrastructure to ensure the portal is scalable to the entire systematics community, and adopting strategies that make the project sustainable by the community. The CIPRES Science Gateway has allowed more than 1800 unique users to run jobs that required 2.5 million Service Units since its release in December 2009.
The CIPRES Science Gateway is a community web application that provides public access to a set of parallel tree inference and multiple sequence alignment codes run on large computational resources. These resources are made available at no charge to users by the NSF Extreme Science and Engineering Discovery Environment (XSEDE) project. Here we describe the CIPRES RESTful application programmer interface (CRA), a web service that provides programmatic access to all resources and services currently offered by the CIPRES Science Gateway. Software developers can use the CRA to extend their web or desktop applications to include the ability to run MrBayes, BEAST, RAxML, MAFFT, and other computationally intensive algorithms on XSEDE. The CRA also makes it possible for individuals with modest scripting skills to access the same tools from the command line using curl, or through any scripting language. This report describes the CRA and its use in three web applications (Influenza Research Database – www.fludb.org, Virus Pathogen Resource – www.viprbrc.org, and MorphoBank – www.morphobank.org). The CRA is freely accessible to registered users at https://cipresrest.sdsc.edu/cipresrest/v1; supporting documentation and registration tools are available at https://www.phylo.org/restusers.
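As a rough illustration of the programmatic access described above, the following Python sketch builds (but does not send) an authenticated request against the CRA base URL given in the abstract. The `/job/<username>` resource path, the use of HTTP Basic authentication, and the `cipres-appkey` header name are assumptions made for illustration and should be checked against the documentation at https://www.phylo.org/restusers; the helper name `build_job_list_request` is hypothetical.

```python
import base64
import urllib.request

# Base URL as stated in the abstract.
BASE_URL = "https://cipresrest.sdsc.edu/cipresrest/v1"


def build_job_list_request(username, password, app_key, base_url=BASE_URL):
    """Build, without sending, a GET request for a user's job list.

    Assumptions (not confirmed by the abstract): the job list lives at
    <base_url>/job/<username>, credentials use HTTP Basic auth, and the
    application key travels in a 'cipres-appkey' header.
    """
    url = f"{base_url}/job/{username}"
    # Encode "username:password" as an HTTP Basic credential.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "cipres-appkey": app_key,  # assumed header name
    }
    return urllib.request.Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    # Placeholder credentials; nothing is sent over the network here.
    req = build_job_list_request("alice", "secret", "demo-key")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would issue the call. Keeping request construction separate from sending makes it possible to inspect the URL and headers before using real credentials.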
The CIPRES Science Gateway (CSG) provides researchers and educators with browser-based access to community codes for inference of phylogenetic relationships from DNA and protein sequence data. The CSG allows users to deploy jobs on the high-performance computers of the TeraGrid without requiring detailed knowledge of their complexities. Use of the CSG has grown rapidly; through March 2011 it had more than 2,200 users and enabled more than 180 peer-reviewed publications. The rapid growth in resource consumption was accommodated by deploying codes on Trestles, a new TeraGrid computer. Tools and policies were developed to ensure efficient and effective resource use. This paper describes progress in managing the growth of this public cyberinfrastructure resource and reviews the domain science that it has enabled.
The CIPRES Science Gateway (CSG) provides browser-based access to computationally demanding phylogenetic codes run on large HPC resources. Since its release in December 2009, there has been sustained, near-linear growth in the rate of CSG use, both in the number of users submitting jobs each month and in the number of jobs submitted. The average amount of computational time used per month by the CSG has increased more than 5-fold since its initial release. As of April 2012, more than 4,000 unique users have run parallel tree inference jobs on TeraGrid/XSEDE resources using the CSG. The steady growth in resource use suggests that the CSG is meeting an important need for computational resources in the Systematics/Evolutionary Biology community.
General Terms: Design, Management