Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface.
We describe new algorithms and modules for protein structure prediction available as part of the PROTINFO web server. The modules, comparative and de novo modelling, have significantly improved back-end algorithms that were rigorously evaluated at the sixth meeting on the Critical Assessment of Protein Structure Prediction methods. We were one of four server groups invited to make an oral presentation (only the best performing groups are asked to do so). These two modules allow a user to submit a protein sequence and return atomic coordinates representing the tertiary structure of that protein. The PROTINFO server is available at .
Information about the secondary and tertiary structure of a protein sequence can greatly assist biologists in the generation and testing of hypotheses, as well as design of experiments. The PROTINFO server enables users to submit a protein sequence and request a prediction of the three-dimensional (tertiary) structure based on comparative modeling, fold generation and de novo methods developed by the authors. In addition, users can submit NMR chemical shift data and request protein secondary structure assignment that is based on using neural networks to combine the chemical shifts with secondary structure predictions. The server is available at http://protinfo.compbio.washington.edu.
PsiCSI is a highly accurate and automated method of assigning secondary structure from NMR data, which is a useful intermediate step in the determination of tertiary structures. The method combines information from chemical shifts and protein sequence using three layers of neural networks. Training and testing was performed on a suite of 92 proteins (9437 residues) with known secondary and tertiary structure. Using a stringent cross-validation procedure in which the target and homologous proteins were removed from the databases used for training the neural networks, an average 89% Q3 accuracy (per residue) was observed. This is an increase of 6.2% and 5.5% (representing 36% and 33% fewer errors) over methods that use chemical shifts (CSI) or sequence information (Psipred) alone. In addition, PsiCSI improves upon the translation of chemical shift information to secondary structure (Q3 ס 87.4%) and is able to use sequence information as an effective substitute for sparse NMR data (Q3 ס 86.9% without 13 C shifts and Q3 ס 86.8% with only H ␣ shifts available). Finally, errors made by PsiCSI almost exclusively involve the interchange of helix or strand with coil and not helix with strand (<2.5 occurrences per 10000 residues). The automation, increased accuracy, absence of gross errors, and robustness with regards to sparse data make PsiCSI ideal for high-throughput applications, and should improve the effectiveness of hybrid NMR/de novo structure determination methods. A Web server is available for users to submit data and have the assignment returned.
Given the increasing complexity of bioinformatics workflows, we anticipate that these interactive software notebooks will become as necessary for documenting software methods as traditional laboratory notebooks have been for documenting bench protocols, and as ubiquitous.
Highlights d An open-source graphical tool for constructing and executing bioinformatics workflows d Uses Docker containers to ensure reproducibility and facilitate workflow installation d Interactive graphical modules such as Jupyter and Cytoscape are supported d Workflows can be saved in Bwb's native format or exported as portable shell scripts
Mu B is one of four proteins required for the strand transfer step of bacteriophage Mu DNA transposition and the only one where no high resolution structural data is available. Structural work on Mu B has been hampered primarily by solubility problems and its tendency to aggregate. We have overcome this problem by determination of the three-dimensional structure of the C-terminal domain of Mu B (B 223±312 ) in 1.5 M NaCl using NMR spectroscopic methods. The structure of Mu B 223±312 comprises four helices (backbone r.m.s.d. 0.46 A Ê ) arranged in a loosely packed bundle and resembles that of the N-terminal region of the replication helicase, DnaB. This structural motif is likely to be involved in the inter-domainal regulation of ATPase activity for both Mu A and DnaB. The approach described here for structural determination in high salt may be generally applicable for proteins that do not crystallize and that are plagued by solubility problems at low ionic strength.
We present BioDepot-workflow-Builder (BwB), a portable and open-source tool for creating bioinformatics workflows with a simple drag-and-drop graphical user interface. The individual components of the workflows are Docker containers which are available from public repositories or provided by the user. The use of software containers ensures that workflows will give identical results across different operating systems and hardware architectures. The use of Docker also allows for individual components to be deployed on the cloud. The modularity and ease of customization and installation of bioinformatics tools using BwB allows for researchers to efficiently test new workflows and compare competing algorithms. Since BwB itself is packaged in a Docker container, the setup is minimal. In particular, users only need to install Docker and have access to a web browser to begin creating and running workflows. As a proof-of-concept case study, we illustrated the feasibility of BwB by developing widgets for the RNA-seq differential expression analysis workflow employed by the NIH BD2K-LINCS Drug Toxicity Signature Generation Center at Mount Sinai. The app and all the containers are available on the BioDepot repository (https://hub.docker.com/r/biodepot).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.