The rate of progress in human neurosciences is limited by the inability to easily apply a wide range of analysis methods to the plethora of different datasets acquired in labs around the world. In this work, we introduce a framework for creating, testing, versioning and archiving portable applications for analyzing neuroimaging data organized and described in compliance with the Brain Imaging Data Structure (BIDS). The portability of these applications (BIDS Apps) is achieved by using container technologies that encapsulate all binary and other dependencies in one convenient package. BIDS Apps run on all three major operating systems with no need for complex setup and configuration and thanks to the comprehensiveness of the BIDS standard they require little manual user input. Previous containerized data processing solutions were limited to single user environments and not compatible with most multi-tenant High Performance Computing systems. BIDS Apps overcome this limitation by taking advantage of the Singularity container technology. As a proof of concept, this work is accompanied by 22 ready to use BIDS Apps, packaging a diverse set of commonly used neuroimaging algorithms.
The rate of progress in human neurosciences is limited by the inability to easily apply a wide range of analysis methods to the plethora of different datasets acquired in labs around the world. In this work, we introduce a framework for creating, testing, versioning and archiving portable applications for analyzing neuroimaging data organized and described in compliance with the Brain Imaging Data Structure (BIDS). (BIDS Apps) is achieved by using container technologies that encapsulate all binary and other dependencies in one convenient package. BIDS Apps run on all three major operating systems with no need for complex setup and configuration and thanks to the comprehensiveness of the BIDS standard they require little manual user input. Previous containerized data processing solutions were limited to single user environments and not compatible with most multi-tenant High Performance Computing systems. BIDS Apps overcome this limitation by taking advantage of the Singularity container technology. As a proof of concept, this work is accompanied by 22 ready to use BIDS Apps, packaging a diverse set of commonly used neuroimaging algorithms. Author summaryMagnetic Resonance Imaging (MRI) is a non-invasive way to measure human brain structure and activity that has been used for over 25 years. There are thousands MRI studies performed every year generating a substantial amount of data. At the same time, many new data analysis methods are being developed every year. The potential of using new analysis methods on the variety of existing and newly acquired data is hindered by difficulties in software deployment and lack of support for standardized input data. Here we propose to use container technology to make deployment of a wide range of data analysis techniques easy. In addition, we adapt the existing data analysis tools to interface with data organized in a standardized way. We hope that this approach will enable researchers to access a wider range of methods when analyzing their data which will lead to accelerated progress in human neuroscience.
Advanced brain imaging analysis methods, including multivariate pattern analysis (MVPA), functional connectivity, and functional alignment, have become powerful tools in cognitive neuroscience over the past decade. These tools are implemented in custom code and separate packages, often requiring different software and language proficiencies. Although usable by expert researchers, novice users face a steep learning curve. These difficulties stem from the use of new programming languages (e.g., Python), learning how to apply machine-learning methods to high-dimensional fMRI data, and minimal documentation and training materials. Furthermore, most standard fMRI analysis packages (e.g., AFNI, FSL, SPM) focus on preprocessing and univariate analyses, leaving a gap in how to integrate with advanced tools. To address these needs, we developed BrainIAK (brainiak.org), an open-source Python software package that seamlessly integrates several cutting-edge, computationally efficient techniques with other Python packages (e.g., Nilearn, Scikit-learn) for file handling, visualization, and machine learning. To disseminate these powerful tools, we developed user-friendly tutorials (in Jupyter format; https://brainiak.org/tutorials/) for learning BrainIAK and advanced fMRI analysis in Python more generally. These materials cover techniques including: MVPA (pattern classification and representational similarity analysis); parallelized searchlight analysis; background connectivity; full correlation matrix analysis; inter-subject correlation; inter-subject functional connectivity; shared response modeling; event segmentation using hidden Markov models; and real-time fMRI. For longrunning jobs or large memory needs we provide detailed guidance on high-performance computing clusters. These notebooks were successfully tested at multiple sites, including as problem sets for courses at Yale and Princeton universities and at various workshops and hackathons. These materials are freely shared, with the hope that they become part of a pool of open-source software and educational materials for large-scale, reproducible fMRI analysis and accelerated discovery.
Apache Spark is a popular framework for data analytics with attractive features such as fault tolerance and interoperability with the Hadoop ecosystem. Unfortunately, many analytics operations in Spark are an order of magnitude or more slower compared to native implementations written with high performance computing tools such as MPI. There is a need to bridge the performance gap while retaining the benefits of the Spark ecosystem such as availability, productivity, and fault tolerance. In this paper, we propose a system for integrating MPI with Spark and analyze the costs and benefits of doing so for four distributed graph and machine learning applications. We show that offloading computation to an MPI environment from within Spark provides 3.1−17.7× speedups on the four sparse applications, including all of the overheads. This opens up an avenue to reuse existing MPI libraries in Spark with little effort.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.