MALDIquant and its associated R packages readBrukerFlexData and readMzXmlData are freely available from the R archive CRAN (http://cran.r-project.org). The software is distributed under the GNU General Public License (version 3 or later) and is accompanied by example files and data. Additional documentation is available from http://strimmerlab.org/software/maldiquant/.
We present version 2 of the MSnbase R/Bioconductor package. MSnbase provides infrastructure for the manipulation, processing and visualisation of mass spectrometry data. We focus on the new on-disk infrastructure, which allows the handling of large raw mass spectrometry experiments on commodity hardware, and illustrate how the package is used for elegant data processing, method development and visualisation.
Intact protein sequencing by tandem mass spectrometry (MS/MS), known as top-down protein sequencing, relies on efficient gas-phase fragmentation under multiple experimental conditions to achieve extensive amino acid sequence coverage. We developed the "topdownr" R package for automated construction of multi-modal (i.e. involving CID, HCD, ETD, ETciD, EThcD and UVPD) MS/MS fragmentation methods on an Orbitrap instrument platform and systematic analysis of the resultant spectra. We used topdownr to generate and analyze thousands of MS/MS spectra for five intact proteins of 10-30 kDa. We achieved 90-100% coverage for the proteins tested and derived guiding principles for efficient sequencing of intact proteins. The data analysis workflow and statistical models of the topdownr software, together with multi-modal MS/MS experiments, provide a framework for optimizing MS/MS sequencing for any intact protein. Refined topdownr software will be suited for comprehensive characterization of protein pharmaceuticals and eventually also for de novo sequencing and detailed characterization of intact proteins.
Data visualization plays a key role in high-throughput biology. It is an essential tool for data exploration, helping to shed light on data structure and patterns of interest. Visualization is also of paramount importance as a means of communicating data to a broad audience. Here, we provide a short overview of the application of the R software to the visualization of proteomics data. We present a summary of R's plotting systems and how they are used to visualize and understand raw and processed MS-based proteomics data.
Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics experiments have become increasingly popular because of the wide range of metabolites that can be analyzed and the possibility of measuring novel compounds. LC-MS instrumentation and analysis conditions can differ substantially among laboratories and experiments, resulting in non-standardized datasets that demand customized annotation workflows. We present an ecosystem of R packages, centered around the MetaboCoreUtils, MetaboAnnotation and CompoundDb packages, that together provide a modular infrastructure for the annotation of untargeted metabolomics data. Initial annotation can be performed based on MS1 properties such as m/z and retention times, followed by an MS2-based annotation in which experimental fragment spectra are compared against a reference library. Such reference databases can be created and managed with the CompoundDb package. The ecosystem supports data from a variety of formats, including, but not limited to, MSP, MGF, mzML, mzXML and netCDF, as well as MassBank text files and SQL databases. Through its highly customizable functionality, the presented infrastructure allows users to build reproducible annotation workflows tailored for and adapted to most untargeted LC-MS-based datasets. All core functionality, which supports base R data types, is exported, also facilitating its re-use in other R packages. Finally, all packages are thoroughly unit-tested and documented and are available on GitHub and through Bioconductor.
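The MS1 step described in the abstract above, matching an observed m/z against reference compounds, can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not use or reproduce the API of MetaboAnnotation, MetaboCoreUtils or CompoundDb. The names `Compound`, `match_mz` and `LIBRARY`, the restriction to the [M+H]+ adduct, the default 10 ppm tolerance and the two approximate monoisotopic masses are all assumptions made for this example.

```python
from dataclasses import dataclass

# Approximate mass of a proton in Da; assumed here to model the [M+H]+ adduct only.
PROTON_MASS = 1.007276


@dataclass
class Compound:
    """Hypothetical reference entry: a name and a neutral monoisotopic mass."""
    name: str
    monoisotopic_mass: float


# Illustrative reference library with approximate monoisotopic masses.
LIBRARY = [
    Compound("glucose", 180.06339),
    Compound("caffeine", 194.08038),
]


def ppm_error(observed_mz, theoretical_mz):
    """Signed difference between observed and theoretical m/z in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_mz(query_mz, compounds, ppm=10.0):
    """Return (name, ppm error) pairs whose [M+H]+ m/z lies within `ppm`
    of the query, ordered from smallest to largest absolute error."""
    hits = []
    for compound in compounds:
        theoretical_mz = compound.monoisotopic_mass + PROTON_MASS
        error = ppm_error(query_mz, theoretical_mz)
        if abs(error) <= ppm:
            hits.append((compound.name, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # A feature observed near the [M+H]+ m/z of glucose.
    for name, error in match_mz(181.0707, LIBRARY):
        print(f"{name}: {error:.2f} ppm")
```

A real workflow would also consider other adducts, retention time windows and, as the abstract notes, a subsequent comparison of MS2 fragment spectra against a reference library; none of that is covered by this sketch.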
MALDIquant and associated R packages provide a versatile and completely free open-source platform for analyzing 2D mass spectrometry data as generated for instance by MALDI and SELDI instruments. We first describe the various methods and algorithms available in MALDIquant. Subsequently, we illustrate a typical analysis workflow using MALDIquant by investigating an experimental cancer data set, starting from raw mass spectrometry measurements and ending at multivariate classification.