We report a novel peak sorting method for the two-dimensional gas chromatography/time-of-flight mass spectrometry (GC×GC/TOF-MS) system. The objective of peak sorting is to recognize, among the thousands of peaks detected in the analytical procedure, the peaks that arise from the same metabolite in different samples. The developed algorithm is based on the fact that the chromatographic peaks for a given analyte have similar retention times in all of the chromatograms. Raw instrument data are first processed by ChromaTOF (Leco) software to provide the peak tables. Our algorithm achieves peak sorting by utilizing the first and second dimension retention times in the peak tables together with the mass spectra generated by electron impact ionization. The algorithm searches the peak tables for peaks generated by the same metabolite using several search criteria. Our software also includes options to eliminate non-target peaks, e.g., peaks of contaminants, from the sorting results. The developed software package has been tested using a mixture of standard metabolites and another mixture of standard metabolites spiked into human serum. Manual validation demonstrates high accuracy of peak sorting with this algorithm.
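The abstract does not give the search criteria in detail, so the following is only a minimal sketch of the general idea, not the published algorithm. It assumes that two peaks belong to the same analyte when both retention times agree within a tolerance and their mass spectra have a high cosine similarity. The `Peak` class, the tolerance values, the similarity threshold, and the greedy grouping strategy are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Peak:
    """One row of a peak table (hypothetical layout, for illustration only)."""
    sample: str      # sample the peak was detected in
    rt1: float       # first-dimension retention time, in seconds
    rt2: float       # second-dimension retention time, in seconds
    spectrum: dict   # m/z -> intensity


def cosine_similarity(a, b):
    """Cosine similarity between two spectra stored as m/z -> intensity maps."""
    dot = sum(intensity * b.get(mz, 0.0) for mz, intensity in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def same_analyte(p, q, rt1_tol=5.0, rt2_tol=0.1, min_sim=0.9):
    """Assumed criteria: both retention times close and spectra similar."""
    return (
        abs(p.rt1 - q.rt1) <= rt1_tol
        and abs(p.rt2 - q.rt2) <= rt2_tol
        and cosine_similarity(p.spectrum, q.spectrum) >= min_sim
    )


def sort_peaks(peaks, **criteria):
    """Greedily group peaks; each group holds at most one peak per sample."""
    groups = []
    for peak in peaks:
        for group in groups:
            reference = group[0]  # the first member represents the group
            unseen_sample = all(m.sample != peak.sample for m in group)
            if unseen_sample and same_analyte(reference, peak, **criteria):
                group.append(peak)
                break
        else:
            groups.append([peak])  # no matching group: start a new one
    return groups
```

Calling `sort_peaks(all_peaks)` on the concatenated peak tables returns one list per putative analyte. A greedy pass like this depends on input order, so a real implementation would likely need a more careful assignment step and retention time alignment across runs.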
Proteomics is a still-evolving combination of technologies for describing and characterizing all expressed proteins in a biological system. Because mass spectrometers have upper limits on mass detection, the bottom-up approach, in which tryptic peptides from complex protein mixtures are identified and quantified, is the most widely employed. Protein identification from tandem mass spectra is still a challenge in proteomics. Two approaches have been developed to identify proteins from tandem mass spectra: database searching and de novo sequencing. These approaches typically have positive identification rates of only ~10-20%, and exhibit high false positive identification rates. This review surveys existing algorithms developed for database searching and de novo sequencing, with a focus on recent developments in tandem mass spectrum quality assessment, peptide identification using annotated spectral libraries, statistical approaches to assessing identification quality, and methods for constrained searches. We also review research comparing the performance of existing protein identification packages.
In this paper, a generalized Brain-State-in-a-Box (gBSB)-based hybrid neural network is proposed for storing and retrieving pattern sequences. The hybrid network consists of autoassociative and heteroassociative parts. A large-scale image storage and retrieval neural system is then constructed using the gBSB-based hybrid neural network and the pattern decomposition concept. The notion of deadbeat stability is employed to describe the stability property of the vertices of the hypercube to which the trajectories of the gBSB neural system are constrained. Extensive simulations of large-scale pattern and image storage and retrieval are presented to illustrate the results obtained.
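To make the autoassociative part concrete, here is a minimal sketch of recall in a gBSB-style network. It assumes the commonly cited gBSB update x(k+1) = g((I + αW)x(k) + αb), where g clips each component to [-1, 1] so that trajectories stay inside the hypercube. The outer-product weight rule, the step size `alpha`, and all function names are illustrative assumptions; the paper's own weight design, its heteroassociative part, and its pattern decomposition are not reproduced here.

```python
def saturate(value):
    """Linear saturating activation: clip a component to [-1, 1]."""
    return max(-1.0, min(1.0, value))


def hebbian_weights(patterns):
    """Outer-product weights (an assumed, simple storage rule)."""
    n = len(patterns[0])
    return [
        [sum(p[i] * p[j] for p in patterns) / n for j in range(n)]
        for i in range(n)
    ]


def gbsb_step(x, weights, bias, alpha):
    """One update: x(k+1) = g(x(k) + alpha * (W x(k) + b))."""
    n = len(x)
    return [
        saturate(
            x[i] + alpha * (sum(weights[i][j] * x[j] for j in range(n)) + bias[i])
        )
        for i in range(n)
    ]


def recall(x0, weights, bias, alpha=0.5, max_steps=100):
    """Iterate from x0 until the state stops changing or max_steps is hit."""
    x = list(x0)
    for _ in range(max_steps):
        nxt = gbsb_step(x, weights, bias, alpha)
        if nxt == x:
            break
        x = nxt
    return x


# Store one bipolar pattern, then recall it from a corrupted start state.
stored = [1.0, -1.0, 1.0, -1.0]
weights = hebbian_weights([stored])
recovered = recall([0.6, -0.2, 0.3, 0.1], weights, bias=[0.0] * 4)
```

In this toy case the start state has the wrong sign in its last component, and the iteration still settles on the stored vertex of the hypercube. Retrieving a sequence would additionally require a heteroassociative mapping from each recalled pattern to the next one, which this sketch omits.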
Systems biology aims to understand biological systems on a comprehensive scale, such that the components that make up the whole are connected to one another and work in harmony. As a major component of systems biology, differential proteomics studies the differences between distinct but related proteomes, such as normal versus diseased cells and diseased versus treated cells. High-throughput mass spectrometry (MS)-based analytical platforms are widely used in differential proteomics (Domon, 2006; Fenselau, 2007). As a common practice, the proteome is first digested into peptides. The peptide mixture is then separated using multidimensional liquid chromatography (MDLC) and finally subjected to MS for further analysis. Thousands of mass spectra are generated in a single experiment. Discovering the significantly changed proteins from millions of peaks involves mass informatics. This paper introduces the data mining steps used in mass informatics, and concludes with a descriptive examination of concepts, trends and challenges in this rapidly expanding field.