The interpretation of nuclear magnetic resonance (NMR) experimental results for metabolomics studies requires intensive signal processing and multivariate data analysis techniques. A key step in this process is the quantification of spectral features, which is commonly accomplished by dividing an NMR spectrum into several hundred integral regions or bins. Binning attempts to minimize effects from variations in peak positions caused by sample pH, ionic strength, and composition, while reducing the dimensionality for multivariate statistical analyses. Herein we develop an improved spectral quantification technique, dynamic adaptive binning. With this technique, bin boundaries are determined by optimizing an objective function using a dynamic programming strategy. The objective function measures the quality of a bin configuration based on the number of peaks per bin. This technique shows a significant improvement over both traditional uniform binning and other adaptive binning techniques. This improvement is quantified via synthetic validation sets by analyzing an algorithm's ability to create bins that do not contain more than a single peak and that maximize the distance from peak to bin boundary. The validation sets are developed by characterizing the salient distributions in experimental NMR spectroscopic data. Further, dynamic adaptive binning is applied to a 1H NMR-based experiment monitoring rat urinary metabolites to empirically demonstrate improved spectral quantification.
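The dynamic programming idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the objective function, so the scoring rule here (`bin_score`), the candidate boundary grid, and the width limits are all assumptions made for illustration, not the published method. The sketch rewards bins holding exactly one peak, adds a small bonus when that peak sits far from both boundaries, and penalizes bins holding several peaks.

```python
from bisect import bisect_left


def bin_score(lo, hi, peaks):
    """Hypothetical quality of the half-open bin [lo, hi); `peaks` must be sorted."""
    inside = peaks[bisect_left(peaks, lo):bisect_left(peaks, hi)]
    if not inside:
        return 0.0                      # empty bins are neutral
    if len(inside) > 1:
        return -float(len(inside) - 1)  # penalize every extra peak
    # One peak: reward it, plus a bonus (at most 0.5) for distance to the boundaries.
    margin = min(inside[0] - lo, hi - inside[0])
    return 1.0 + margin / (hi - lo)


def dp_bins(grid, peaks, min_width, max_width):
    """Choose boundaries from the sorted `grid` that maximize the summed bin score.

    best[j] is the best total score of any valid binning of grid[0]..grid[j];
    prev[j] remembers the boundary index that precedes j in that binning.
    """
    n = len(grid)
    neg_inf = float("-inf")
    best = [neg_inf] * n
    prev = [-1] * n
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            width = grid[j] - grid[i]
            if best[i] == neg_inf or not (min_width <= width <= max_width):
                continue
            candidate = best[i] + bin_score(grid[i], grid[j], peaks)
            if candidate > best[j]:
                best[j], prev[j] = candidate, i
    if best[-1] == neg_inf:
        raise ValueError("no bin configuration satisfies the width limits")
    # Walk the predecessor links back from the last grid point.
    bounds, j = [], n - 1
    while j != -1:
        bounds.append(grid[j])
        j = prev[j]
    return bounds[::-1]


# Toy example: two close peaks (3.0 and 4.5) that a fixed 5-unit bin would merge.
boundaries = dp_bins(list(range(21)), [3.0, 4.5, 12.0, 17.0], min_width=2, max_width=8)
```

Because every prefix of the grid is solved once and reused, the search costs on the order of the square of the number of candidate boundaries rather than growing exponentially with the number of bins.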
The goal of this study was to determine if fecal metabolite and microbiota profiles can serve as biomarkers of human intestinal diseases, and to uncover possible gut microbe-metabolite associations. We employed proton nuclear magnetic resonance to measure fecal metabolites of healthy children and those diagnosed with diarrhea-predominant irritable bowel syndrome (IBS-D). Metabolite levels were associated with fecal microbial abundances. Using several ordination techniques, healthy and IBS-D samples could be distinguished based on the metabolite profiles of fecal samples, and such partitioning was congruent with the microbiota-based sample separation. Measurements of individual metabolites indicated that the intestinal environment in IBS-D was characterized by increased proteolysis, incomplete anaerobic fermentation, and a possible change in methane production. By correlating metabolite levels with abundances of microbial genera, a number of statistically significant metabolite-genus associations were detected in stools of healthy children. No such associations were evident for children with IBS-D. This finding complemented the previously observed reduction in the number of microbe-microbe associations in the distal gut of the same cohort of IBS-D children.
For most prokaryotic organisms, amino acid biosynthesis represents a significant portion of their overall energy budget. The difference in the cost of synthesis between amino acids can be striking, differing by as much as 7-fold. Two prokaryotic organisms, Escherichia coli and Bacillus subtilis, have been shown to preferentially utilize less costly amino acids in highly expressed genes, indicating that parsimony in amino acid selection may confer a selective advantage for prokaryotes. This study confirms those findings and extends them to 4 additional prokaryotic organisms: Chlamydia trachomatis, Chlamydophila pneumoniae AR39, Synechocystis sp. PCC 6803, and Thermus thermophilus HB27. Adherence to codon-usage biases for each of these 6 organisms is inversely correlated with a coding region's average amino acid biosynthetic cost in a fashion that is independent of chemoheterotrophic, photoautotrophic, or thermophilic lifestyle. The obligate parasites C. trachomatis and C. pneumoniae AR39 are incapable of synthesizing many of the 20 common amino acids. Removing auxotrophic amino acids from consideration in these organisms does not alter the overall trend of preferential use of energetically inexpensive amino acids in highly expressed genes.
Prokaryotic organisms preferentially utilize less energetically costly amino acids in highly expressed genes. Studies have shown that the proteome of Saccharomyces cerevisiae also exhibits this behavior, but only in broad terms. This study examines the question of metabolic efficiency as a proteome-shaping force at a finer scale, examining whether trends consistent with cost minimization as an evolutionary force are present independent of protein function and amino acid physicochemical property, and consistently with respect to amino acid biosynthetic costs. Inverse correlations between the average amino acid biosynthetic cost of the protein product and the levels of gene expression in S. cerevisiae are consistent with natural selection to minimize costs. There are, however, patterns of amino acid usage that raise questions about the strength (and possibly the universality) of this selective force in shaping S. cerevisiae's proteome.
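The inverse relationship described in the two abstracts above, between a protein's average amino acid biosynthetic cost and its expression level, can be checked with a rank correlation. The sketch below is a minimal illustration only: the abstracts do not state which correlation statistic was used, so Spearman's coefficient is an assumption, and the cost table and expression values are made-up placeholders, not published data. Residues absent from the cost table are skipped, which is one simple way to remove auxotrophic amino acids from consideration.

```python
def average_cost(protein, costs):
    """Mean per-residue cost; residues missing from `costs` are ignored."""
    known = [costs[aa] for aa in protein if aa in costs]
    if not known:
        raise ValueError("no residue in the sequence has a known cost")
    return sum(known) / len(known)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Placeholder costs in arbitrary units (hypothetical, for illustration only).
toy_costs = {"G": 1.0, "A": 2.0, "W": 10.0}
toy_proteins = ["GGGA", "GAAW", "AWWW"]
toy_expression = [100.0, 10.0, 1.0]  # hypothetical expression levels

rho = spearman([average_cost(p, toy_costs) for p in toy_proteins], toy_expression)
```

A negative `rho` on real data would be consistent with cheaper amino acids being favored in highly expressed genes; the toy inputs here were constructed to show that pattern and carry no evidential weight.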
In many metabolomics studies, NMR spectra are divided into bins of fixed width. This spectral quantification technique, known as uniform binning, is used to reduce the number of variables for pattern recognition techniques and to mitigate effects from variations in peak positions; however, shifts in peaks near the boundaries can cause dramatic quantitative changes in adjacent bins due to non-overlapping boundaries. Here we describe a new Gaussian binning method that incorporates overlapping bins to minimize these effects. A Gaussian kernel weights the signal contribution relative to distance from bin center, and the overlap between bins is controlled by the kernel standard deviation. Sensitivity to peak shift was assessed for a series of test spectra where the offset frequency was incremented in 0.5 Hz steps. For a 4 Hz shift within a bin width of 24 Hz, the error for uniform binning increased by 150%, while the error for Gaussian binning increased by 50%. Further, using a urinary metabolomics data set (from a toxicity study) and principal component analysis (PCA), we showed that the information content in the quantified features was equivalent for Gaussian and uniform binning methods. The separation between groups in the PCA scores plot, measured by the J2 quality metric, is as good as or better for Gaussian binning than for uniform binning. The Gaussian method is shown to be robust with regard to peak shift, while still retaining the information needed by classification and multivariate statistical techniques for NMR-metabolomics data.
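The contrast between uniform and Gaussian binning described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Lorentzian test peak, the 24 Hz bin spacing, and the choice of a 12 Hz kernel standard deviation are all hypothetical. Each Gaussian bin is a weighted sum of the whole spectrum, with weights that decay with distance from the bin center.

```python
import math


def uniform_bins(freqs, intensities, centers, width):
    """Sum the intensities that fall in [center - width/2, center + width/2)."""
    half = width / 2.0
    return [
        sum(y for x, y in zip(freqs, intensities) if c - half <= x < c + half)
        for c in centers
    ]


def gaussian_bins(freqs, intensities, centers, sigma):
    """Weight every point by a Gaussian of its distance to the bin center.

    A larger `sigma` widens each kernel and so increases the overlap
    between neighbouring bins.
    """
    return [
        sum(
            y * math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))
            for x, y in zip(freqs, intensities)
        )
        for c in centers
    ]


def lorentzian(freqs, position, half_width=1.0):
    """Synthetic test peak with unit height at `position` (assumed line shape)."""
    return [half_width ** 2 / ((x - position) ** 2 + half_width ** 2) for x in freqs]


# Test spectrum sampled every 0.5 Hz; one peak sits 2 Hz below the 24 Hz boundary.
freqs = [i * 0.5 for i in range(144)]
centers = [12.0, 36.0, 60.0]
original = lorentzian(freqs, 22.0)
shifted = lorentzian(freqs, 26.0)  # 4 Hz shift carries the peak across the boundary

uniform_before = uniform_bins(freqs, original, centers, width=24.0)
uniform_after = uniform_bins(freqs, shifted, centers, width=24.0)
gauss_before = gaussian_bins(freqs, original, centers, sigma=12.0)
gauss_after = gaussian_bins(freqs, shifted, centers, sigma=12.0)
```

Comparing `uniform_before[0]` with `uniform_after[0]` and `gauss_before[0]` with `gauss_after[0]` shows the effect the abstract describes: the hard boundary moves most of the peak area into the neighbouring bin, whereas the Gaussian weights change smoothly as the peak moves.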