Mass spectrometry (MS) is used to quantify the relative distribution of glycans attached to particular protein glycosylation sites (micro-heterogeneity) and evaluate the molar site occupancy (macro-heterogeneity) in glycoproteomics. However, the accuracy of MS for such quantitative measurements remains to be clarified. As a key step towards this goal, a panel of related tryptic peptides with and without complex, biantennary, disialylated N-glycans was chemically synthesised by solid-phase peptide synthesis. Peptides mimicking those resulting from enzymatic deglycosylation using PNGase F/A and endo D/F/H were synthetically produced, carrying aspartic acid and N-acetylglucosamine-linked asparagine residues, respectively, at the glycosylation site. The MS ionisation/detection strengths of these pure, well-defined and quantified compounds were investigated using various MS ionisation techniques and mass analysers (ESI-IT, ESI-Q-TOF, MALDI-TOF, ESI/MALDI-FT-ICR-MS). Depending on the ion source/mass analyser, glycopeptides carrying complex-type N-glycans exhibited clearly lower signal strengths (10-50% of an unglycosylated peptide) when equimolar amounts were analysed. Less ionisation/detection bias was observed when the glycopeptides were analysed by nano-ESI and medium-pressure MALDI. The position of the glycosylation site within the tryptic peptides also influenced the signal response, in particular if detected as singly or doubly charged signals. This is the first study to systematically and quantitatively address and determine MS glycopeptide ionisation/detection strengths to evaluate glycoprotein micro-heterogeneity and macro-heterogeneity by label-free approaches. These data form a much needed knowledge base for accurate quantitative glycoproteomics.
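To illustrate why the reported signal bias matters for label-free site occupancy, the following is a minimal sketch of a response-factor correction. It assumes a linear detector response and a single, previously measured response factor per peptide pair and instrument. The function name, the parameters and the example numbers are hypothetical and are not taken from the study.

```python
def corrected_occupancy(glyco_signal, unglyco_signal, response_factor):
    """Estimate molar site occupancy from label-free MS signals.

    response_factor is the glycopeptide signal relative to the
    unglycosylated peptide at equimolar amounts (hypothetical input,
    e.g. 0.1 to 0.5 for the range reported in the abstract).
    """
    if response_factor <= 0:
        raise ValueError("response_factor must be positive")
    # Scale the glycopeptide signal up to what an equally responsive
    # species would have produced (assumes a linear response).
    scaled_glyco = glyco_signal / response_factor
    total = scaled_glyco + unglyco_signal
    if total == 0:
        raise ValueError("no signal observed")
    return scaled_glyco / total


# Illustrative numbers only: a glycopeptide that responds at 50 %.
naive = 50.0 / (50.0 + 100.0)                      # uncorrected signal ratio
corrected = corrected_occupancy(50.0, 100.0, 0.5)  # response-corrected estimate
```

With these made-up inputs the uncorrected ratio underestimates occupancy (about 0.33) compared with the corrected estimate (0.5), which is the kind of bias the synthetic standards are meant to quantify.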
The biological and clinical relevance of glycosylation is becoming increasingly recognized, leading to a growing interest in large-scale clinical and population-based studies. In the past few years, several methods for high-throughput analysis of glycans have been developed, but thorough validation and standardization of these methods is required before significant resources are invested in large-scale studies. In this study, we compared liquid chromatography, capillary gel electrophoresis, and two MS methods for quantitative profiling of N-glycosylation of IgG in the same data set of 1201 individuals. To evaluate the accuracy of the four methods we then performed analysis of association with genetic polymorphisms and age. Chromatographic methods with either fluorescent or MS-detection yielded slightly stronger associations than MS-only and multiplexed capillary gel electrophoresis, but at the expense of lower levels of throughput. Advantages and disadvantages of each method were identified, which should inform the selection of the most appropriate method in future studies.
Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases was revealed to be mandatory when dealing with mouse samples. Moreover, the use of a “merged” database, containing all metagenomic sequences from the population under study, was found to be generally preferable over the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study contributes to identifying host- and database-specific biases which need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools has to be encouraged, even though this requires appropriate bioinformatic resources.
Electronic supplementary material: The online version of this article (doi:10.1186/s40168-016-0196-8) contains supplementary material, which is available to authorized users.
The enormous challenges of mass spectrometry-based metaproteomics are primarily related to the analysis and interpretation of the acquired data. This includes reliable identification of mass spectra and the meaningful integration of taxonomic and functional meta-information from samples containing hundreds of unknown species. To ease these difficulties, we developed a dedicated software suite, the MetaProteomeAnalyzer, an intuitive open-source tool for metaproteomics data analysis and interpretation, which includes multiple search engines and the feature to decrease data redundancy by grouping protein hits to so-called meta-proteins. We also designed a graph database back-end for the MetaProteomeAnalyzer to allow seamless analysis of results. The functionality of the MetaProteomeAnalyzer is demonstrated using a sample of a microbial community taken from a biogas plant.
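The idea of grouping redundant protein hits into meta-proteins can be sketched as follows. This is a minimal illustration that assumes one possible grouping rule (proteins sharing at least one identified peptide are merged). It is not the MetaProteomeAnalyzer's own code, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def group_meta_proteins(protein_peptides):
    """Group proteins that share at least one peptide (assumed rule).

    protein_peptides maps a protein accession to the set of peptide
    sequences identified for it. Returns a sorted list of sorted groups.
    """
    parent = {protein: protein for protein in protein_peptides}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}  # peptide -> first protein seen carrying it
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            if peptide in first_owner:
                union(first_owner[peptide], protein)
            else:
                first_owner[peptide] = protein

    groups = defaultdict(set)
    for protein in protein_peptides:
        groups[find(protein)].add(protein)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical accessions and peptides, for illustration only.
hits = {
    "P1": {"AAK", "CCR"},
    "P2": {"CCR"},
    "P3": {"DDK"},
}
meta_proteins = group_meta_proteins(hits)
```

Here P1 and P2 end up in one group because they share the peptide CCR, while P3 stays on its own. Stricter rules (for example, requiring identical peptide sets) would produce smaller groups.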
High-throughput methods for oligosaccharide analysis are required when searching for glycan-based biomarkers. Next to mass spectrometry-based methods, which allow fast and reproducible analysis of such compounds, further separation-based techniques are needed, which allow for quantitative analysis. Here, an optimized sample preparation method for N-glycan profiling by multiplexed capillary gel electrophoresis with laser-induced fluorescence detection (CGE-LIF) was developed, enabling high-throughput glycosylation analysis. First, glycans are released enzymatically from denatured plasma glycoproteins. Second, glycans are labeled with APTS using 2-picoline borane as a nontoxic and efficient reducing agent. Reaction conditions are optimized for a high labeling efficiency, short handling times, and only limited loss of sialic acids. Third, samples are subjected to hydrophilic interaction chromatography (HILIC) purification in a 96-well plate format. Subsequently, purified APTS-labeled N-glycans are analyzed by CGE-LIF using a 48-capillary DNA sequencer. The method was found to be robust and suitable for high-throughput glycan analysis. Even though the method comprises two overnight incubations, 96 samples can be analyzed with an overall labor allocation time of 2.5 h. The method was applied to serum samples from a pregnant woman, which were sampled during first, second, and third trimesters of pregnancy, as well as 6 weeks, 3 months, and 6 months postpartum. Alterations in the glycosylation patterns were observed with gestation and time after delivery.
The flow field dynamics in open and packed segments of capillary columns have been studied by a direct motion encoding of the fluid molecules using pulsed magnetic field gradient nuclear magnetic resonance. This noninvasive method operates within a time window that allows a quantitative discrimination of electroosmotic against pressure-driven flow behavior. The inherent axial fluid flow field dispersion and characteristic length scales of either transport mode are addressed, and the results demonstrate a significant performance advantage of an electrokinetically driven mobile phase in both open-tubular and packed-bed geometries. In contrast to the parabolic velocity profile and its impact on axial dispersion characterizing laminar flow through an open cylindrical capillary, a pluglike velocity distribution of the electroosmotic flow field is revealed in capillary electrophoresis. Here, the variance of the radially averaged, axial displacement probability distributions is quantitatively explained by longitudinal molecular diffusion at the actual buffer temperature, while for Poiseuille flow, the preasymptotic regime to Taylor-Aris dispersion can be shown. Compared to creeping laminar flow through a packed bed, the increased efficiency observed in capillary electrochromatography is related to the superior characteristics of the electroosmotic flow profile over any length scale in the interstitial pore space and to the origin, spatial dimension, and hydrodynamics of the stagnant fluid on the support particles' external surface. Using the Knox equation to analyze the axial plate height data, an eddy dispersion term smaller by a factor of almost 2.5 than in capillary high-performance liquid chromatography is revealed for the electroosmotic flow field in the same column.
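For readers unfamiliar with the Knox equation mentioned above, a minimal sketch follows. It uses the standard reduced form h = A·ν^(1/3) + B/ν + C·ν, where h is the reduced plate height, ν the reduced velocity, and A the eddy dispersion coefficient. All coefficient values below are hypothetical and chosen only so that the two A terms differ by a factor of 2.5; they are not the fitted values from the study.

```python
def knox_reduced_plate_height(nu, a, b, c):
    """Reduced plate height h from the Knox equation.

    nu: reduced velocity (must be positive)
    a:  eddy dispersion coefficient
    b:  longitudinal diffusion coefficient
    c:  mass transfer resistance coefficient
    """
    if nu <= 0:
        raise ValueError("reduced velocity must be positive")
    return a * nu ** (1.0 / 3.0) + b / nu + c * nu


# Hypothetical coefficients: same B and C, eddy term A smaller by 2.5x
# for the electroosmotically driven case.
nu = 8.0
h_pressure = knox_reduced_plate_height(nu, a=1.0, b=2.0, c=0.05)
h_electro = knox_reduced_plate_height(nu, a=0.4, b=2.0, c=0.05)
```

Because only the A term changes in this example, the difference between the two plate heights equals the difference in the eddy dispersion contribution at the chosen reduced velocity.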
Metaproteomic research involves various computational challenges during the identification of fragmentation spectra acquired from the proteome of a complex microbiome. These issues are manifold and range from the construction of customized sequence databases and the optimal setting of search parameters to limitations in the identification search algorithms themselves. In order to assess the importance of these individual factors, we studied the effect of strategies to combine different search algorithms, explored the influence of chosen database search settings, and investigated the impact of the size of the protein sequence database used for identification. Furthermore, we applied de novo sequencing as a complementary approach to classic database searching. All evaluations were performed on a human intestinal metaproteome dataset. Pyrococcus furiosus proteome data were used to contrast database searching of metaproteomic data to a classic proteomic experiment. Searching against subsets of metaproteome databases and the use of multiple search engines increased the number of identifications. The integration of P. furiosus sequences in a metaproteomic sequence database showcased the limitation of the target-decoy-controlled false discovery rate approach in combination with large sequence databases. The selection of varying search engine parameters and the application of de novo sequencing represented useful methods to increase the reliability of the results. Based on our findings, we provide recommendations for the data analysis that help researchers to establish or improve analysis workflows in metaproteomics.
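The target-decoy approach referred to above can be sketched in a few lines. This minimal example assumes a concatenated target-decoy search and the simple estimate FDR = decoys / targets among peptide-spectrum matches at or above a score threshold. The function name, the data layout and the scores are hypothetical, and real search engines may apply different estimators.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate the false discovery rate at a score threshold.

    psms is a list of (score, is_decoy) tuples from a combined
    target-decoy search (assumed layout). Higher scores are better.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no estimate is needed
    return decoys / targets


# Hypothetical matches: four target hits and one decoy hit.
psms = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
strict = fdr_at_threshold(psms, 9.0)
lenient = fdr_at_threshold(psms, 6.0)
```

With larger sequence databases, more decoy matches tend to reach high scores by chance, so the threshold needed to hold a given FDR rises and fewer true matches pass, which is one way to picture the limitation described in the abstract.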