Untargeted metabolomics aims to gather information on as many metabolites as possible in biological systems by taking into account all information present in the data sets. Here we describe a detailed protocol for large-scale untargeted metabolomics of plant tissues, based on reversed phase liquid chromatography coupled to high-resolution mass spectrometry (LC-QTOF MS) of aqueous methanol extracts. Dedicated software, MetAlign, is used for automated baseline correction and alignment of all extracted mass peaks across all samples, producing detailed information on the relative abundance of thousands of mass signals representing hundreds of metabolites. Subsequent statistics and bioinformatics tools can be used to provide a detailed view on the differences and similarities between (groups of) samples or to link metabolomics data to other systems biology information, genetic markers and/or specific quality parameters. The complete procedure from metabolite extraction to assembly of a data matrix with aligned mass signal intensities takes about 6 days for 50 samples.
Hyphenated full-scan MS technology creates large amounts of data. A versatile, easy-to-handle automation tool that aids data analysis is essential for handling such a data stream. MetAlign software, as described in this manuscript, handles a broad range of accurate-mass and nominal-mass GC/MS and LC/MS data. It is capable of automatic format conversions, accurate mass calculations, baseline corrections, peak-picking, saturation and mass-peak artifact filtering, as well as alignment of up to 1000 data sets. A 100- to 1000-fold data reduction is achieved. MetAlign software output is compatible with most multivariate statistics programs.
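The alignment step named above can be illustrated with a minimal sketch. This is not MetAlign's algorithm, which the abstract does not describe; it is a simple greedy grouping under assumed tolerances. The function names `align_peaks` and `to_matrix`, the tolerance values, and the peak tuples are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One peak: (m/z, retention time in minutes, intensity). Illustrative layout only.
Peak = Tuple[float, float, float]


def align_peaks(samples: Dict[str, List[Peak]],
                mz_tol: float = 0.01,
                rt_tol: float = 0.1) -> List[dict]:
    """Greedy alignment: a peak joins the first group whose reference m/z and
    retention time are within tolerance and that has no peak from the same
    sample yet; otherwise it starts a new group."""
    groups: List[dict] = []
    for name in sorted(samples):
        for mz, rt, intensity in samples[name]:
            for group in groups:
                if (abs(group["mz"] - mz) <= mz_tol
                        and abs(group["rt"] - rt) <= rt_tol
                        and name not in group["intensities"]):
                    group["intensities"][name] = intensity
                    break
            else:
                # No matching group: this peak becomes a new reference.
                groups.append({"mz": mz, "rt": rt,
                               "intensities": {name: intensity}})
    return groups


def to_matrix(groups: List[dict],
              sample_names: List[str]) -> List[List[Optional[float]]]:
    """Rows are aligned mass signals, columns are samples; None marks a
    signal that was not found in that sample."""
    return [[g["intensities"].get(s) for s in sample_names] for g in groups]


samples = {
    "a": [(301.07, 12.30, 1000.0), (463.09, 15.10, 500.0)],
    "b": [(301.072, 12.35, 800.0)],
}
groups = align_peaks(samples)
matrix = to_matrix(groups, ["a", "b"])
```

The resulting matrix has one row per aligned mass signal, which is the kind of table that multivariate statistics programs typically expect as input.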
Variation for metabolite composition and content is often observed in plants. However, it is poorly understood to what extent this variation has a genetic basis. Here, we describe the genetic analysis of natural variation in the metabolite composition in Arabidopsis thaliana. Instead of focusing on specific metabolites, we have applied empirical untargeted metabolomics using liquid chromatography-time of flight mass spectrometry (LC-QTOF MS). This uncovered many qualitative and quantitative differences in metabolite accumulation between A. thaliana accessions. Only 13.4% of the mass peaks were detected in all 14 accessions analyzed. Quantitative trait locus (QTL) analysis of more than 2,000 mass peaks, detected in a recombinant inbred line (RIL) population derived from the two most divergent accessions, enabled the identification of QTLs for about 75% of the mass signals. More than one-third of the signals were not detected in either parent, indicating the large potential for modification of metabolic composition through classical breeding.
Metabolites are critical in biology, and plants are especially rich in diverse biochemical compounds. It has been estimated that over 100,000 metabolites can be found in plants, and each species may contain its own chemotypic expression pattern 1. Moreover, substantial quantitative and qualitative variation in metabolite composition is often observed within plant species 2.
Although knowledge on the regulation of metabolite formation is increasing, for thousands of metabolites, their function in the plant, their biosynthetic pathway and the regulation thereof is still unknown. QTL analysis of natural variation, which can affect metabolites 3, in segregating populations can identify loci explaining the observed variation 4. In recent years, a few studies have focused on identifying QTLs regulating a specific group of known metabolites using detection methods directed toward specific metabolite groups 5-9.
However, recent advances in mass spectrometry-based metabolomics and data processing techniques should now allow large-scale QTL analyses of untargeted metabolic profiles, which may uncover previously unknown regulatory functions of loci in metabolic pathways. Using dedicated alignment software, it is now possible to perform an unbiased comparison of large numbers of metabolite-derived masses detectable in large numbers of samples arising from inherently large sets of genotypes (which are required for accurate mapping of QTLs) in an RIL population 10,11. QTL mapping will result in the localization of loci, and ultimately genes, causal for the observed variation and will allow the discovery of coregulated compounds. In this way, genome-wide genetic correlative metabolic analysis now becomes feasible, as we demonstrate here.
RESULTS
Metabolite variation is abundant and genetically controlled
To assess the natural variation in metabolite content present in A. thaliana, we performed HPLC-QTOF MS-based untargeted metabolic fingerprinting of acidified aqueous methanol extracts fr...
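The abstract above reports the share of mass peaks detected in all accessions and the share of signals absent from both parents. A minimal sketch of how such shares could be computed from an aligned peak table follows. The table layout (rows are mass peaks, columns are samples, `None` marks a non-detect), the function names, and the toy numbers are assumptions for illustration, not the authors' procedure or data.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def is_detected(value: Optional[float], threshold: float = 0.0) -> bool:
    """A value counts as detected when it is present and above the threshold."""
    return value is not None and value > threshold


def fraction_detected_in_all(matrix: List[Row], threshold: float = 0.0) -> float:
    """Share of mass peaks (rows) detected in every sample (column)."""
    if not matrix:
        return 0.0
    hits = sum(1 for row in matrix
               if all(is_detected(v, threshold) for v in row))
    return hits / len(matrix)


def fraction_absent_in_parents(matrix: List[Row],
                               parent_cols: Sequence[int],
                               threshold: float = 0.0) -> float:
    """Share of mass peaks detected in no parent column at all."""
    if not matrix:
        return 0.0
    hits = sum(1 for row in matrix
               if not any(is_detected(row[c], threshold) for c in parent_cols))
    return hits / len(matrix)


# Toy table: columns 0 and 1 are the hypothetical parents, column 2 is a RIL.
table = [
    [5.0, 3.0, 2.0],
    [0.0, 4.0, 1.0],
    [None, None, 9.0],
    [1.0, 1.0, 1.0],
]
share_all = fraction_detected_in_all(table)
share_new = fraction_absent_in_parents(table, parent_cols=[0, 1])
```

In this toy table two of four peaks are present in every column and one of four is missing from both parent columns.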
To take full advantage of the power of functional genomics technologies and in particular those for metabolomics, both the analytical approach and the strategy chosen for data analysis need to be as unbiased and comprehensive as possible. Existing approaches to analyze metabolomic data still do not allow a fast and unbiased comparative analysis of the metabolic composition of the hundreds of genotypes that are often the target of modern investigations. We have now developed a novel strategy to analyze such metabolomic data. This approach consists of (1) full mass spectral alignment of gas chromatography (GC)-mass spectrometry (MS) metabolic profiles using the MetAlign software package, (2) followed by multivariate comparative analysis of metabolic phenotypes at the level of individual molecular fragments, and (3) multivariate mass spectral reconstruction, a method allowing metabolite discrimination, recognition, and identification. This approach has allowed a fast and unbiased comparative multivariate analysis of the volatile metabolite composition of ripe fruits of 94 tomato (Lycopersicon esculentum Mill.) genotypes, based on intensity patterns of >20,000 individual molecular fragments throughout 198 GC-MS datasets. Variation in metabolite composition, both between and within fruit types, was found and the discriminative metabolites were revealed. In the entire genotype set, a total of 322 different compounds could be distinguished using multivariate mass spectral reconstruction. A hierarchical cluster analysis of these metabolites resulted in clustering of structurally related metabolites derived from the same biochemical precursors. The approach chosen will further enhance the comprehensiveness of GC-MS-based metabolomics approaches and will therefore prove a useful addition to nontargeted functional genomics research.
Plant-specific N-glycosylation can represent an important limitation for the use of recombinant glycoproteins of mammalian origin produced by transgenic plants. Comparison of plant and mammalian N-glycan biosynthesis indicates that β1,4-galactosyltransferase is the most important enzyme that is missing for conversion of typical plant N-glycans into mammalian-like N-glycans. Here, the stable expression of human β1,4-galactosyltransferase in tobacco plants is described. Proteins isolated from transgenic tobacco plants expressing the mammalian enzyme bear N-glycans, of which about 15% exhibit terminal β1,4-galactose residues in addition to the specific plant N-glycan epitopes. The results indicate that the human enzyme is fully functional and localizes correctly in the Golgi apparatus. Despite the fact that through the modified glycosylation machinery numerous proteins have acquired unusual N-glycans with terminal β1,4-galactose residues, no obvious changes in the physiology of the transgenic plants are observed, and the feature is inheritable. The crossing of a tobacco plant expressing human β1,4-galactosyltransferase with a plant expressing the heavy and light chains of a mouse antibody results in the expression of a plantibody that exhibits partially galactosylated N-glycans (30%), which is approximately as abundant as when the same antibody is produced by hybridoma cells. These results are a major step in the in planta engineering of the N-glycosylation of recombinant antibodies.
Application of C18 monolithic silica capillary columns in HPLC coupled to ion trap mass spectrometry detection was studied for probing the metabolome of the model plant Arabidopsis thaliana. It could be shown that the use of a long capillary column is an easy and effective approach to reduce ionization suppression by enhanced chromatographic resolution. Several hundred peaks could be detected using a 90-cm capillary column for LC separation and a noise reduction and automatic peak alignment software, which outperformed manual inspection or commercially available mass spectral deconvolution software.
Introduction: Batch effects in large untargeted metabolomics experiments are almost unavoidable, especially when sensitive detection techniques like mass spectrometry (MS) are employed. In order to obtain peak intensities that are comparable across all batches, corrections need to be performed. Since non-detects, i.e., signals with an intensity too low to be detected with certainty, are common in metabolomics studies, the batch correction methods need to take these into account.
Objectives: This paper aims to compare several batch correction methods, and investigates the effect of different strategies for handling non-detects.
Methods: Batch correction methods usually consist of regression models, possibly also accounting for trends within batches. To fit these models, quality control samples (QCs), injected at regular intervals, can be used. Study samples can also be used, provided that the injection order is properly randomized. Normalization methods, not using information on batch labels or injection order, can correct for batch effects as well. Introducing two easy-to-use quality criteria, we assess the merits of these batch correction strategies using three large LC–MS and GC–MS data sets of samples from Arabidopsis thaliana.
Results: The three data sets have very different characteristics, leading to clearly distinct behaviour of the batch correction strategies studied. Explicit inclusion of information on batch and injection order in general leads to very good corrections; when enough QCs are available, general normalization approaches also perform well. Several approaches are shown to be able to handle non-detects; replacing them with very small numbers such as zero seems the worst of the approaches considered.
Conclusion: The use of quality control samples for batch correction leads to good results when enough QCs are available. If an experiment is properly set up, batch correction using the study samples usually leads to a similar high-quality correction, but has the advantage that more metabolites are corrected. The strategy for handling non-detects is important: choosing small values like zero can lead to suboptimal batch corrections.
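The QC-based regression approach summarized above can be sketched for a single metabolite as follows. This is one simple variant under stated assumptions, not the paper's implementation: a straight line is fitted per batch to the QC intensities against injection order, non-detects (`None`) are left out of the fit instead of being replaced by zero, and every measurement is shifted so that the fitted QC level matches the overall QC mean. The record layout and the names `fit_line` and `correct_batches` are hypothetical, and intensities are assumed to be on a log scale.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def correct_batches(records: List[dict]) -> List[Optional[float]]:
    """records: dicts with keys 'batch', 'order', 'is_qc', 'value'.
    Returns corrected values in input order; non-detects stay None."""
    qcs = [r for r in records if r["is_qc"] and r["value"] is not None]
    if not qcs:
        raise ValueError("at least one detected QC measurement is required")
    target = mean(r["value"] for r in qcs)  # common reference level

    models: Dict[object, Tuple[float, float]] = {}
    for batch in {r["batch"] for r in qcs}:
        in_batch = [r for r in qcs if r["batch"] == batch]
        models[batch] = fit_line([r["order"] for r in in_batch],
                                 [r["value"] for r in in_batch])

    corrected: List[Optional[float]] = []
    for r in records:
        if r["value"] is None or r["batch"] not in models:
            corrected.append(None)  # non-detect, or batch without usable QCs
            continue
        slope, intercept = models[r["batch"]]
        fitted = slope * r["order"] + intercept
        corrected.append(r["value"] - fitted + target)
    return corrected


records = [
    {"batch": 1, "order": 1, "is_qc": True, "value": 10.0},
    {"batch": 1, "order": 2, "is_qc": False, "value": 11.0},
    {"batch": 1, "order": 3, "is_qc": True, "value": 12.0},
    {"batch": 2, "order": 4, "is_qc": True, "value": 20.0},
    {"batch": 2, "order": 5, "is_qc": False, "value": 22.0},
    {"batch": 2, "order": 6, "is_qc": True, "value": 20.0},
    {"batch": 2, "order": 7, "is_qc": False, "value": None},
]
corrected = correct_batches(records)
```

Fitting on randomized study samples instead of QCs, as the abstract also discusses, would only change which records feed `fit_line`.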
Summary
• Overall metabolic modifications between fruit of light-hyperresponsive high-pigment (hp) tomato (Lycopersicon esculentum) mutant plants and isogenic nonmutant (wt) control plants were compared.
• Targeted metabolite analyses, as well as large-scale nontargeted mass spectrometry (MS)-based metabolite profiling, were used to phenotype the differences in fruit metabolite composition.
• Targeted high-performance liquid chromatography with photodiode array detection (HPLC-PDA) metabolite analyses showed higher levels of isoprenoids and phenolic compounds in hp-2 dg fruit. Nontargeted GC-MS profiling of red fruits produced 25 volatile compounds that showed a 1.5-fold difference between the genotypes. Analyses of red fruits using HPLC coupled to high-resolution quadrupole time-of-flight mass spectrometry (LC-QTOF-MS) in both ESI-positive and ESI-negative mode generated, respectively, 6168 and 5401 mass signals, of which 142 and 303 showed a twofold difference between the genotypes.
• hp-2 dg fruits are characterized by overproduction of many metabolites, several of which are known for their antioxidant or photoprotective activities. These metabolites may now be more closely implicated as resources recruited by plants to respond to and manage light stress. The similarity in metabolic alterations in fruits of hp-1 and hp-2 mutant plants helps us to understand how hp mutations affect cellular processes.
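The fold-difference screening mentioned in the summary above can be sketched as a filter on the ratio of group means. This is a minimal illustration that assumes positive intensities and a symmetric threshold; the summary does not state which statistic or test was used, and the function name, signal identifiers, and numbers below are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def fold_change_hits(group_a: Dict[str, List[float]],
                     group_b: Dict[str, List[float]],
                     min_fold: float = 2.0) -> List[str]:
    """Return ids of mass signals whose mean intensity differs by at least
    min_fold between the two groups, in either direction."""
    hits = []
    for signal_id, values_a in group_a.items():
        if signal_id not in group_b:
            continue  # signal not measured in both groups
        mean_a, mean_b = mean(values_a), mean(group_b[signal_id])
        if mean_a <= 0 or mean_b <= 0:
            continue  # a ratio is undefined without positive means
        ratio = mean_a / mean_b
        if ratio >= min_fold or ratio <= 1 / min_fold:
            hits.append(signal_id)
    return hits


mutant = {"m1": [200.0, 220.0], "m2": [100.0, 110.0], "m3": [40.0, 50.0]}
control = {"m1": [100.0, 100.0], "m2": [100.0, 100.0], "m3": [100.0, 100.0]}
changed = fold_change_hits(mutant, control)
```

Setting `min_fold=1.5` would apply the lower threshold quoted for the GC-MS volatiles.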