We engineered a machine learning approach, MSHub, to enable auto-deconvolution of gas chromatography-mass spectrometry (GC-MS) data. We then designed workflows to enable the community to store, process, share, annotate, compare and perform molecular networking of GC-MS data within the Global Natural Product Social (GNPS) Molecular Networking analysis platform. MSHub/GNPS performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples.

Given its ease of use and low operational cost, GC-MS has applications with broad societal effect, such as detection of metabolic disease in newborns, toxicology, doping, forensics, food science and clinical testing. The predominant ionization technique in GC-MS is electron ionization (EI), in which all compounds are ionized by high-energy (70-eV) electrons. Because fragmentation occurs with ionization, EI GC-MS data are subjected to spectral deconvolution, a process that separates fragmentation ion patterns for each eluting molecule into a composite mass spectrum.

An electron energy of 70 eV has long been the standard for ionization in GC-MS, making it possible to use decades-old EI reference spectra for annotation [1]. There are ~1.2 million reference spectra that have been accumulated and curated over a period of more than 50 years [2]. Many tools and repositories for GC-MS data have been introduced [3][4][5][6][7][8][9][10][11][12][13][14][15]; however, much of GC-MS data processing is restricted to vendor-specific formats and software [8]. Currently, deconvolution requires either setting multiple parameters manually [3][4][5] or possessing the computational skills to run the software [7].
Also, the lack of data sharing in a uniform format precludes data comparison between laboratories and prevents taking advantage of repository-scale information and community knowledge, resulting in infrequent reuse of GC-MS data [8][11][12][13][14][15].

Although batch modes exist, deconvolution quality is currently not enhanced by using information from all other files. To leverage across-file information, improve scalability of spectral deconvolution and eliminate the need for manually setting the deconvolution parameters (m/z error correction of the ions; peak shape, that is, the slopes of the rising and trailing edges; peak retention time (RT) shifts; and noise/intensity thresholds), we developed an algorithmic learning strategy for auto-deconvolution (Fig. 1a-f). We deployed this functionality within GNPS/MassIVE (https://gnps.ucsd.edu) [16] (Fig. 1f-i). To promote analysis reproducibility, all GNPS jobs performed are retained in the 'My User' space and can be shared as hyperlinks.

This user-independent 'automatic' parameter optimization is accomplished via fast Fourier transform (FFT), multiplication and inverse Fourier transform for each ion across an entire data set, followed by an unsupervised non-negative matrix factorization (NMF) (one-layer neural network). Then, the compositional consistency of spectral patterns for each spec...