2022
DOI: 10.1101/2022.08.06.503000
Preprint

A Universal Language for Finding Mass Spectrometry Data Patterns

Abstract: Even though raw mass spectrometry data is information rich, the vast majority of the data is underutilized. The ability to interrogate these rich datasets is handicapped by the limited capability and flexibility of existing software. We introduce the Mass Spec Query Language (MassQL) that addresses these issues by enabling an expressive set of mass spectrometry patterns to be queried directly from raw data. MassQL is an open-source mass spectrometry query language for flexible and mass spectrometer manufacture…

Cited by 23 publications (38 citation statements)
References 11 publications
“…Of note, since all testing BGC-MS/MS links (in the validation set) are present in the MIBiG database, we used the information from this database to infer the biosynthetic class and substructure predictions in this particular analysis. However, the biosynthetic class can be predicted for unknown metabolites using MolNetEnhancer (34) and/or CANOPUS (35) and the substructure predictions for unknown metabolites can be obtained using tools like MS2LDA (36), MassQL (23), and/or CSI:FingerID/SIRIUS 4 (37). To obtain genomics-inferred substructure features for large paired omics datasets in the future, we envision that the recently proposed iPRESTO approach (38) could be applied to discover commonly occurring patterns of biosynthesis genes that together likely produce a substructure of a specialized metabolite.…”
Section: Results (mentioning)
confidence: 99%
“…We believe that NPOmix will assist with the discovery of novel metabolites as well as known metabolites with new biosynthesis (more details in the “Supplementary Material and Background” section). We exemplify a computational method combining NPOmix and MassQL (23) for prioritizing siderophores from thousands of metabolome profiles and this method can be reproduced by the users with their own samples.…”
Section: Introduction (mentioning)
confidence: 99%
“…Deposition of untargeted MS data in the public domain is experiencing rapid growth, largely thanks to the increasing adoption of universal, nonvendor-specific MS data formats (e.g., mzML format). The two MS data mining tools described in this section, the Mass Spectrometry Search Tool (MASST) 138 and the Mass Spectrometry Query Language (MassQL), 139 have been recently developed by the Dorrestein lab with the aim of making the ever-growing untargeted MS data repositories (∼12,000 data sets comprising ∼7,500,000 files as of September 2022) an easily accessible resource to assist the annotation of unknown molecules and structural analogues.…”
Section: Emerging Tools For Epilipidomics (mentioning)
confidence: 99%
“…MassQL is a novel query language for the mining of MS data. 139 Inspired by the SQL programming language, MassQL implements a consensus vocabulary to search for MS patterns using human-readable query strings. Searchable MS terms include both MS (e.g., precursor ion m/z, isotopic patterns) and MS/MS fragmentation patterns (e.g., diagnostic fragments and neutral losses), with support for both data-dependent (DDA) and data-independent acquisition (DIA).…”
Section: Emerging Tools For Epilipidomics (mentioning)
confidence: 99%
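
To make the kind of pattern described in the statement above concrete, the following is a minimal Python sketch of searching MS/MS spectra for a diagnostic fragment and a neutral loss within an m/z tolerance. It is not MassQL syntax and not the MassQL implementation: the `Spectrum` record, the helper names, the example peak lists, and the 0.01 tolerance are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Spectrum:
    """Hypothetical minimal MS/MS spectrum record (not a MassQL type)."""
    scan: int
    precursor_mz: float
    peaks: List[Tuple[float, float]]  # (m/z, intensity) pairs


def has_fragment(spec: Spectrum, target_mz: float, tol: float = 0.01) -> bool:
    """True if any peak lies within +/- tol of target_mz."""
    return any(abs(mz - target_mz) <= tol for mz, _ in spec.peaks)


def has_neutral_loss(spec: Spectrum, loss: float, tol: float = 0.01) -> bool:
    """True if a peak sits at (precursor m/z - loss), within +/- tol."""
    return has_fragment(spec, spec.precursor_mz - loss, tol)


def find_scans(
    spectra: List[Spectrum],
    fragment_mz: Optional[float] = None,
    neutral_loss: Optional[float] = None,
    tol: float = 0.01,
) -> List[int]:
    """Return scan numbers of spectra that satisfy every condition given."""
    hits = []
    for spec in spectra:
        if fragment_mz is not None and not has_fragment(spec, fragment_mz, tol):
            continue
        if neutral_loss is not None and not has_neutral_loss(spec, neutral_loss, tol):
            continue
        hits.append(spec.scan)
    return hits


# Made-up example data, for illustration only.
spectra = [
    Spectrum(1, 400.20, [(226.18, 1000.0), (382.19, 250.0)]),
    Spectrum(2, 350.10, [(150.05, 800.0), (226.18, 90.0)]),
    Spectrum(3, 500.30, [(120.08, 500.0)]),
]

fragment_hits = find_scans(spectra, fragment_mz=226.18)
combined_hits = find_scans(spectra, fragment_mz=226.18, neutral_loss=18.01)
```

Here `fragment_hits` collects the scans containing the diagnostic fragment, and `combined_hits` additionally requires the neutral loss. A query language expresses the same conditions as a single human-readable string rather than as hand-written filter code.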
“…The richness of plants with a plethora of bioactive metabolites motivated scientists to explore them with interesting chemistries. Consequently, new strategies for dereplication processes, involving mass spectrometry integrated with wide online databases such as PubChem, ChemSpider, and Global Natural Product Social Molecular Networking (GNPS), have attracted more attention to allow the rapid identification of bioactive metabolites in a complex extract [7, 8]. Molecular Networking (MN) via GNPS has emerged as a new approach enabling metabolite annotation together with featuring discriminating components [6, 9].…”
Section: Introduction (mentioning)
confidence: 99%