2020
DOI: 10.21203/rs.3.rs-71854/v1
Preprint

MolDiscovery: Learning Mass Spectrometry Fragmentation of Small Molecules

Abstract: Identification of small molecules is a critical task in various areas of life science. Recent advances in mass spectrometry have enabled the collection of tandem mass spectra of small molecules from hundreds of thousands of environments. To identify which molecules are present in a sample, one can search mass spectra collected from the sample against millions of molecular structures in small molecule databases. This is a challenging task as currently it is not clear how small molecules are fragmented in mass s…

citations
Cited by 3 publications
(4 citation statements)
references
References 12 publications
“…molDiscovery is a mass spectral database search method that improves both efficiency and accuracy of small molecule identification by (i) using an efficient algorithm to generate mass spectrometry fragmentations, and (ii) learning a probabilistic model to match small molecules with their mass spectra (Mohimani et al. 2020). A search of over 8 million spectra from the GNPS molecular networking infrastructure demonstrated that this probabilistic model can correctly identify nearly six times more unique compounds than other previously reported methods.…”
Section: Annotation Tools (mentioning)
confidence: 99%
“…We conducted a paired analysis by collecting MS/MS spectra for extracts from strain cultures grown in ISP1 and R2A media. MS/MS spectra were queried using molDiscovery against a database of PRISM predicted chemical structures from BGCs (12). The database comprised 1,177 structures generated from 180 BGCs sequenced in this study.…”
Section: Insilco PRISM-predicted Chemical Structures (mentioning)
confidence: 99%
“…Similarity to characterized BGCs can also be used for strain dereplication. To assess this application on the current sequencing data, we screened MS/MS spectra for known natural products in the Natural Product Atlas database (3) (29,006 compounds) using molDiscovery (12). Subsequently, we analyzed genome sequences to confirm the presence of the corresponding BGCs.…”
Section: Known Metabolites and Their BGCs (mentioning)
confidence: 99%