Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
DOI: 10.18653/v1/2021.emnlp-main.47

Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries

Abstract: We propose a new task, Text2Mol, to retrieve molecules using natural language descriptions as queries. Natural language and molecules encode information in very different ways, which leads to the exciting but challenging problem of integrating these two very different modalities. Although some work has been done on text-based retrieval and structure-based retrieval, this new task requires integrating molecules and natural language more directly. Moreover, this can be viewed as an especially challenging cross-lingual retrieval problem.
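At inference time, the task described in the abstract reduces to nearest-neighbor search in a learned joint embedding space. Below is a minimal sketch of that retrieval step, assuming hypothetical placeholder encoders; the function names and hash-based encodings are illustrative stand-ins, not the paper's implementation.

```python
# Minimal sketch of cross-modal retrieval (illustrative only): rank candidate
# molecules by cosine similarity between a text-query embedding and molecule
# embeddings in a shared semantic space. The encoders are placeholders.
import numpy as np

def encode_text(query: str, dim: int = 64) -> np.ndarray:
    """Placeholder text encoder (the paper uses a BERT-based encoder)."""
    seed = abs(hash(query)) % (2**32)      # deterministic toy embedding
    return np.random.default_rng(seed).normal(size=dim)

def encode_molecule(smiles: str, dim: int = 64) -> np.ndarray:
    """Placeholder molecule encoder (the paper combines an MLP and a GCN)."""
    seed = abs(hash(smiles)) % (2**32)
    return np.random.default_rng(seed).normal(size=dim)

def retrieve(query: str, candidates: list[str], top_k: int = 3):
    q = encode_text(query)
    q = q / np.linalg.norm(q)
    mols = np.stack([encode_molecule(s) for s in candidates])
    mols = mols / np.linalg.norm(mols, axis=1, keepdims=True)
    scores = mols @ q                      # cosine similarities
    order = np.argsort(-scores)[:top_k]    # highest similarity first
    return [(candidates[i], float(scores[i])) for i in order]

print(retrieve("an aromatic alcohol", ["c1ccccc1O", "CCO", "CC(=O)O"]))
```

In the actual system the embeddings come from trained encoders rather than hashes; the ranking step itself is the same cosine-similarity search, so molecule embeddings can be pre-computed offline.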

Cited by 41 publications (80 citation statements). References 43 publications.
“…Text Molecule Generation. Text2Mol [189] is a cross-modal information retrieval system that retrieves molecule graphs from natural language descriptions. A BERT-based text encoder and a combined MLP-GCN molecule encoder are used to create multi-modal embeddings in a shared semantic space, which is aligned by contrastive learning on paired data.…”
Section: Text Graph Generation (mentioning)
confidence: 99%
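The dual-encoder alignment this excerpt describes can be illustrated with a short, self-contained example. The following is a toy sketch, not the authors' code: a mean-pooled embedding layer plus a linear projection stands in for the BERT text encoder, a single dense GCN layer (H' = ReLU(Â H W)) stands in for the MLP-GCN molecule encoder, and a symmetric InfoNCE-style loss pulls matched text/molecule pairs onto the diagonal of the similarity matrix. All shapes and hyperparameters are assumptions.

```python
# Illustrative sketch of contrastive alignment between a text encoder and a
# GCN molecule encoder. Encoder choices, shapes, and hyperparameters are
# assumptions for the sketch, not the paper's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    """Stand-in for the BERT-based text encoder: mean-pooled embeddings + projection."""
    def __init__(self, vocab: int = 1000, dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):             # token_ids: (batch, seq_len)
        return self.proj(self.emb(token_ids).mean(dim=1))

class GCNMolEncoder(nn.Module):
    """Stand-in molecule encoder: one GCN layer H' = ReLU(A_hat H W), then mean-pool."""
    def __init__(self, feat: int = 16, dim: int = 64):
        super().__init__()
        self.w = nn.Linear(feat, dim)

    def forward(self, a_hat, x):               # a_hat: (batch, n, n), x: (batch, n, feat)
        h = F.relu(a_hat @ self.w(x))          # message passing over the atom graph
        return h.mean(dim=1)                   # graph-level embedding

def contrastive_loss(t, m, temperature: float = 0.07):
    """Symmetric InfoNCE: matched text/molecule pairs sit on the diagonal."""
    t = F.normalize(t, dim=1)
    m = F.normalize(m, dim=1)
    logits = (t @ m.T) / temperature
    labels = torch.arange(t.size(0))
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2

# Toy forward/backward pass on random paired data.
B, N = 8, 12
text_enc, mol_enc = TextEncoder(), GCNMolEncoder()
tokens = torch.randint(0, 1000, (B, 20))
adj = torch.eye(N).expand(B, N, N)             # placeholder normalized adjacency
feats = torch.randn(B, N, 16)
loss = contrastive_loss(text_enc(tokens), mol_enc(adj, feats))
loss.backward()
print(float(loss))
```

The key property of this design is that the two encoders never interact until the loss: each modality is embedded independently, which is what makes pre-computing molecule embeddings, and hence cheap retrieval, possible.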
“…Table 6. Major text graph models.

Year | Method | Task | Code
2016 | Li et al. [165] | Text-to-KG Generation | -
2019 | KG-BERT [166] | Text-to-KG Generation | https://github.com/yao8839836/kg-bert
2020 | Malaviya et al. [167] | Text-to-KG Generation | https://github.com/allenai/commonsense-kg-completion
2019 | Petroni et al. | Text-to-KG Generation | https://github.com/facebookresearch/LAMA
2020 | Shin et al. | Text-to-KG Generation | https://github.com/ucinlp/autoprompt
2021 | Li et al. | Text-to-KG Generation | https://github.com/XiangLi1999/PrefixTuning
2022 | Lu et al. [173] | Text-to-KG Generation | https://github.com/universal-ie/UIE
2022 | Grapher [174] | Text-to-KG Generation | https://github.com/ibm/grapher
2020 | CycleGT [171] | Text-KG Generation | https://github.com/QipengGuo/CycleGT
2020 | DualTKB [172] | Text-KG Generation | -
2018 | GTR-LSTM [176] | KG-to-Text Generation | -
2018 | Song et al. [177] | KG-to-Text Generation | https://github.com/freesunshine0316/neural-graph-to-seq-mp
2020 | DUALENC [175] | KG-to-Text Generation | https://github.com/zhaochaocs/DualEnc
2019 | Koncel-Kedziorski et al. [178] | KG-to-Text Generation | https://github.com/rikdz/GraphWriter
2020 | Ribeiro et al. [180] | KG-to-Text Generation | https://github.com/UKPLab/kg2text
2020 | HetGT [181] | KG-to-Text Generation | https://github.com/QAQ-v/HetGT
2016 | Dong et al. [182] | Semantic Parsing | https://github.com/donglixp/lang2logic
2016 | Jia et al. [183] | Semantic Parsing | https://worksheets.codalab.org/...
2018 | Lyu et al. [184] | Semantic Parsing | https://github.com/...PREDICTION
2018 | Chen et al. [185] | Semantic Parsing | https://github.com/dongpobeyond/Seq2Act
2019 | Zhang et al. [186] | Semantic Parsing | https://github.com/sheng-z/stog
2019 | Fancellu et al. [187] | Semantic Parsing | -
2021 | Text2Mol [189] | Text-Molecule Generation | https://github.com/cnedwards/text2mol
2022 | MolT5 [190] | Text-Molecule Generation | https://github.com/blender-nlp/MolT5
2022 | MoMu [188] | Text-Molecule Generation | https://github.com/bingsu12/momu…”
Section: A Curated Advances in Generative AI (mentioning)
confidence: 99%
“…prediction (Schwaller et al., 2019), conditional compound generation (Born et al., 2021b;a), retrosynthesis (Schwaller et al., 2020), text-conditional de novo generation (Edwards et al., 2021), molecule generation (Born and Manica, 2023), protein structure prediction (Jumper et al., 2021), among others. By interpreting chemistry as a programmable language for the life sciences, transformer-based models are revolutionizing the chemical discovery pipeline, significantly speeding up laboratory and design automation (O'Neill, 2021; Vaucher et al., 2020), and paving the way for an age of accelerated discovery in science and engineering.…”
Section: Introduction (mentioning)
confidence: 99%
“…When multiple domains are considered, e.g., generating a novel molecule from its technical description in natural language, merging information is challenging due to the domain shift between language and chemistry. Current solutions often involve pre-training the model on large, single-domain datasets and fine-tuning it on each task (Edwards et al., 2021), resulting in high computational expense, sample inefficiency, and the need to repeat the process for each use case.…”
Section: Introduction (mentioning)
confidence: 99%