2022
DOI: 10.1007/978-3-031-00129-1_24

PromptMNER: Prompt-Based Entity-Related Visual Clue Extraction and Integration for Multimodal Named Entity Recognition

Cited by 14 publications (4 citation statements) | References 15 publications

“…The first group of methods includes BiLSTM-CRF (Huang et al, 2015), BERT-CRF (Devlin et al, 2018) as well as the span-based NER models (e.g., BERT-span, RoBERTa-span (Yamada et al, 2020)), which only consider the original text. The second group of methods includes several recent multimodal approaches for the MNER task: UMT (Yu et al, 2020), UMGF, MNER-QG (Jia et al, 2022), R-GCN, ITA (Wang et al, 2021a), PromptMNER (Wang et al, 2022b), CAT-MNER (Wang et al, 2022c) and MoRe (Wang et al, 2022a), which consider both text and corresponding images.…”
Section: Results (mentioning)
confidence: 99%
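
As background for the span-based baselines named above: rather than tagging individual tokens, these models enumerate candidate token spans and classify each one from its contextual representation. Below is a minimal sketch of that idea in Python, assuming a HuggingFace BERT encoder, a hypothetical label set, and an untrained classifier head; none of this is taken from the cited papers:

    import torch
    from transformers import AutoModel, AutoTokenizer

    # Sketch of span-based NER: encode the sentence once, then score
    # every candidate span from its start/end token embeddings.
    LABELS = ["O", "PER", "LOC"]  # illustrative label set
    MAX_WIDTH = 4                 # illustrative maximum span width

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    encoder = AutoModel.from_pretrained("bert-base-cased")
    classifier = torch.nn.Linear(2 * encoder.config.hidden_size, len(LABELS))

    def score_spans(sentence: str):
        inputs = tokenizer(sentence, return_tensors="pt")
        hidden = encoder(**inputs).last_hidden_state[0]  # (seq_len, hidden_size)
        predictions = []
        for start in range(hidden.size(0)):
            for end in range(start, min(start + MAX_WIDTH, hidden.size(0))):
                span_rep = torch.cat([hidden[start], hidden[end]])
                label = LABELS[classifier(span_rep).argmax().item()]
                predictions.append(((start, end), label))
        return predictions

In practice the classifier head is trained jointly with the encoder; the multimodal systems in the second group keep a text-side prediction head of this kind but fuse image-derived features into the representation first.
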
“…The version of ChatGPT used in the experiments is gpt-3.5-turbo, and the sampling temperature is set to 0. For a fair comparison, PGIM uses the same text encoder, XLM-RoBERTa large (Conneau et al, 2019), as ITA (Wang et al, 2021a), PromptMNER (Wang et al, 2022b), CAT-MNER (Wang et al, 2022c) and MoRe (Wang et al, 2022a).…”
Section: Stage-2 Entity Prediction Based On Auxiliary Refined Knowledge (mentioning)
confidence: 99%
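
For anyone reproducing this setup, the shared text encoder is available on the HuggingFace Hub. A minimal loading sketch follows; the usage shown is an assumption for illustration, not PGIM's actual code:

    from transformers import AutoModel, AutoTokenizer

    # XLM-RoBERTa large: the text encoder shared by ITA, PromptMNER,
    # CAT-MNER, MoRe, and PGIM, per the citation statement above.
    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
    encoder = AutoModel.from_pretrained("xlm-roberta-large")

    inputs = tokenizer("Lionel Messi wins the World Cup", return_tensors="pt")
    hidden_states = encoder(**inputs).last_hidden_state  # (1, seq_len, 1024)
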
“…There are some other approaches that do not directly use the visual information from the images, but they open new paths to mining the hidden information behind the image. (Wang et al 2022b) designs several prompt templates for each image to bridge the gap…”
Section: Related Work (mentioning)
confidence: 99%
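
To make the prompt-template idea concrete, here is one way such templates could be scored against an image with an off-the-shelf vision-language model. This is a hedged sketch, not the paper's implementation: CLIP stands in for whatever image-text matcher PromptMNER actually uses, and the template wording and candidate list are hypothetical:

    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    # Illustrative prompt-based visual clue extraction: rank
    # entity-related prompts by image-text similarity with CLIP.
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def rank_visual_clues(image: Image.Image, candidates: list[str]):
        # Hypothetical template; the paper's actual templates may differ.
        prompts = [f"an image of {c}" for c in candidates]
        inputs = processor(text=prompts, images=image,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            logits = model(**inputs).logits_per_image[0]  # one score per prompt
        order = logits.argsort(descending=True)
        return [(candidates[i], logits[i].item()) for i in order]

The highest-scoring prompts then serve as entity-related visual clues for the text model, which is the gap-bridging role the citation statement describes.
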
“…(Wang et al 2022b; Jia et al 2023), which is used to guide words to get the expanded visual semantic information.…”
Section: Introduction (mentioning)
confidence: 99%