2021
DOI: 10.1007/978-3-030-73197-7_35
Attention-Based Multimodal Entity Linking with High-Quality Images

Citations: cited by 19 publications (27 citation statements)
References: 23 publications
“…To alleviate the long-term dependency problem and extract worthy information, some DL-based EL systems leverage BERT [83] to encode the entity description in order to learn the entity embedding. Specifically, several EL works [67], [68], [69], [77] fed the entity description to BERT as a sequence of words. Most of these works [68], [69], [77] inserted the special start token [CLS] at the beginning of the input sequence and regarded the Transformer encoder's last-layer output at this start token as the vector representation of the input sequence.…”
Section: Entity Description
confidence: 99%
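The [CLS]-based encoding described in this citation statement can be illustrated with a minimal sketch. This is an assumption for illustration only, not code from the cited works; the checkpoint name bert-base-uncased and the example description are hypothetical choices, using the Hugging Face transformers API:

import torch
from transformers import BertTokenizer, BertModel

# Load a pretrained BERT encoder; the checkpoint is an illustrative choice.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# A hypothetical entity description, fed to BERT as a sequence of words.
description = "Paris is the capital and most populous city of France."

# The tokenizer automatically prepends the special start token [CLS].
inputs = tokenizer(description, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

# The last-layer output at the [CLS] position (index 0) is taken as the
# vector representation of the whole description.
entity_embedding = outputs.last_hidden_state[:, 0, :]   # shape: (1, 768)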
“…Specifically, several EL works [67], [68], [69], [77] fed the entity description to BERT as a sequence of words. Most of these works [68], [69], [77] inserted the special start token [CLS] at the beginning of the input sequence and regarded the Transformer encoder's last-layer output at this start token as the vector representation of the input sequence. Fang et al. [67] instead obtained the entity embedding via average-pooling over the hidden states of all description tokens in the last BERT layer.…”
Section: Entity Description
confidence: 99%
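The average-pooling variant attributed to Fang et al. above can be sketched in the same way. Again this is only an assumed illustration, not the authors' code; the checkpoint and descriptions are hypothetical, and padding positions are masked out so they do not dilute the mean:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical candidate-entity descriptions.
descriptions = [
    "Paris is the capital and most populous city of France.",
    "Paris Hilton is an American media personality and businesswoman.",
]
inputs = tokenizer(descriptions, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state           # (batch, seq_len, 768)

# Average-pool the last-layer hidden states over all description tokens,
# ignoring padded positions via the attention mask.
mask = inputs["attention_mask"].unsqueeze(-1).float()    # (batch, seq_len, 1)
entity_embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # (batch, 768)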