Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Hu 2018
DOI: 10.18653/v1/n18-1134

Multimodal Frame Identification with Multilingual Evaluation

Abstract: An essential step in FrameNet Semantic Role Labeling is the Frame Identification (FrameId) task, which aims at disambiguating a situation around a predicate. Whilst current FrameId methods rely on textual representations only, we hypothesize that FrameId can profit from a richer understanding of the situational context. Such contextual information can be obtained from common sense knowledge, which is more present in images than in text. In this paper, we extend a state-of-the-art FrameId system in order to eff…

Cited by 15 publications (20 citation statements)
References 31 publications (35 reference statements)
“…Our model underperforms compared to other embedding frameworks from Hermann et al (2014) and Botschen et al (2018), which can be explained through an examination of the input representation methods used by the different models, as well as their disambiguation strategies. The model by Hermann et al (2014) constructs an input representation that encodes the syntactic dependency relations found within the predicate context by concatenating the embeddings for the arguments and learning a mapping to a lower-dimensional space.…”
Section: Results (mentioning)
confidence: 80%
“…The Botschen et al (2018) model is most significantly different from ours in two respects: it uses multimodal embedding representations at the input (textual + visual), and it employs a softmax classifier at the output step, whereas we use MSE as a loss function. Prior work has shown that the first option is more powerful in the context of word sense disambiguation tasks (Popov, 2017).…”
Section: Results (mentioning)
confidence: 99%
“…Work on event semantics hints at two annotation types complementing each other: additional information about participants benefits event prediction (Ahrendt and Demberg, 2016; Botschen et al, 2018) and context information about events benefits the prediction of implicit arguments and entities (Cheng and Erk, 2018). The complementarity is further affirmed by efforts on aligning WD and the FN lexicon: the best alignment approach only maps 37% of the total WD properties to frames (Mousselly-Sergieh and Gurevych, 2016).…”
Section: Complementarity Of Annotations (mentioning)
confidence: 99%
“…Both annotation tools, the WD entity linker as well as the FN frame identifier, introduce some noise: for the entity linker, Sorokin and Gurevych (2018) report 0.73 F-score and the frame identifier has an accuracy of 0.89 (Botschen et al, 2018). We perform a manual error analysis on 50 instances of the test set to understand the effect of the noisy WD annotation.…”
Section: Error Analysis (mentioning)
confidence: 99%