Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.296

Verbal Multiword Expression Identification: Do We Need a Sledgehammer to Crack a Nut?

Abstract: Automatic identification of multiword expressions (MWEs), like to cut corners 'to do an incomplete job', is a pre-requisite for semantically-oriented downstream applications. This task is challenging because MWEs, especially verbal ones (VMWEs), exhibit surface variability. This paper deals with a subproblem of VMWE identification: the identification of occurrences of previously seen VMWEs. A simple language-independent system based on a combination of filters competes with the best systems from a recent shar…
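The idea of identifying occurrences of previously seen VMWEs can be illustrated with a minimal sketch. This is not the authors' system: the function names, the lemma-multiset representation, and the single `max_gap` filter are assumptions made for illustration only, and the abstract does not specify which filters the paper combines.

```python
from itertools import product


def learn_seen_vmwes(training_annotations):
    """Store each annotated VMWE as a sorted tuple of lemmas (a multiset)."""
    return {tuple(sorted(lemmas)) for lemmas in training_annotations}


def find_candidates(sentence_lemmas, seen, max_gap=3):
    """Return sorted token-index tuples whose lemmas match a seen VMWE.

    Hypothetical filter: neighbouring components may be separated by
    at most `max_gap` intervening tokens.
    """
    matches = set()
    for vmwe in seen:
        # Positions in the sentence where each component lemma occurs.
        positions = [
            [i for i, lemma in enumerate(sentence_lemmas) if lemma == component]
            for component in vmwe
        ]
        for combo in product(*positions):
            idx = tuple(sorted(combo))
            if len(set(idx)) < len(idx):
                continue  # one token cannot fill two component slots
            gaps = [b - a - 1 for a, b in zip(idx, idx[1:])]
            if all(gap <= max_gap for gap in gaps):
                matches.add(idx)
    return sorted(matches)


# Toy data: "cut corners" seen in training, then found with intervening words.
seen = learn_seen_vmwes([["cut", "corner"]])
sentence = ["he", "cut", "too", "many", "corner", "again"]
print(find_candidates(sentence, seen))  # [(1, 4)]
```

Matching on lemmas rather than surface forms is what lets the sketch tolerate inflection and inserted words, which is the kind of surface variability the abstract describes.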

Cited by 10 publications (16 citation statements)
References 9 publications (14 reference statements)
“…Evaluations related to MWEs have often focused on specific tasks (Constant et al, 2017) like, for example, on MWE token identification (Scholivet and Ramisch, 2017; Pasquer et al, 2020), evaluating the sensitivity of a model to the occurrence of MWEs in sentences or looking at their interpretation (Cordeiro et al, 2019; Garcia et al, 2021a), assessing to what extent models are able to represent their potential idiomaticity (Tayyar Madabushi et al, 2021, 2022). In particular the results obtained for MWE interpretation by a variety of off-the-shelf embeddings with different levels of contextualisation (including transformer-based models) indicated that idiomaticity was not yet accurately captured even when adopting large pre-trained language models, for any of the two languages covered, English and Portuguese (Garcia et al, 2021b,a).…”
Section: Related Work (mentioning)
confidence: 99%
“…Figure 1 shows the distribution of papers across the 24 languages considered by our paper sample. The reasons that lead to choosing a given corpus and/or set of languages in non-ST works are various: language diversity (Zampieri et al, 2019), corpus domain (Liu et al, 2021), and corpus quality and size (Pasquer et al, 2020b).…”
Section: Corpus Constitution and Selection (mentioning)
confidence: 99%
“…Even though this study focused on L1 and L2 students, it supports the investigation of "MWE across proficiency levels or novice and professional academic writing [...] in an academic context." [11, p.11] The ongoing variability of MWE, particularly verbal ones, and the challenges it poses for automated machine learning identification tasks are recognized by [13]. The authors focused on strategies to improve the identification of verbal MWE (VMWE) and used the PARSEME 7 corpora as a starting point.…”
Section: Related Work (mentioning)
confidence: 99%
“…The authors focused on strategies to improve the identification of verbal MWE (VMWE) and used the PARSEME 7 corpora as a starting point. They completed three tasks comprised of a training and development phase, a prediction phase, and an evaluation phase all aided by a simple [candidate VMWE potential] extraction of filtering (Seen2020) techniques for precision [13]. Promising results were evidenced in boosting the identification of global MWE by using a combination of morphosyntactic filters.…”
Section: Related Work (mentioning)
confidence: 99%