Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications (MWE '09), 2009
DOI: 10.3115/1698239.1698245

Exploiting translational correspondences for pattern-independent MWE identification

Abstract: Based on a study of verb translations in the Europarl corpus, we argue that a wide range of MWE patterns can be identified in translations that exhibit a correspondence between a single lexical item in the source language and a group of lexical items in the target language. We show that these correspondences can be reliably detected on dependency-parsed, word-aligned sentences. We propose an extraction method that combines word alignment with syntactic filters and is independent of the structural pattern of th…

citations
Cited by 13 publications
(12 citation statements)
references
References 10 publications
“…If two or more words in a source language are aligned to the same word on the target side, the source is likely an MWE (Caseli et al, 2010). Conversely, one can assume that some types of MWEs such as verb-noun combinations tend to be translated as MWEs with the same syntactic structure, using aligned dependency-parsed corpora for discovery (Zarrieß and Kuhn, 2009). Instead of focusing on 1-to-many alignments, Tsvetkov and Wintner (2010) propose a method which incrementally removes from parallel sentences word pairs that are surely not MWEs.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the last 20 years, multilingual discovery of MWEs has gained some popularity thanks to the widespread use of statistical machine translation (MT), automatic word alignment tools and freely available parallel corpora (Zarrieß and Kuhn, 2009; Attia et al, 2010; Caseli et al, 2010). MWEs tend to be non-compositional or show some kind of lexico-syntactic inflexibility, which is often reflected in translation asymmetries (Manning and Schütze, 1999).…”
Section: Introduction (mentioning)
confidence: 99%
“…Moreover, their method was able to extract possible translations for the MWEs in question, thus providing an efficient way to improve the coverage of bilingual lexical resources. Zarriess and Kuhn (2009) had previously argued that MWE patterns could be identified from one-to-many alignments in bilingual corpora in conjunction with syntactic filters. Caseli et al (2010) draw on a previous study by Villada Moirón and Tiedemann (2006), who extract MWE candidates using association measures and head dependence heuristics before using alignment for ranking purposes.…”
Section: Introduction (mentioning)
confidence: 99%
“…This method is also applied to the pediatrics domain (Caseli et al, 2009). Zarrieß and Kuhn (2009) argue that multiword expressions can be reliably detected in parallel corpora by using dependency-parsed, word-aligned sentences. For one-to-many translation pairs, they apply a generate-and-filter strategy: first, aligned syntactic configurations are generated, which are then filtered and post-edited.…”
Section: Parallel Corpora In Identifying Multiword Expressions (mentioning)
confidence: 99%
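
The one-to-many idea described in the statement above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input format, and the choice of filter are assumptions made for illustration. Here the "syntactic filter" is simplified to a single check that the aligned target words form a connected piece of the target dependency tree, and the German-English sentence pair is invented.

```python
from collections import defaultdict


def is_connected(nodes, heads):
    """Return True if `nodes` form a connected subgraph of the dependency tree.

    `heads` maps each token index to the index of its head (None for the root).
    In a tree, the subgraph induced by a node set is connected exactly when
    one and only one of its nodes has its head outside the set.
    """
    tops = [n for n in nodes if heads.get(n) not in nodes]
    return len(tops) == 1


def one_to_many_candidates(alignment, target_heads):
    """Collect source tokens aligned to two or more connected target tokens.

    `alignment` is an iterable of (source_index, target_index) pairs, as a
    word aligner might produce (hypothetical input format).
    Returns {source_index: sorted list of target indices}.
    """
    targets = defaultdict(set)
    for src, tgt in alignment:
        targets[src].add(tgt)

    candidates = {}
    for src, tgts in targets.items():
        # Assumed filter: keep only one-to-many links whose target side
        # hangs together syntactically.
        if len(tgts) >= 2 and is_connected(tgts, target_heads):
            candidates[src] = sorted(tgts)
    return candidates


# Invented example: "Wir bezweifeln das" / "We call that into question"
# Target tokens: We(0) call(1) that(2) into(3) question(4)
target_heads = {0: 1, 1: None, 2: 1, 3: 4, 4: 1}
alignment = [(0, 0), (1, 1), (1, 3), (1, 4), (2, 2)]

result = one_to_many_candidates(alignment, target_heads)
print(result)
```

In this toy pair the single source verb at index 1 is linked to the target words at indices 1, 3 and 4 ("call ... into question"), which are connected in the dependency tree even though they are not adjacent in the sentence. A real system would need further filtering and manual post-editing, as the statement notes.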
“…For instance, one-to-many alignment can be exploited: if a word corresponds to several words in the other language, it is highly probable that the other language equivalent can be considered as a multiword expression (see e.g. Caseli et al (2009), Caseli et al (2010), Zarrieß and Kuhn (2009), Sinha (2009), Attia et al (2010) or Haugereid and Bond (2011)). However, this method cannot identify multiword expressions that are aligned to another multiword expression in the other language.…”
Section: Related Work On the Automatic Identification Of Multiword Ex… (mentioning)
confidence: 99%