Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), Main Conference, 2006
DOI: 10.3115/1220835.1220849
Alignment by agreement

Abstract: We present an unsupervised approach to symmetric word alignment in which two simple asymmetric models are trained jointly to maximize a combination of data likelihood and agreement between the models. Compared to the standard practice of intersecting predictions of independently-trained models, joint training provides a 32% reduction in AER. Moreover, a simple and efficient pair of HMM aligners provides a 29% reduction in AER over symmetrized IBM model 4 predictions.
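The decoding step described in the abstract can be illustrated with a minimal sketch: the two asymmetric models each produce posterior probabilities over alignment links, and links are kept when the product of the two posteriors is high. The matrices below are illustrative toy numbers, not output from any trained model, and the 0.5 threshold is one common choice rather than a prescribed value.

```python
import numpy as np

# Toy posterior matrices over alignment links for a 2-word sentence pair.
# p_ef[i, j] = posterior that source word i aligns to target word j under
# the e->f model; p_fe[i, j] = posterior for the same link under the f->e
# model. (Illustrative numbers only.)
p_ef = np.array([[0.9, 0.1],
                 [0.2, 0.8]])
p_fe = np.array([[0.8, 0.3],
                 [0.1, 0.9]])

# Combine the two models by multiplying their posteriors per link, then
# keep links whose combined score clears a threshold.
combined = p_ef * p_fe
links = {(i, j)
         for i in range(combined.shape[0])
         for j in range(combined.shape[1])
         if combined[i, j] > 0.5}
print(sorted(links))  # -> [(0, 0), (1, 1)]
```

Multiplying posteriors rewards links both directions agree on, which is the intuition behind training the models to agree in the first place.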

Cited by 240 publications (247 citation statements) · References 12 publications
“…The idea of bidirectional translation is also not unique to CLIR. Machine translation researchers leverage a comparable idea ("alignment by agreement"), which is now available as a replacement for GIZA++ in the Berkeley Aligner (Liang et al., 2006). Comparison of our implementations of IMM and DAMM with variants based on Berkeley alignment results would be a logical first next step towards understanding the potential of these alignments in CLIR applications…”
Section: Results
confidence: 99%
“…For building our APE_B2 system, we set a maximum phrase length of 7 for the translation model, and a 5-gram language model was trained using KenLM (Heafield, 2011). Word alignments between the mt and pe (4.5M synthetic mt-pe data + 12K WMT APE data) were established using the Berkeley Aligner (Liang et al., 2006), while word pairs from hybrid prior alignment (Section 2.1) between mt-pe (12K data) were used for the additional training data to build APE_B2. The reordering model was trained with the hierarchical, monotone, swap, left-to-right bidirectional (hier-mslr-bidirectional) method (Galley and Manning, 2008) and conditioned on both the source and target language.…”
Section: Experiments and Results
confidence: 99%
“…The monolingual mt-pe parallel corpus is first word aligned using a hybrid word alignment method based on the alignment combination of three different statistical word alignment methods: (i) GIZA++ (Och, 2003) word alignment with the grow-diag-final-and (GDFA) heuristic (Koehn, 2010), (ii) Berkeley word alignment (Liang et al., 2006), and (iii) SymGiza++ (Junczys-Dowmunt and Szał, 2012) word alignment, as well as two different edit-distance-based word aligners based on Translation Edit Rate (TER) (Snover et al., 2006) and METEOR (Lavie and Agarwal, 2007). We follow the alignment strategy described in (Pal et al., 2013; Pal et al., 2016a).…”
Section: Hybrid Prior Alignment
confidence: 99%
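The hybrid combination quoted above merges link sets from several aligners. The actual strategy of Pal et al. is more elaborate, but one simple way to realise such a combination is majority voting over proposed links; the sketch below uses that approach with hypothetical toy link sets, not output from the cited aligners.

```python
from collections import Counter

def combine_alignments(alignment_sets, min_votes=2):
    """Keep an alignment link if at least `min_votes` aligners propose it.

    Each element of `alignment_sets` is a set of (source_index,
    target_index) link pairs from one aligner.
    """
    votes = Counter(link for links in alignment_sets for link in links)
    return {link for link, n in votes.items() if n >= min_votes}

# Toy link sets standing in for three aligners' outputs.
giza = {(0, 0), (1, 1), (2, 2)}
berkeley = {(0, 0), (1, 1), (2, 3)}
symgiza = {(0, 0), (1, 2), (2, 2)}

print(sorted(combine_alignments([giza, berkeley, symgiza])))
# -> [(0, 0), (1, 1), (2, 2)]
```

Raising `min_votes` to the number of aligners reduces this to a plain intersection, the conservative baseline that agreement-based methods improve on.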
“…Alternative approaches could be considered for some of the steps of our rule learning procedure in order to further improve the results obtained. The word alignment quality could be improved by integrating symmetrisation into the training of the alignment models as shown by Liang et al. (2006), who reported a reduction in the alignment error rate with small parallel corpora. Regarding the optimisation performed to discard rules that cause a deficient chunking of the sentences to be translated, some changes could be made to the evaluation metric used to compute the set of key text segments I; for instance, Nakov et al. (2012) suggest some improvements to BLEU smoothing which are well suited to sentence-level optimisation.…”
Section: Discussion
confidence: 99%