Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016
DOI: 10.18653/v1/P16-1100
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models

Abstract: Nearly all previous work on neural machine translation (NMT) has used quite restricted vocabularies, perhaps with a subsequent method to patch in unknown words. This paper presents a novel word-character solution to achieving open vocabulary NMT. We build hybrid systems that translate mostly at the word level and consult the character components for rare words. Our character-level recurrent neural networks compute source word representations and recover unknown target words when needed. The twofold advantage of…
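As a rough illustration of the hybrid mechanism the abstract describes, the sketch below is my own PyTorch code, not the authors' implementation; the class and parameter names (HybridWordEncoder, is_rare, etc.) are hypothetical. Frequent words get ordinary word embeddings, while a rare or unknown word is composed from its characters by an LSTM, whose final hidden state stands in as the word representation.

```python
import torch
import torch.nn as nn

class HybridWordEncoder(nn.Module):
    """Hedged sketch of the hybrid word-character idea (names are illustrative)."""

    def __init__(self, word_vocab, char_vocab, dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, dim)   # for frequent words
        self.char_emb = nn.Embedding(char_vocab, dim)   # for spelling out rare words
        self.char_rnn = nn.LSTM(dim, dim, batch_first=True)

    def embed_word(self, word_id, char_ids, is_rare):
        if is_rare:
            # char_ids: (1, word_length) tensor of character indices;
            # the LSTM's final hidden state becomes the word vector.
            _, (h, _) = self.char_rnn(self.char_emb(char_ids))
            return h[-1]                      # (1, dim) character-composed vector
        return self.word_emb(word_id)         # (1, dim) ordinary word embedding

enc = HybridWordEncoder(word_vocab=50_000, char_vocab=128)
vec = enc.embed_word(torch.tensor([3]), torch.tensor([[10, 4, 7, 22]]), is_rare=True)
print(vec.shape)  # torch.Size([1, 256])
```

A symmetric character-level decoder can then spell out target words the word-level softmax cannot produce, which is how the paper's systems avoid emitting unknown-word tokens.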

Cited by 266 publications (256 citation statements) · References 20 publications
“…Chung et al. [4] focus on handling translation at the character level, without any word segmentation, on the target side only. Luong et al. [13] propose a novel hybrid architecture that combines the strengths of both word- and character-based models. Sennrich et al. [20] use the BPE method to encode rare and unknown words as sequences of subword units.…”
Section: Related Work
confidence: 99%
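To make the BPE reference concrete, here is a minimal, self-contained Python sketch of the merge-learning loop in the spirit of Sennrich et al. [20]; the function name, toy corpus, and simplifications (no frequency thresholds, no final vocabulary output) are mine. Each iteration merges the most frequent adjacent symbol pair, so rare words end up segmented into frequent subword units.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    # Represent each word as a tuple of symbols, with an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the merge everywhere it occurs.
        merged = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] += freq
        vocab = merged
    return merges

print(learn_bpe(["low", "lower", "lowest", "low"], num_merges=3))
# e.g. [('l', 'o'), ('lo', 'w'), ('low', '</w>')]
```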
“…Most of these operate below the word level, e.g. characters [4], hybrid word-character models [13,25], and more intelligent subwords [20,25]. Besides, pioneering studies [25,8] demonstrate that translation tasks involving Chinese are among the most difficult problems for NMT systems.…”
Section: Introduction
confidence: 99%
“…Let us give just a few examples of usage: text classification [16], part-of-speech tagging [17,18], language modeling [19], sentiment analysis [20], and text normalization [21]. Recently, the concept of using subwords to form a representation has appeared [22,23]. Another work [24] suggests guiding word embeddings with morphologically annotated data and demonstrates gains using German as a case study.…”
Section: Introduction
confidence: 99%
“…This type of model needs no tokenization, freeing the system from one source of errors. Character-level neural models have been applied to several NLP tasks, ranging from relatively basic tasks such as text categorization and language modeling to complex prediction tasks such as translation (Luong and Manning, 2016; Sennrich et al., 2016). In particular, character-based neural models are attractive because they can take sub-word units, such as morphology, into account. Morphological analysis and prediction models using character-based recurrent neural networks have recently become popular, as evidenced by their complete dominance at the SIGMORPHON shared task on morphological reinflection.…”
Section: Introduction
confidence: 99%
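As a trivial illustration of the "no tokenization" point above (my own example, not from any cited paper), a character-level model consumes the raw string directly, so no upstream word segmenter can introduce errors:

```python
# The raw string itself is the input sequence: one integer per character
# (here simply Unicode code points), with no word splitting anywhere.
text = "¡Hola, mundo!"              # works for any script or language
char_ids = [ord(c) for c in text]
print(char_ids[:5])                 # [161, 72, 111, 108, 97]
```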