2015
DOI: 10.1109/taslp.2015.2425220

Bilingual Continuous-Space Language Model Growing for Statistical Machine Translation

Abstract: Larger n-gram language models (LMs) perform better in statistical machine translation (SMT). However, the existing approaches have two main drawbacks for constructing larger LMs: 1) it is not convenient to obtain larger corpora in the same domain as the bilingual parallel corpora in SMT; 2) most of the previous studies focus on monolingual information from the target corpora only, and redundant n-grams have not been fully utilized in SMT. Nowadays, continuous-space language model (CSLM), especially neural networ…

Cited by 26 publications (11 citation statements)
References 35 publications (86 reference statements)
“…Hence, the features are expressed in a meaningful way. The n-gram technique is implemented to capture the semantic relationship and calculate the frequency of the feature order [37]. The n-gram model is truncated as k −1 and characterizes the sequence of i features into unigram, bigram, and trigram.…”
Section: Methods (mentioning)
confidence: 99%
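
The n-gram frequency counting this statement describes can be illustrated with a minimal sketch. It is not the citing paper's implementation: the function name `ngram_counts` and the toy feature sequence are assumptions made purely for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens in the sequence.

    Returns an empty Counter when the sequence is shorter than n.
    """
    return Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Hypothetical feature sequence, used only to show the counting.
features = ["a", "b", "a", "b", "c"]

unigrams = ngram_counts(features, 1)
bigrams = ngram_counts(features, 2)
trigrams = ngram_counts(features, 3)

# The three most frequent bigrams with their counts.
print(bigrams.most_common(3))
```

Each n-gram is stored as a tuple so that it can serve as a dictionary key; the same function covers unigrams, bigrams, and trigrams by changing `n`.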
“…Word2vec uses word vector presentation mode based on Distributed representation. Distributed representation is proposed by Hinton in 1986 [28]. Its basic thought is to map each word into a -dimension real vector by training ( is a hyperparameter in the model) and to judge the semantic similarity between them according to the distance between words (such as cosine similarity, Euclidean distance).…”
Section: Word2vec (mentioning)
confidence: 99%
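
The two distance measures named in this statement, cosine similarity and Euclidean distance, can be sketched as follows. The three-dimensional vectors below are made-up values for illustration only; real word vectors are learned by training and usually have far more dimensions. The function names are likewise assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical embeddings (not trained values), chosen so that
# "king" and "queen" point in similar directions and "apple" does not.
embeddings = {
    "king": [0.8, 0.3, 0.1],
    "queen": [0.7, 0.4, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

sim_close = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_far = cosine_similarity(embeddings["king"], embeddings["apple"])
print(sim_close > sim_far)
```

A higher cosine similarity, or a smaller Euclidean distance, is read as greater semantic similarity between the two words.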
“…On the other hand, the statistical approach [3]-[6] is data-driven, requiring only a large bilingual corpus. This means that linguists are not required to develop the translation rules.…”
Section: Introduction (mentioning)
confidence: 99%