2016
DOI: 10.1017/s1351324916000115

Recent advances in machine translation using comparable corpora

Abstract: This paper highlights some of the recent developments in the field of machine translation using comparable corpora. We start by updating previous definitions of comparable corpora and then look at bilingual versions of continuous vector space models. Recently, neural networks have been used to obtain latent context representations with only a few dimensions, which are often called word embeddings. These promising new techniques can be applied not only to parallel but also to comparable corpora. Subsequent sections…

Citations: cited by 8 publications (5 citation statements)
References: 24 publications
“…Biblical texts are usually well studied, so both Strong's numbers and morphological information are available for Hebrew and Greek texts. Automated glossing is also a widely studied area for other texts and languages (see Rapp, Sharoff, and Zweigenbaum 2016; McMillan-Major 2020; Zhao et al 2020). Much work in this area has been devoted to methods based on neural networks and word embeddings.…”
Section: Related Work (mentioning)
confidence: 99%
“…The use of information technologies and artificial intelligence in theoretical and applied linguistics is one of the most relevant and promising tracks of interdisciplinary research. Linguistic projects involving the use of computer technologies are proliferating (Alemi & Haeri 2020, Fuertes-Olivera et al 2016, Hirschberg & Manning 2015, Paris et al 2013, Rapp et al 2016). The creation of national corpora, participation of linguists in the development of artificial intelligence systems, the use of artificial intelligence in compiling dictionaries, the application of computers and robotics in language education were in the focus of the Summit.…”
Section: Editorial (mentioning)
confidence: 99%
“…The use of modern computer technologies and artificial intelligence in theoretical and applied linguistics is one of the relevant and promising areas of interdisciplinary research (Alemi & Haeri 2020, Fuertes-Olivera et al 2016, Hirschberg & Manning 2015, Paris et al 2013, Rapp et al 2016). Conducting linguistic research with computer technologies, creating national corpora, the participation of linguists in building artificial intelligence, the use of artificial intelligence in compiling dictionaries, and the use of computers and robotics in education and foreign language teaching are only a few of the questions discussed at the summit.…”
Section: Ru (unclassified)
“…A review of the large body of research on mining parallel sentences in collections of monolingual texts from comparable corpora can be found at Schwenk (2019). The methodology used to extract a comparable corpus is different from that of a parallel one (see, for example, Hewavitharana & Vogel, 2016; Rapp et al, 2016).…”
Section: Introduction (mentioning)
confidence: 99%