Proceedings of the 2019 3rd International Conference on Natural Language Processing and Information Retrieval
DOI: 10.1145/3342827.3342846

Evaluation of Morphological Embeddings for the Russian Language

Abstract: This paper evaluates morphology-based embeddings for the English and Russian languages. Despite the interest in morphology-based word embedding models, several of which have been introduced in the past with claimed performance improvements on word-similarity and language-modeling tasks, our experiments show no stable advantage over two baseline models, SkipGram and FastText: the performance of the morphological embeddings lies between that of the two baselines.
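To make the comparison concrete, here is a minimal sketch (our own, not the authors' code) of how such a baseline evaluation can be run with gensim: train SkipGram and FastText on the same corpus and score both against a human-judged word-similarity benchmark. The corpus path and the benchmark file name are hypothetical placeholders.

```python
from gensim.models import Word2Vec, FastText
from gensim.models.word2vec import LineSentence

# Hypothetical corpus: one tokenized sentence per line.
corpus = LineSentence("corpus.txt")

# SkipGram baseline (sg=1 selects skip-gram over CBOW).
sg = Word2Vec(corpus, vector_size=300, sg=1, window=5, min_count=5, epochs=5)

# FastText baseline: same objective, but adds character n-gram (subword) vectors.
ft = FastText(corpus, vector_size=300, sg=1, window=5, min_count=5, epochs=5)

# Word-similarity evaluation: Spearman correlation against human judgments.
# "ru_simlex.tsv" is a hypothetical tab-separated file: word1, word2, score.
for name, model in [("SkipGram", sg), ("FastText", ft)]:
    pearson, spearman, oov = model.wv.evaluate_word_pairs("ru_simlex.tsv")
    print(f"{name}: Spearman rho = {spearman[0]:.3f}, OOV = {oov:.1f}%")
```

FastText differs from SkipGram only in its subword vectors, which is what makes the pair a natural bracket for morphology-aware models.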

Cited by 4 publications (2 citation statements). References 23 publications.

“…This fine-tuning made it possible not only to preserve the features of both corpora but also to use available pre-trained models with the minimum of changes. Even though multiple experiments with various types of word embeddings for inflected and agglutinative languages (Üstün et al., 2018; Romanov and Khusainova, 2019) have shown that morphological subword embeddings perform very well for the Slavic languages, these distributed word representations are still limited by their static nature, i.e., their inability to change depending on the context. However, new information-extraction opportunities arrived with the introduction of deep contextualized word embeddings such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2019).…”
Section: Techniques
Citation type: mentioning
Confidence: 99%
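The static-versus-contextual limitation described in this excerpt can be illustrated with a short sketch (our own, not from the cited work): a BERT model assigns different vectors to the same surface form in different sentences, which no static embedding can do. The checkpoint name is the standard multilingual BERT; the example sentences are our own.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
bert.eval()

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Mean of the final hidden states of the subword pieces of `word`."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    pieces = tok(word, add_special_tokens=False)["input_ids"]
    seq = enc["input_ids"][0].tolist()
    # Locate the first occurrence of the word's subword pieces in the sentence.
    for i in range(len(seq) - len(pieces) + 1):
        if seq[i:i + len(pieces)] == pieces:
            return hidden[i:i + len(pieces)].mean(dim=0)
    raise ValueError("word not found in sentence")

# "ключ" means "key" in the first sentence and "spring (of water)" in the second.
v1 = word_vector("Он потерял ключ от двери.", "ключ")
v2 = word_vector("Из-под земли бьёт холодный ключ.", "ключ")
print(torch.cosine_similarity(v1, v2, dim=0).item())  # < 1.0: the vector depends on context
```

A static SkipGram or FastText model would return the identical vector for both occurrences, which is exactly the limitation the citing authors point out.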
“…Moreover, the aforementioned language-processing tasks affect the performance of ML algorithms [25][26][27]. In this context, BERT appears to offer a way to capture the contextual meaning of morphologically complex words [28][29][30] without those preprocessing steps. This is one of the primary motivations of this study.…”
Section: Introduction
Citation type: mentioning
Confidence: 99%
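As a small illustration of the point made in this excerpt (again our own sketch, not from the cited study): BERT's WordPiece tokenizer decomposes a morphologically complex Russian word into subword pieces with no separate lemmatization or morphological-analysis step. The example word is our own choice.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
# "преподавательница" = "(female) teacher", built from several morphemes.
print(tok.tokenize("преподавательница"))
# Output is a list of WordPiece fragments, e.g. ['пре', '##пода', ...];
# the split is frequency-driven and need not match true morpheme boundaries.
```

That last caveat is why evaluations like the one in this paper remain relevant: data-driven subword splits are not the same thing as genuine morphological segmentation.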