2018
DOI: 10.48550/arxiv.1803.01400
Preprint
Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations

Abstract: Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performance of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In…
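As a rough illustration of the idea sketched in the abstract, the snippet below (Python/NumPy, not the authors' released code) computes element-wise power means of a sentence's word vectors for several powers p and concatenates the results across embedding spaces. The lookup dictionaries, the default power set {1, +inf, -inf}, and the sign-preserving root used for odd powers are assumptions made for this example.

```python
import numpy as np

def power_mean(vectors, p):
    """Element-wise power mean of a list of word vectors.
    p = 1 is the arithmetic mean; p = +inf / -inf reduce to max / min pooling."""
    X = np.stack(vectors)                      # (num_words, dim)
    if p == float("inf"):
        return X.max(axis=0)
    if p == float("-inf"):
        return X.min(axis=0)
    m = np.mean(np.power(X, p), axis=0)
    # Sign-preserving root keeps odd powers real for negative components
    # (an implementation choice of this sketch, not necessarily the paper's).
    return np.sign(m) * np.abs(m) ** (1.0 / p)

def sentence_embedding(tokens, embedding_spaces,
                       powers=(1.0, float("inf"), float("-inf"))):
    """Concatenate power means over several embedding spaces.
    `embedding_spaces`: list of dicts mapping token -> np.ndarray (assumed here)."""
    parts = []
    for lookup in embedding_spaces:
        vectors = [lookup[t] for t in tokens if t in lookup]
        parts.extend(power_mean(vectors, p) for p in powers)
    return np.concatenate(parts)               # dim = len(powers) * sum of space dims
```

With, for instance, three 300-dimensional embedding spaces and three powers, this produces a 2,700-dimensional sentence vector, which is the kind of concatenated representation the abstract refers to.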

Cited by 20 publications (29 citation statements)
References 23 publications
“…GloVe embedding; b) SIF [11]: derived from an improved random-walk model; consists of two parts, weighted averaging of word vectors and first principal component removal; c) p-means [14]: concatenates different word embedding models and different power ratios; d) DCT [15]: introduces the discrete cosine transform into sentence sequential modeling; e) VLAWE [18]: introduces VLAD (vector of locally aggregated descriptors) into the sentence embedding field; 2) Parameterized Models: a) Skip-thought [5]: extends word2vec's unsupervised training objective from the word level to the sentence level; b) InferSent [6]: a bi-directional LSTM encoder trained on high-quality sentence inference data; c) Sent2Vec [21]: learns n-gram word representations and uses their average as the sentence representation.…”
Section: Methods (mentioning)
Confidence: 99%
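For context on item (b) in the statement above, here is a minimal sketch of a SIF-style pipeline: frequency-weighted averaging of word vectors followed by removal of the first principal component. The token-to-vector lookup, the unigram probability table, and the smoothing constant `a` are illustrative assumptions, not taken from the cited paper's code.

```python
import numpy as np

def sif_embeddings(sentences, lookup, unigram_prob, a=1e-3):
    """SIF-style sentence embeddings: weight each word vector by a / (a + p(w)),
    average per sentence, then subtract the projection onto the first
    principal component of the resulting sentence matrix."""
    embs = []
    for tokens in sentences:
        weighted = [lookup[w] * (a / (a + unigram_prob.get(w, 0.0)))
                    for w in tokens if w in lookup]
        embs.append(np.mean(weighted, axis=0))
    X = np.stack(embs)                               # (num_sentences, dim)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    pc = vt[0]                                       # first principal direction
    return X - np.outer(X @ pc, pc)                  # remove the common component
```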
“…Experimental results on supervised tasks are shown in Table II. The S3E method outperforms all non-parameterized models, including DCT [15], VLAWE [18] and p-means [14].…” (footnote 4: https://github.com/facebookresearch/SentEval)
Section: B. Supervised Tasks (mentioning)
Confidence: 99%
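The footnote above points to SentEval, the toolkit commonly used for these supervised comparisons. The snippet below is a hedged usage sketch following the general pattern of the facebookresearch/SentEval README; the placeholder hashed bag-of-words encoder, the data path, and the chosen tasks are assumptions for illustration and would be replaced by the encoder under evaluation (e.g. p-means).

```python
import numpy as np
import senteval  # https://github.com/facebookresearch/SentEval

def prepare(params, samples):
    # Optional hook: build vocabulary or statistics over all task samples.
    return

def batcher(params, batch):
    # `batch` is a list of tokenized sentences; return one vector per sentence.
    # Placeholder encoder: hashed bag-of-words; swap in p-means, SIF, etc.
    vecs = []
    for tokens in batch:
        v = np.zeros(300)
        for t in tokens:
            v[hash(t) % 300] += 1.0
        vecs.append(v)
    return np.vstack(vecs)

params = {'task_path': 'PATH_TO_SENTEVAL_DATA', 'usepytorch': False, 'kfold': 10}
params['classifier'] = {'nhid': 0, 'optim': 'adam', 'batch_size': 64,
                        'tenacity': 4, 'epoch_size': 2}

se = senteval.engine.SE(params, batcher, prepare)
results = se.eval(['MR', 'CR', 'SUBJ', 'SST2'])  # a subset of the supervised tasks
print(results)
```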
“…Regarding the former, large RNNs are by far the most popular (Conneau et al., 2017; Kiros et al., 2015; Tang et al., 2017; Nie et al., 2017; Hill et al., 2016; McCann et al., 2017; Peters et al., 2018; Logeswaran & Lee, 2018), followed by convolutional neural networks (Gan et al., 2017). A third group comprises efficient methods that aggregate word embeddings (Wieting et al., 2016; Arora et al., 2017; Pagliardini et al., 2018; Rücklé et al., 2018). Most of the methods in the latter group are word order agnostic.…”
Section: Related Work (mentioning)
Confidence: 99%
“…Despite CBOW's simplicity, it attains strong results on many downstream tasks. Using sophisticated weighting schemes, the performance of aggregated word embeddings can be further increased (Arora et al., 2017), coming even close to strong LSTM baselines (Rücklé et al., 2018; Henao et al., 2018) such as InferSent (Conneau et al., 2017). This raises the question of how much benefit recurrent encoders actually provide over simple word-embedding-based methods (Wieting & Kiela, 2019).…”
Section: Introduction (mentioning)
Confidence: 99%