2016
DOI: 10.48550/arxiv.1606.04640
Preprint

Siamese CBOW: Optimizing Word Embeddings for Sentence Representations

Cited by 19 publications (24 citation statements, all mentioning); references 0 publications. Citing publications span 2017–2023.
“…A dense vector representation of a text can be constructed from an ensemble of the word embeddings (Mikolov et al., 2013) in the text. The Siamese CBOW model (Kenter et al., 2016) constructs the sentence embedding by averaging the word embeddings, and uses the embedding similarities among the sentence, its adjacent sentences, and randomly chosen sentences as the training target to fine-tune the sentence embedding. The Word Mover's Distance (WMD) (Kusner et al., 2015) measures the similarity of two texts by computing the minimum cumulative distance from all the embedded words in one text to the embedded words in the other text.…”
Section: Related Work
Mentioning confidence: 99%
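To make the excerpt's description concrete, below is a minimal NumPy sketch of the Siamese CBOW objective: a sentence vector is the average of its word embeddings, and a softmax over cosine similarities is trained to place probability mass on adjacent sentences rather than randomly sampled ones. The vocabulary, embedding size, and example sentences are hypothetical placeholders, and the gradient update is omitted.

```python
# Minimal sketch of the Siamese CBOW objective (Kenter et al., 2016).
# Vocabulary, dimensions, and sentences are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3, "dog": 4, "ran": 5}
E = rng.normal(scale=0.1, size=(len(vocab), 50))  # trainable word embeddings

def sentence_vector(tokens):
    """Sentence representation: the average of its word embeddings."""
    return E[[vocab[t] for t in tokens]].mean(axis=0)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def siamese_cbow_loss(anchor, positives, negatives):
    """Cross-entropy of a softmax over cosine similarities: adjacent
    (positive) sentences should receive the probability mass, randomly
    chosen (negative) sentences should not."""
    a = sentence_vector(anchor)
    sims = np.array([cosine(a, sentence_vector(s)) for s in positives + negatives])
    p = np.exp(sims) / np.exp(sims).sum()
    target = np.zeros_like(p)
    target[:len(positives)] = 1.0 / len(positives)  # uniform over positives
    return -(target * np.log(p)).sum()

loss = siamese_cbow_loss(["the", "cat", "sat"],
                         positives=[["the", "cat", "ran"]],
                         negatives=[["the", "dog", "sat"], ["dog", "ran"]])
```

Minimizing this loss with respect to E is what fine-tunes the word embeddings for the averaging composition.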
“…Siamese C-BOW (Kenter et al., 2016) shares a common concept with SIF and Sent2vec: defining a sentence vector as the average of word embedding vectors.…”
Section: Related Work
Mentioning confidence: 99%
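The "average of word embedding vectors" these methods share admits weighted variants. As an illustrative sketch, the snippet below computes SIF-style sentence vectors (smooth-inverse-frequency weights a/(a + p(w)) followed by removal of the first principal component, following Arora et al., 2017); uniform averaging, as in Siamese CBOW, is the special case of equal weights with no common-component removal. The embedding matrix, vocabulary, and word-probability table are assumed inputs.

```python
# Illustrative SIF-style weighted averaging (Arora et al., 2017); a plain
# average, as in Siamese CBOW, uses equal weights and skips the last step.
import numpy as np

def sif_embeddings(sentences, E, vocab, word_prob, a=1e-3):
    """Weighted-average sentence vectors with the common component removed."""
    vecs = []
    for tokens in sentences:
        w = np.array([a / (a + word_prob[t]) for t in tokens])  # SIF weights
        V = E[[vocab[t] for t in tokens]]                       # word vectors
        vecs.append(w @ V / len(tokens))
    X = np.vstack(vecs)
    # remove the projection onto the first principal component of the set
    u = np.linalg.svd(X, full_matrices=False)[2][0]
    return X - np.outer(X @ u, u)
```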
“…Assembling successful distributional word representations (for example, GloVe (Pennington et al., 2014)) into sentence representations is an active research topic. Different from previous studies (for example, doc2vec (Mikolov et al., 2013), skip-thought vectors (Kiros et al., 2015), Siamese CBOW (Kenter et al., 2016)), our main contribution is to represent sentences using non-vector-space representations: a sentence can be well represented by the subspace spanned by the context word vectors; such a method naturally builds on any word representation method. Due to the widespread use of word2vec and GloVe, we use their publicly available word representations, word2vec (Mikolov et al., 2013) trained on Google News and GloVe (Pennington et al., 2014) trained on Common Crawl, to test our observations.…”
Section: Geometry of Sentences
Mentioning confidence: 99%
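As a sketch of this non-vector-space idea, assuming each sentence arrives as a matrix whose rows are its word vectors: take the top-N right singular vectors as an orthonormal basis of the sentence's subspace, and compare two sentences through the cosines of the principal angles between their bases. The rank and function names below are illustrative, not taken from the cited paper's code.

```python
# Represent a sentence by the subspace spanned by its word vectors and
# compare subspaces via principal angles. The rank is an illustrative choice.
import numpy as np

def sentence_subspace(word_vectors, rank=4):
    """Orthonormal basis (d x rank) of the principal subspace of the words."""
    # rows are word vectors; right singular vectors span the principal directions
    _, _, Vh = np.linalg.svd(np.asarray(word_vectors), full_matrices=False)
    return Vh[:rank].T

def subspace_similarity(A, B):
    """Mean cosine of the principal angles between two subspaces."""
    return np.linalg.svd(A.T @ B, compute_uv=False).mean()
```

Unlike a single averaged vector, this representation keeps N directions per sentence, so similarity is measured between subspaces rather than between points.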
“…A sentence contains rich syntactic information and can be modeled through sophisticated neural networks (e.g., convolutional neural networks (Kim, 2014; Kalchbrenner et al., 2014), recurrent neural networks (Sutskever et al., 2014; Le and Mikolov, 2014; Hill et al., 2016), and recursive neural networks (Socher et al., 2013)). Another simple and common approach ignores the latent structure of sentences: a prototypical approach is to represent a sentence by summing or averaging over the vectors of the words in that sentence (Wieting et al., 2015; Adi et al., 2016; Kenter et al., 2016). Recently, Wieting et al. (2015) and Adi et al. (2016) revealed that even though the latter approach ignores all syntactic information, it is simple, straightforward, and remarkably robust at capturing sentential semantics.…”
Section: Introduction
Mentioning confidence: 99%