2020
DOI: 10.31341/jios.44.2.2

A Comparison of Approaches for Measuring the Semantic Similarity of Short Texts Based on Word Embeddings

Abstract: Measuring the semantic similarity of texts has a vital role in various tasks from the field of natural language processing. In this paper, we describe a set of experiments we carried out to evaluate and compare the performance of different approaches for measuring the semantic similarity of short texts. We perform a comparison of four models based on word embeddings: two variants of Word2Vec (one based on Word2Vec trained on a specific dataset and the second extending it with embeddings of word senses), FastTe…

Cited by 6 publications (3 citation statements)
References 30 publications
“…A drawback of word-embedding approaches, however, is that all information about word order is lost. Nonetheless, surprisingly competitive results are obtained in many applications by aggregating word vectors despite this limitation (Babić et al., 2019; Kenter & de Rijke, 2015; Sinoara et al., 2019). Today, bidirectional encoder representations from transformers (BERT; Devlin et al., 2018) enable a new generation of technologies (generally referred to as transformers) to directly provide a representation of a sentence considering words in their context.…”
Section: Text Representation: From Words to Vectors
confidence: 99%
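The word-order drawback noted above is easy to demonstrate. Below is a minimal sketch, assuming a toy three-dimensional vocabulary in place of real pretrained embeddings: averaging word vectors produces the same sentence representation for "man bites dog" and "dog bites man".

```python
import numpy as np

# Toy word vectors with made-up values; in practice these would come
# from a pretrained model such as Word2Vec or FastText.
vectors = {
    "man":   np.array([0.2, 0.8, 0.1]),
    "bites": np.array([0.9, 0.1, 0.4]),
    "dog":   np.array([0.3, 0.5, 0.7]),
}

def average_embedding(tokens, vectors):
    """Order-insensitive sentence vector: the mean of the word vectors."""
    return np.mean([vectors[t] for t in tokens], axis=0)

a = average_embedding("man bites dog".split(), vectors)
b = average_embedding("dog bites man".split(), vectors)
print(np.allclose(a, b))  # True: aggregation discards word order
```

Contextual models such as BERT sidestep this because each token's vector depends on its neighbours, so the two sentences receive different representations.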
“…Some DL models do not generate a vector representation of documents or larger text units, but only vector representations of words. For these models, it was necessary to build document vectors from word vectors, for example in the form of centroids like the ones used in (Babić et al., 2019; Babić, Guerra, et al., 2020). Since no existing method for combining word vectors into document vectors is generally accepted as the best one, several combination methods were tried and the one that produced the best results was chosen.…”
Section: Experiments Setup
confidence: 99%
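As a minimal sketch of the centroid construction mentioned above, assuming gensim's Word2Vec and a tiny illustrative corpus (the cited works train on task-specific datasets instead), each document vector is the mean of its word vectors and documents are compared by cosine similarity:

```python
import numpy as np
from gensim.models import Word2Vec

# Tiny illustrative corpus (hypothetical sentences).
corpus = [
    "measuring semantic similarity of short texts".split(),
    "word embeddings represent words as dense vectors".split(),
    "semantic similarity helps many nlp tasks".split(),
]

model = Word2Vec(corpus, vector_size=50, min_count=1, seed=1, workers=1)

def centroid(tokens, wv):
    """Document vector as the centroid (mean) of its word vectors."""
    return np.mean([wv[t] for t in tokens if t in wv], axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

d1 = centroid(corpus[0], model.wv)
d2 = centroid(corpus[2], model.wv)
print(f"document similarity: {cosine(d1, d2):.3f}")
```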
“…The initial phase of neural language models, inspired by their work, featured shallow models. These models showcased the effectiveness of neural text representations through attributes such as lower-dimensional vector representations and the direct calculation of word similarity [10]. Moreover, leveraging embeddings as input led to enhanced performance across various NLP tasks [11], [12].…”
Section: Introduction
confidence: 98%
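The "direct calculation of word similarity" that embeddings make possible amounts to comparing vectors geometrically. A brief sketch, assuming the pretrained GloVe vectors from the gensim-data catalogue (downloaded on first use):

```python
import gensim.downloader as api

# 50-dimensional GloVe vectors; the model name assumes the standard
# gensim-data catalogue.
wv = api.load("glove-wiki-gigaword-50")

# Similarity falls out of the geometry directly: cosine of two vectors.
print(wv.similarity("car", "truck"))   # high: related words
print(wv.similarity("car", "banana"))  # low: unrelated words
print(wv.most_similar("similar", topn=3))
```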