A Survey of Text Similarity Approaches

Gomaa, Wael H.; Fahmy, Aly A.

doi:10.5120/11638-7118

Cited by 560 publications

(320 citation statements)

References 25 publications

Mentioning: 282

“…Text similarity measurement is a text mining approach that could be overcome this overwhelming problem. Finding the similarity between words is a primary stage for sentence, paragraph and document similarities [2]. Text similarity approach may alleviate people on finding relevant information.…”

Section: Introduction (mentioning)

confidence: 99%

“…Lexical and semantic similarity words is an essential element of sentence, paragraph and document similarity measurement [2]. Lexical similarity a degree of two given string are similar in its character sequence.…”

Section: Introduction (mentioning)

confidence: 99%

“…Gomaa [2] explained the three main categories of text similarity approach, but did not discuss about the evaluation of algorithms performance. This paper will survey the measurement approaches of lexically and semantically text similarities from the widely used to the recent issues.…”

Section: Introduction (mentioning)

confidence: 99%

The performance of text similarity algorithms

Prasetya

Wibawa

Hirashima

2018

Int. J. Adv. Intell. Informatics

Text similarity measurement compares text with available references to indicate the degree of similarity between those objects. There have been many studies of text similarity, resulting in various approaches and algorithms. This paper investigates four major text similarity measurements: String-based, Corpus-based, Knowledge-based, and Hybrid similarities. The results of the investigation showed that the semantic similarity approach is more rational in finding substantial relationships between texts.

“…Text similarity measures have been widely used in several natural language processing applications such as automatic essay grading, paraphrase recognition, etc [1][2][3]. Previous studies on text similarity were mostly concerned about the semantic typing in terms of two mechanisms: the detection of similarity and difference in the form of judgments of likeness in which other potential inconsistency that can be resulted from judgments of difference.…”

Section: Introduction (mentioning)

confidence: 99%

Noun phrase based weghting scheme for sentence similarity measurement

Mahmood

Kamaruddin

Naser

2018

J. Fundam and Appl Sci.

“…Sentence textual similarity is a crucial and a prerequisite subtask for many text processing and NLP tasks including text summarization, document classification, text clustering, topic detection, automatic question answering, automatic text scoring, plagiarism detection, machine translation, conversational agents among others (Ali, Ghosh, & Al-Mamun, 2009; Gomaa & Fahmy, 2013; Haque, Naskar, Way, Costa-Jussà, & Banchs, 2010; K. O'Shea, 2012; Osman, Salim, Binwahlan, Alteeb, & Abuobieda, 2012).…”

Section: Introduction (mentioning)

confidence: 99%

Proceedings of the First AHA!-Workshop on Information Discovery in Text

Akbik

Visengeriyeva

2014

Introduction: Welcome to the First AHA!-Workshop on Information Discovery in Text! In this workshop, we are bringing together leading researchers in the emerging field of Information Discovery to discuss approaches for Information Extraction that are not bound by a pre-specified schema of information, but rather discover relational or categorial structure automatically from given unstructured data. This includes approaches that are based on unsupervised machine-learning over models of distributional semantics, as well as OpenIE methods that relax the definition of semantic relations in order to more openly extract structured information. Other approaches focus on inexpensively training information extractors to be used across different domains with minimal supervision, or on adapting existing IE systems to new domains and relations. We received 19 paper submissions, of which the programme committee has accepted ten, six of which were chosen for oral presentation and four as posters. We look forward to a workshop full of interesting paper presentations, invited talks and lively discussion.

Abstract: Recent approaches to relation extraction following the distant supervision paradigm have focused on exploiting large knowledge bases, from which they extract substantial amount of supervision. However, for many relations in real-world applications, there are few instances available to seed the relation extraction process, and appropriate named entity recognizers which are necessary for pre-processing do not exist. To overcome this issue, we learn entity filters jointly with relation extraction using imitation learning. We evaluate our approach on architect names and building completion years, using only around 30 seed instances for each relation and show that the jointly learned entity filters improved the performance by 30 and 7 points in average precision.
