Mining Paraphrases from Self-anchored Web Sentence Fragments

Paşca, Marius

doi:10.1007/11564126_22

Cited by 4 publications

(4 citation statements)

References 15 publications

(21 reference statements)

“…Ganitkevitch et al (2013) use the bilingual pivoting technique (Bannard and Callison-Burch 2005) along with distributional similarity features to extract lexical, and phrasal paraphrases. Some other approaches (Paşca 2005; Lin and Pantel 2001; Berant et al 2012) differ from ours in that, they use manually coded linguistic patterns to align only specific text fragment contexts to generate paraphrases (Paşca 2005), and require language specific resources such as part-of-speech taggers (Paşca 2005) and parsers (Lin and Pantel 2001). Furthermore, the latter two only find alternate constructions with the same content words, such as "X manufactures Y" infers "X's Y factory" (Lin and Pantel 2001).…”

Section: Multi-word Phrases (mentioning)

confidence: 95%

“…The Near Synonym System (NeSS) introduces a new method which differs from other approaches in that it does not require parallel resources, (unlike Barzilay and McKeown 2001; Lin et al 2003; Callison-Burch et al 2006; Ganitkevitch et al 2013) nor does it use pre-determined sets of manually coded patterns (Lin et al 2003; Paşca, 2005). NeSS captures semantic similarity via n-gram distributional methods that implicitly preserve local syntactic structure without parsing, making the underlying method language independent.…”

Section: NeSS: Near-Synonym System (mentioning)

confidence: 99%

Unsupervised Phrasal Near-Synonym Generation from Text Corpora

Gupta

Carbonell

Gershman

et al. 2015

AAAI

Unsupervised discovery of synonymous phrases is useful in a variety of tasks ranging from text mining and search engines to semantic analysis and machine translation. This paper presents an unsupervised corpus-based conditional model: Near-Synonym System (NeSS) for finding phrasal synonyms and near synonyms that requires only a large monolingual corpus. The method is based on maximizing information-theoretic combinations of shared contexts and is parallelizable for large-scale processing. An evaluation framework with crowd-sourced judgments is proposed and results are compared with alternate methods, demonstrating considerably superior results to the literature and to thesaurus look up for multi-word phrases. Moreover, the results show that the statistical scoring functions and overall scalability of the system are more important than language specific NLP tools. The method is language-independent and practically useable due to accuracy and real-time performance via parallel decomposition.

“…Performance of applications relying on natural language processing may suffer from the fact that the processed documents might contain lexically different, yet semantically related, text segments. The task of recognizing synonym text segments, which is better known as paraphrase recognition, or detection, is challenging and difficult to solve, as shown in the work of Pasca (2005). The task itself is important for many text related applications, like summarization (Hirao, Fukusima, Okumura, Nobata, & Nanba, 2005), information extraction (Shinyama & Sekine, 2003) and question answering (Pasca, 2003).…”

Section: Paraphrase Recognition and Sentence-to-sentence Similarity (mentioning)

confidence: 99%

Text Relatedness Based on a Word Thesaurus

Tsatsaronis

Varlamis

Vazirgiannis

2010

JAIR

111

The computation of relatedness between two fragments of text in an automated manner requires taking into account a wide range of factors pertaining to the meaning the two fragments convey, and the pairwise relations between their words. Without doubt, a measure of relatedness between text segments must take into account both the lexical and the semantic relatedness between words. Such a measure that captures well both aspects of text relatedness may help in many tasks, such as text retrieval, classification and clustering. In this paper we present a new approach for measuring the semantic relatedness between words based on their implicit semantic links. The approach exploits only a word thesaurus in order to devise implicit semantic links between words. Based on this approach, we introduce Omiotis, a new measure of semantic relatedness between texts which capitalizes on the word-to-word semantic relatedness measure (SR) and extends it to measure the relatedness between texts. We gradually validate our method: we first evaluate the performance of the semantic relatedness measure between individual words, covering word-to-word similarity and relatedness, synonym identification and word analogy; then, we proceed with evaluating the performance of our method in measuring text-to-text semantic relatedness in two tasks, namely sentence-to-sentence similarity and paraphrase recognition. Experimental evaluation shows that the proposed method outperforms every lexicon-based method of semantic relatedness in the selected tasks and the used data sets, and competes well against corpus-based and hybrid approaches.

“…The performance of document processing applications relying on natural language processing may suffer from the fact that the processed documents might contain lexically different, yet semantically related, text segments. The task of recognizing pairs of text segments, with identical or almost identical semantics, which is better known as paraphrase detection, is challenging and difficult to solve, as shown in the work of Mihalcea et al [66], and Pasca [85]. The task itself is important for many text related applications, like summarization [38], information extraction [104] and question answering [84].…”

Section: Paraphrasing (mentioning)

confidence: 99%

Word sense disambiguation and text relatedness based on word thesauri

Τσατσαρώνης
