DBnary: Wiktionary as a Lemon-based multilingual lexical resource in RDF

Sérasset, Gilles

doi:10.3233/sw-140147

Cited by 59 publications

(71 citation statements)

References 5 publications


“…CompiLIG The best Spanish-English performance on SNLI sentences was achieved by CompiLIG using features including: cross-lingual conceptual similarity using DBNary (Serasset, 2015), cross-language MultiVec word embeddings (Berard et al, 2016), and Brychcin and Svoboda (2016)'s improvements to Sultan et al (2015)'s method. (Nagoudi et al, 2017) Using only weighted word embeddings, LIM-LIG took second place on Arabic.…”

Section: Methods (mentioning)

confidence: 99%

SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation

Cer, Diab, Agirre et al. 2017

Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)



Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in all language tracks. We summarize performance and review a selection of well performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).


“…A bag-of-words S from each sentence S is built, by filtering stop words and by using a function that returns for a given word all its possible translations. These translations are jointly given by a linked lexical resource, DBNary (Sérasset, 2015), and by cross-lingual word embeddings. More precisely, we use the top 10 closest words in the embeddings model and all the available translations from DBNary to build the bag-of-words of a word.…”

Section: Cross-language Conceptual (mentioning)

confidence: 99%

CompiLIG at SemEval-2017 Task 1: Cross-Language Plagiarism Detection Methods for Semantic Textual Similarity

Ferrero, Besacier, Schwab et al. 2017

Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)


We present our submitted systems for Semantic Textual Similarity (STS) Track 4 at SemEval-2017. Given a pair of Spanish-English sentences, each system must estimate their semantic similarity by a score between 0 and 5. In our submission, we use syntax-based, dictionary-based, context-based, and MT-based methods. We also combine these methods in unsupervised and supervised ways. Our best run ranked 1st on track 4a with a correlation of 83.02% with human annotations.
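The bag-of-words construction quoted in the statement above (stop-word filtering, then the union of all dictionary translations and the top 10 embedding neighbours of each word) can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy dictionaries, the stop-word list, and the function names `word_bag` and `sentence_bag` are hypothetical stand-ins, not DBnary's API or the cited authors' code.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy data standing in for DBnary translations (Spanish -> English)
# and for a ranked list of nearest neighbours in a cross-lingual embedding space.
TOY_TRANSLATIONS: Dict[str, Set[str]] = {
    "gato": {"cat"},
    "negro": {"black", "dark"},
}
TOY_NEIGHBOURS: Dict[str, List[str]] = {
    "gato": ["cat", "kitten", "feline"],
    "negro": ["black"],
}
TOY_STOP_WORDS: Set[str] = {"el", "la", "es"}


def word_bag(
    word: str,
    translations: Dict[str, Set[str]],
    neighbours: Dict[str, List[str]],
    top_k: int = 10,
) -> Set[str]:
    """Union of all dictionary translations and the top_k embedding neighbours."""
    bag = set(translations.get(word, set()))
    bag.update(neighbours.get(word, [])[:top_k])
    return bag


def sentence_bag(
    tokens: Iterable[str],
    translations: Dict[str, Set[str]],
    neighbours: Dict[str, List[str]],
    stop_words: Set[str],
    top_k: int = 10,
) -> Set[str]:
    """Merge the bags of every non-stop-word token of the sentence."""
    bag: Set[str] = set()
    for token in tokens:
        if token in stop_words:
            continue  # stop words contribute nothing to the bag
        bag |= word_bag(token, translations, neighbours, top_k)
    return bag


example_bag = sentence_bag(
    ["el", "gato", "es", "negro"],
    TOY_TRANSLATIONS,
    TOY_NEIGHBOURS,
    TOY_STOP_WORDS,
)
```

In a real setting the two lookup tables would be filled from a DBnary extract and from an embedding model; how the cited system weights or compares the resulting bags is not stated in the quoted passage, so the sketch stops at bag construction.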


“…We reuse the idea of Pataki (2012) which, for each sentence, build a bag-of-words by getting all the available translations of each word of the sentence. For that, we use a linked lexical resource called DBNary (Sérasset, 2015). The bag-of-words of a sentence is the merge of the bag-of-words of the words of the sentence.…”

Section: Cross-language (mentioning)

confidence: 99%

“…We use the Muhr et al (2010)'s implementation which consists in replacing each word of one text by its most likely translations in the language of the other text, leading to a bags-of-words. We use DBNary (Sérasset, 2015) to get the translations. The metric used to compare two texts is a monolingual matching based on strict intersection of bags-of-words.…”

Section: MT-based Models (mentioning)

confidence: 99%

Deep Investigation of Cross-Language Plagiarism Detection Methods

Ferrero, Besacier, Schwab et al. 2017

Proceedings of the 10th Workshop on Building and Using Comparable Corpora


This paper is a deep investigation of cross-language plagiarism detection methods on a new recently introduced open dataset, which contains parallel and comparable collections of documents with multiple characteristics (different genres, languages and sizes of texts). We investigate cross-language plagiarism detection methods for 6 language pairs on 2 granularities of text units in order to draw robust conclusions on the best methods while deeply analyzing correlations across document styles and languages.
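The translation-based matching quoted in the statements citing this paper (replace each word by its translations, then compare texts through a strict intersection of bags-of-words) might look like the sketch below. The quoted passages do not say how the intersection is normalised, so dividing by the size of the union (Jaccard) is an assumption made here for illustration, and `translate_bag`, `overlap_score`, and the toy dictionary are hypothetical names rather than anything from DBnary or the cited implementation.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy dictionary standing in for DBnary translations (French -> English).
TOY_FR_EN: Dict[str, Set[str]] = {
    "chat": {"cat"},
    "noir": {"black", "dark"},
}


def translate_bag(tokens: Iterable[str], translations: Dict[str, Set[str]]) -> Set[str]:
    """Replace each token by its known translations and merge them into one bag."""
    bag: Set[str] = set()
    for token in tokens:
        bag |= translations.get(token, set())  # unknown tokens add nothing
    return bag


def overlap_score(bag_a: Set[str], bag_b: Set[str]) -> float:
    """Exact-match overlap of two bags, normalised by their union (assumed Jaccard)."""
    if not bag_a or not bag_b:
        return 0.0
    return len(bag_a & bag_b) / len(bag_a | bag_b)


# Translate the French text into an English bag, then match it against
# the bag of the English text using exact string equality only.
french_bag = translate_bag(["chat", "noir"], TOY_FR_EN)
english_bag = {"black", "cat", "sleeps"}
score = overlap_score(french_bag, english_bag)
```

Because matching is strict, any inflected or misspelled form that the dictionary does not list contributes nothing, which is one reason such methods are often combined with embedding-based ones.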
