2011
DOI: 10.1007/s10579-011-9168-6

Semi-automatic enrichment of crowdsourced synonymy networks: the WISIGOTH system applied to Wiktionary

Abstract: Semantic lexical resources are a mainstay of various Natural Language Processing applications. However, comprehensive and reliable resources are rare and not often freely available. Handcrafted resources are too costly to be a general solution, while automatically built resources need to be validated by experts or at least thoroughly evaluated. We propose in this paper a picture of the current situation with regard to lexical resources, their building and their evaluation. We give an i…

Cited by 6 publications (6 citation statements)
References 31 publications
“…This is not the case on the 3rd one where only about one third are confirmed. This is due to the fact that the synonyms network extracted from Wiktionary is very incomplete [14], which is not the case for the two previous networks that are based on linguistic resources that have been established for a long time. In the case of random network, the reported results are the average of the results obtained for 20 random networks of the same size, and we can notice that almost none of the edges are confirmed.…”
Section: Labeling Edges and Non-edges For Reflecting The Graph Topology
mentioning
confidence: 98%
“…They were digitized from paper dictionaries (Robert and Larousse dictionaries) by an IBM/ATILF research unit partnership http://www.atilf.fr/spip.php?article208 3 V.wikt and V.pwn two synonymy networks between English verbs. V.wikt has been extracted from the English Wiktionary by [14] whereas V.pwn is built from Princeton WordNet [5] synsets. A synset is a set of interchangeable words that denotes a meaning or a particular usage.…”
mentioning
confidence: 99%
“…Simple procedures tend to provide correct but mostly irrelevant results. In Sajous et al. (2013) an endogenous enrichment of Wiktionary is done with the use of a crowdsourcing tool. A similar crowdsourcing approach has been considered by Zeichner et al. (2012) for evaluating inference rules that are discovered from texts.…”
Section: Inferring and Annotating Relation
mentioning
confidence: 99%
“…Remaining information is encoded in wikicode, an underspecified format used by the MediaWiki content-management system. As explained by Sajous et al (2013b) and Sérasset (2012), this loose encoding format makes it difficult to extract consistent data. One can choose to either restrict the extraction to prototypical articles or design a fine-grained parser that collects the maximum of the available information.…”
Section: Turning the French Wiktionary Into A Machine-readable Dictionary
mentioning
confidence: 99%