Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019) 2019
DOI: 10.18653/v1/s19-1001

SURel: A Gold Standard for Incorporating Meaning Shifts into Term Extraction

Abstract: We introduce SURel, a novel dataset for German with human-annotated meaning shifts between general-language and domain-specific contexts. We show that meaning shifts of term candidates cause errors in term extraction, and demonstrate that the SURel annotation reflects these errors. Furthermore, we illustrate that SURel enables us to assess optimisations of term extraction techniques when incorporating meaning shifts.

Cited by 12 publications (11 citation statements)
References 21 publications (19 reference statements)

“…We compare our proposed method (NN) to the method of Hamilton et al (2016b) described in Section 4 (AlignCos), in which the vector spaces are first aligned using the OP algorithm, and then words are ranked according to the cosine-distance between the word representation in the two spaces. This method was shown to outperform all others that were compared to it by Schlechtweg et al (2019).…”
Section: Methods (mentioning)
confidence: 94%
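
The AlignCos procedure described in the statement quoted above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: it assumes that "OP" refers to orthogonal Procrustes, that both vector spaces are given as NumPy matrices over a shared vocabulary with rows in the same order, and that vectors are length-normalised before alignment. The function names `orthogonal_procrustes` and `align_cos_ranking` are hypothetical.

```python
import numpy as np


def orthogonal_procrustes(A, B):
    """Return the orthogonal matrix R that minimises ||A @ R - B||_F."""
    # Closed-form solution: R = U @ Vt, where U, S, Vt = SVD(A^T B).
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def align_cos_ranking(words, A, B):
    """Rank words by cosine distance between two aligned vector spaces.

    Assumption (not from the source): row i of A and row i of B are the
    vectors of words[i] in the first and second space, respectively.
    """
    # Length-normalise rows so that dot products equal cosine similarities.
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)

    # Rotate the first space onto the second; R is orthogonal, so the
    # rotated rows keep unit length.
    A_aligned = A @ orthogonal_procrustes(A, B)

    # Cosine distance per word, then sort from largest to smallest distance.
    cos_dist = 1.0 - np.sum(A_aligned * B, axis=1)
    order = np.argsort(-cos_dist)
    return [(words[i], float(cos_dist[i])) for i in order]
```

Words at the top of the returned list are those whose vectors differ most after alignment, which is how the quoted statement describes candidates for a meaning shift being ranked.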
“…This field of semantic change suffers from lack of proper evaluation datasets, and there is no common benchmark that is being used. Two new datasets were recently introduced, and used to extensively compare between previous methods (Schlechtweg et al, 2019): the DURel dataset (Schlechtweg et al, 2018) focuses on diachronic changes, while the SURel dataset (Hätty et al, 2019) focuses on domain-based semantic changes. We use them to verify the quality of our results and compare against AlignCos (Hamilton et al, 2016b).…”
Section: Quantitative Evaluation: DURel and SURel Datasets (mentioning)
confidence: 99%
“…A simple improvement of approach in [53] would be to use contextualized embedding models pre-trained on domain-specific collections such as SciBERT that were trained on scholarly corpora [68]. Another one is to consider specialized term extraction techniques such as ones based on recognizing meaning shifts between general and domain-specific language [69]. In our work we further extend change-based scholarly data summarization approaches based on these and further ideas.…”
Section: Temporal Analysis of Scholarly Data (mentioning)
confidence: 99%
“…Biemann, 2013). Studies on graded word meaning, however, cover only small amounts of data (Soares da Silva, 1992; Brown, 2008; McCarthy and Navigli, 2009; Erk et al, 2009, 2013; Hätty et al, 2019).…”
Section: Related Work (mentioning)
confidence: 99%