Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.258

Predicting Degrees of Technicality in Automatic Terminology Extraction

Abstract: While automatic term extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied. We semi-automatically create a German gold standard of technicality across four domains, and illustrate the impact of a web-crawled general-language corpus on predicting technicality. When defining a classification approach that combines general-language and domain-specific word embeddings, we go beyond previous work and align vector spaces to gain comparativ…
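The abstract's key technical step is aligning a general-language and a domain-specific embedding space so that a word's two vectors become directly comparable. Below is a minimal sketch of one standard way to do this, orthogonal Procrustes over the shared vocabulary; the choice of alignment method and the function names are illustrative assumptions, not necessarily the paper's exact procedure.

```python
import numpy as np

def align_spaces(general_vecs: np.ndarray, domain_vecs: np.ndarray) -> np.ndarray:
    """Rotate the domain-specific space onto the general-language space.

    Both matrices hold row vectors for the same shared-vocabulary words,
    in the same order; the rotated domain matrix is returned.
    """
    # Orthogonal Procrustes: minimize ||domain @ W - general||_F over
    # orthogonal W, solved via SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(domain_vecs.T @ general_vecs)
    return domain_vecs @ (u @ vt)
```

Once aligned, the cosine similarity between a word's general vector and its rotated domain vector yields a comparability signal: a large gap suggests domain-specific usage, which is the kind of evidence a technicality classifier can build on.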

Citations: cited by 16 publications (15 citation statements)
References: 22 publications (25 reference statements)
“…We adapt the following models on relevant tasks in our setting with additional inputs (e.g., domain-specific corpora): (Amjadian et al., 2016, 2018). • Multi-Channel (MC): Multi-Channel (Hätty et al., 2020) is the state-of-the-art model for automatic term extraction, which is based on a multi-channel neural network that takes domain-specific and general corpora as input.…”
Section: Methods (mentioning)
confidence: 99%
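The statement above gives enough detail for a rough sketch of the multi-channel idea: one channel per embedding space, combined before a technicality classifier. The layer sizes, activation, and combination-by-concatenation below are illustrative assumptions, not the published architecture (see Hätty et al., 2020, for the actual model).

```python
import torch
import torch.nn as nn

class MultiChannelClassifier(nn.Module):
    """Toy two-channel classifier: one channel per embedding space."""

    def __init__(self, emb_dim: int = 300, hidden: int = 128, n_classes: int = 4):
        super().__init__()
        # Separate encoders so the general and domain spaces keep their own weights.
        self.general_channel = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.domain_channel = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, general_vec: torch.Tensor, domain_vec: torch.Tensor) -> torch.Tensor:
        # Encode each channel, concatenate, and score the technicality classes.
        combined = torch.cat(
            [self.general_channel(general_vec), self.domain_channel(domain_vec)], dim=-1
        )
        return self.classifier(combined)
```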
“…Hätty proposes two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. Both models outperform previous approaches, with the multi-channel model performing best (Hätty, Schlechtweg, Dorna, & Im Walde, 2020). Among these methods, Long Short-Term Memory networks (LSTM) (Zhao, Du, & Shi, 2018), CRF (Wang, Wang, Deng, & Wu, 2016), and their variants achieve the best performance.…”
Section: Related Work (mentioning)
confidence: 99%
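The quote distinguishes two model variants; the first feeds a pre-computed general/domain comparison to a plain feed-forward network. A hedged sketch follows: the specific comparison features (element-wise difference plus a cosine-similarity scalar) and the MLP shape are assumptions for illustration, not the published feature set.

```python
import numpy as np
import torch.nn as nn

def comparative_features(general_vec: np.ndarray, aligned_domain_vec: np.ndarray) -> np.ndarray:
    """Pre-computed comparison of a word's two (already aligned) vectors."""
    cos = float(
        general_vec @ aligned_domain_vec
        / (np.linalg.norm(general_vec) * np.linalg.norm(aligned_domain_vec))
    )
    # Element-wise difference plus one similarity scalar (assumed feature set).
    return np.concatenate([general_vec - aligned_domain_vec, [cos]])

# The "simple neural network" is then just an MLP over these fixed features,
# e.g. for 300-dimensional embeddings (300 + 1 = 301 inputs, 4 technicality degrees).
mlp = nn.Sequential(nn.Linear(301, 64), nn.ReLU(), nn.Linear(64, 4))
```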
“…Several approaches build on word embeddings to perform ATE on specific domains, such as medicine (e.g., Bay et al., 2020), or to separate general-language from domain-specific embeddings (Hätty et al., 2020). In contrast, our models perform ATE on four domains and in three languages, utilizing a pretrained language model and a pretrained NMT model.…”
Section: Related Work (mentioning)
confidence: 99%