2008
DOI: 10.1075/term.14.2.05kit

Measuring mono-word termhood by rank difference via corpus comparison

Authors: Chunyu Kit and Xiaoyue Liu

Abstract: Terminology as a set of concept carriers crystallizes our special knowledge about a subject. Automatic term recognition (ATR) plays a critical role in the processing and management of various kinds of information, knowledge and documents, e.g., knowledge acquisition via text mining. Measuring termhood properly is one of the core issues involved in ATR. This article presents a novel approach to termhood measurement f…

Cited by 41 publications (33 citation statements)
References 37 publications (50 reference statements)

“…As illustrated in Figure 7, the normalized rank measures as given in formulas (m) and (n) achieves a similar performance with the enhanced rank measures. And the effect of count on it is not as much as that on formulas (i) and (j), which has been pointed out in [7] and also been exemplified by the curve lines in Figures 7 and 8. …”
Section: Performances Of Measures Based On Rank Figure 5 Performance
Citation type: mentioning
confidence: 88%
“…Kit and Liu [7] present an approach to quantifying the termhood of a term candidate as its rank difference in a domain and a background corpus via corpus comparison. With Hong Kong legal texts as the domain corpus, this approach achieves a precision of 97.0% on the top 1000 candidates and a precision of 96.1% on the top 10% candidates of the sorted term list.…”
Section: Comparison By Statistics
Citation type: mentioning
confidence: 99%
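
The rank-difference idea in the statement above can be illustrated with a minimal sketch. It assumes that words are ranked by frequency within each corpus, that ranks are normalized by the number of distinct frequency levels, and that a word missing from the background corpus gets rank 0; these choices, the function names, and the toy corpora are illustrative assumptions and may differ from the formula used by Kit and Liu.

```python
from collections import Counter

def normalized_ranks(tokens):
    """Map each word to a rank in (0, 1]; more frequent words rank higher.

    Words with the same frequency share a rank (assumption for this sketch).
    """
    counts = Counter(tokens)
    distinct = sorted(set(counts.values()))            # ascending frequencies
    rank_of_freq = {f: i + 1 for i, f in enumerate(distinct)}
    n = len(distinct)
    return {w: rank_of_freq[c] / n for w, c in counts.items()}

def rank_difference(domain_tokens, background_tokens):
    """Score each domain word by its domain rank minus its background rank."""
    dom = normalized_ranks(domain_tokens)
    bg = normalized_ranks(background_tokens)
    # Assumption: a word unseen in the background corpus has rank 0 there.
    return {w: r - bg.get(w, 0.0) for w, r in dom.items()}

# Toy corpora, invented for illustration only.
domain = "the court shall hear the appeal and the court may dismiss the appeal".split()
background = "the cat sat on the mat and the dog sat on the rug".split()

scores = rank_difference(domain, background)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting, words that are frequent in both corpora (such as "the") cancel out, while words that are frequent only in the domain corpus rise to the top of the sorted list.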
“…For instance, Kit and Liu (2008) write: "Terms are linguistic representations of domain-specific key concepts in a subject field that crystallise our expert knowledge about that subject."…”
Section: Defining a Term
Citation type: mentioning
confidence: 99%
“…Then, each candidate receives a value calculated by some statistical or hybrid measure or some combination of measures (and/or heuristics) [2][3][4][5]. These measures may be the candidate frequency or the accounting of distribution (e.g., the weirdness measure [7]) or occurrence probability (e.g., glossEx [3]) of candidates in a domain corpus and in a general language corpus.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
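
As a rough illustration of the distribution-based measures mentioned in the statement above, weirdness is commonly defined as the ratio of a word's relative frequency in the domain corpus to its relative frequency in a general-language corpus. The sketch below follows that definition; the additive smoothing that avoids division by zero, the function name, and the toy data are assumptions made for this example.

```python
from collections import Counter

def weirdness(domain_tokens, general_tokens, smoothing=1.0):
    """Ratio of relative frequencies: domain corpus over general corpus."""
    dom = Counter(domain_tokens)
    gen = Counter(general_tokens)
    n_dom = len(domain_tokens)
    n_gen = len(general_tokens)
    result = {}
    for word, count in dom.items():
        rel_dom = count / n_dom
        # Assumption: simple additive smoothing so unseen words do not divide by zero.
        rel_gen = (gen[word] + smoothing) / (n_gen + smoothing)
        result[word] = rel_dom / rel_gen
    return result

# Toy data, invented for illustration only.
w = weirdness(["court", "court", "the", "the"], ["the", "the", "the", "cat"])
```

A word that is common in the domain corpus but rare or absent in the general corpus receives a high ratio, which is the intuition behind using it as a termhood signal.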
“…After that, they apply measures or some combinations of measures (and/or heuristics) to form a rank of candidates [1][2][3][4][5]. Then, domain experts and/or terminologists analyze the rank in order to choose a threshold at which the candidates that have values above this threshold are selected as true terms.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
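
The thresholding step described in the statement above, together with the precision-on-top-k evaluation quoted earlier, can be sketched as follows. The scores, the threshold, and the gold term set are hypothetical values chosen for illustration; in practice the threshold is picked by domain experts or terminologists.

```python
def select_terms(scores, threshold):
    """Return candidates scoring above the threshold, best first."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, score in ordered if score > threshold]

def precision_at_k(ranked_candidates, gold_terms, k):
    """Fraction of the top-k candidates that appear in the gold term set."""
    top = ranked_candidates[:k]
    if not top:
        return 0.0
    return sum(1 for word in top if word in gold_terms) / len(top)

# Hypothetical scores and gold terms, for illustration only.
example_scores = {"court": 0.67, "appeal": 0.67, "shall": 0.33, "the": 0.0}
selected = select_terms(example_scores, threshold=0.5)
gold = {"court", "appeal"}
precision = precision_at_k(selected + ["shall"], gold, k=3)
```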