Measuring mono-word termhood by rank difference via corpus comparison

Kit, Chunyu; Liu, Xiaoyue

doi:10.1075/term.14.2.05kit

Cited by 41 publications

(33 citation statements)

References 37 publications

(50 reference statements)


“…As illustrated in Figure 7, the normalized rank measures as given in formulas (m) and (n) achieves a similar performance with the enhanced rank measures. And the effect of count on it is not as much as that on formulas (i) and (j), which has been pointed out in [7] and also been exemplified by the curve lines in Figures 7 and 8. …”

Section: Performances Of Measures Based On Rank Figure 5 Performance (mentioning)

confidence: 88%

“…Kit and Liu [7] present an approach to quantifying the termhood of a term candidate as its rank difference in a domain and a background corpus via corpus comparison. With Hong Kong legal texts as the domain corpus, this approach achieves a precision of 97.0% on the top 1000 candidates and a precision of 96.1% on the top 10% candidates of the sorted term list.…”

Section: Comparison By Statistics (mentioning)

confidence: 99%
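
The rank-difference idea described in the statement above can be illustrated with a short sketch. This is not the authors' implementation: the normalization of ranks to (0, 1], the shared rank for words of equal frequency, the rank of 0 for words absent from the background corpus, and the function names `normalized_ranks`, `termhood`, and `rank_candidates` are all assumptions made for illustration.

```python
from collections import Counter


def normalized_ranks(tokens):
    """Map each word to a rank in (0, 1]; more frequent words rank higher.

    Assumption: words with the same frequency share one rank, and ranks
    are divided by the number of distinct frequency values.
    """
    counts = Counter(tokens)
    distinct_freqs = sorted(set(counts.values()))  # ascending
    rank_of_freq = {f: i + 1 for i, f in enumerate(distinct_freqs)}
    n = len(distinct_freqs)
    return {w: rank_of_freq[c] / n for w, c in counts.items()}


def termhood(domain_tokens, background_tokens):
    """Score each domain word by its rank in the domain corpus minus its
    rank in the background corpus.

    Assumption: a word missing from the background corpus gets rank 0.
    """
    domain = normalized_ranks(domain_tokens)
    background = normalized_ranks(background_tokens)
    return {w: r - background.get(w, 0.0) for w, r in domain.items()}


def rank_candidates(domain_tokens, background_tokens):
    """Return candidate words sorted by descending termhood score
    (ties broken alphabetically)."""
    scores = termhood(domain_tokens, background_tokens)
    return sorted(scores, key=lambda w: (-scores[w], w))
```

Under these assumptions, a word that is frequent in both corpora (such as a function word) scores near zero, while a word that is frequent in the domain corpus but rare or absent in the background corpus scores high.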


Statistical termhood measurement for mono-word terms via corpus comparison

Liu

Kit

2009

2009 International Conference on Machine Learning and Cybernetics

Self Cite


Abstract: This paper examines the performance of a number of statistical measures for mono-word termhood within a corpus comparison framework. These measures are defined in terms of the frequency, information, and rank of a term candidate in a domain and a background corpus. The evaluation results from our experiments reveal interesting characteristics of each metric and verify the outstanding performance of those based on enhanced rank and information in identifying true terms.



“…For instance, Kit and Liu (2008) write: "Terms are linguistic representations of domain-specific key concepts in a subject field that crystallise our expert knowledge about that subject."…”

Section: Defining a Term (mentioning)

confidence: 99%

Variation as a cognitive device

Pecman

2014

TERM


In Languages for Specific Purposes (LSPs), variation and term formation are often seen as related phenomena, variation being interpreted as a sign of neology. In scientific discourse though, variation can play specific roles, thereby giving a different dimension to neology as a linguistic process than generally implied in terminological studies. The well-known referential function, consisting of creating new designations for naming new concepts, can be set aside in scientific texts to create space for what we define as the cognitive function: a situation where a scientist purposefully employs term variation as a means for theorising and better explaining a given concept. We argue that Halliday's "grammatical metaphor" and "given-new" information theory provide an interesting background for understanding scientific term formation processes, and the ensuing issue of terminological variation. Consequently, in this article, we try to place the phenomenon of neology and of terminological variation within the framework of discourse analysis, by devising a method for probing sequential behaviour of terminological variants across text sections. Additionally, this study aims to improve building lexical resources within the ARTES terminological and phraseological multilingual database project, which serves as a support for developing lexicographical and translational skills in students in specialised translation.


“…Then, each candidate receives a value calculated by some statistical or hybrid measure or some combination of measures (and/or heuristics) [2][3][4][5]. These measures may be the candidate frequency or the accounting of distribution (e.g., the weirdness measure [7]) or occurrence probability (e.g., glossEx [3]) of candidates in a domain corpus and in a general language corpus.…”

Section: Related Work (mentioning)

confidence: 99%

“…After that, they apply measures or some combinations of measures (and/or heuristics) to form a rank of candidates [1][2][3][4][5]. Then, domain experts and/or terminologists analyze the rank in order to choose a threshold at which the candidates that have values above this threshold are selected as true terms.…”

Section: Introduction (mentioning)

confidence: 99%

The Main Challenge of Semi-Automatic Term Extraction Methods

Conrado

Pardo

Rezende

2015

Natural Language Processing and Cognitive Science


Term extraction is the basis for many tasks such as building of taxonomies, ontologies and dictionaries, for translation, organization and retrieval of textual data. This paper studies the main challenge of semi-automatic term extraction methods, which is the difficulty to analyze the rank of candidates created by these methods. With the experimental evaluation performed in this work, it is possible to fairly compare a wide set of semi-automatic term extraction methods, which allows other future investigations. Additionally, we discovered which level of knowledge and threshold should be adopted for these methods in order to obtain good precision or F-measure. The results show there is not a unique method that is the best one for the three used corpora.

