1995
DOI: 10.1017/s1351324900000048

Technical terminology: some linguistic properties and an algorithm for identification in text

Abstract: This paper identifies some linguistic properties of technical terminology, and uses them to formulate an algorithm for identifying technical terms in running text. The grammatical properties discussed are preferred phrase structures: technical terms consist mostly of noun phrases containing adjectives, nouns, and occasionally prepositions; rarely do terms contain verbs, adverbs, or conjunctions. The discourse properties are patterns of repetition that distinguish noun phrases that are technical terms, especial…
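
The two properties named in the abstract, a preferred phrase structure and repetition, can be illustrated with a small sketch. It is not the paper's implementation: the single-letter tags (`A` adjective, `N` noun, `P` preposition, `O` anything else), the function name `candidate_terms`, the regular expression, and the `min_count` and `max_len` defaults are all assumptions chosen for illustration, and the input is assumed to be already part-of-speech tagged.

```python
import re
from collections import Counter

# Assumed encoding of the structure described in the abstract: adjectives and
# nouns, at most one noun-preposition link, and a final noun.
# Tags are hypothetical single letters: A, N, P, O (other).
TERM_PATTERN = re.compile(r"[AN]*(?:NP)?[AN]*N")


def candidate_terms(tagged, min_count=2, max_len=4):
    """Return multi-word phrases that fit the tag pattern and repeat.

    tagged: list of (word, tag) pairs with single-letter tags.
    min_count, max_len: illustrative thresholds, not values from the paper.
    """
    if any(len(tag) != 1 for _, tag in tagged):
        raise ValueError("each tag must be a single letter")
    tags = "".join(tag for _, tag in tagged)
    words = [word.lower() for word, _ in tagged]
    counts = Counter()
    for start in range(len(words)):
        longest = min(start + max_len, len(words))
        # Only spans of two or more words are considered.
        for end in range(start + 2, longest + 1):
            if TERM_PATTERN.fullmatch(tags[start:end]):
                counts[" ".join(words[start:end])] += 1
    # Repetition filter: keep phrases seen at least min_count times.
    return {term: n for term, n in counts.items() if n >= min_count}


tagged = [
    ("the", "O"), ("central", "A"), ("processing", "N"), ("unit", "N"),
    ("executes", "O"), ("code", "N"), ("and", "O"),
    ("the", "O"), ("central", "A"), ("processing", "N"), ("unit", "N"),
    ("halts", "O"),
]
terms = candidate_terms(tagged)
```

Because every matching span is counted, sub-phrases such as "processing unit" are reported alongside "central processing unit"; a real extractor would need a policy for nested candidates.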

Cited by 540 publications
(339 citation statements)
References 16 publications
“…This filtering step eliminates only 0.9% of the valid gene and protein names from consideration and significantly increases the accuracy of the recognized terms. It is possible to replace this simple term extractor with a more sophisticated method, relying on frequency and distributional information (Justeson and Katz, 1995) and complementing that with techniques that utilize approximate string matching to identify term variants and new terms similar to old ones, such as the method proposed by Krauthammer et al (2000) specifically for biological terms.…”
Section: Data Collection (mentioning)
confidence: 99%
“…We drop all entries according to this heuristic rule. Naturally, many far more sophisticated algorithms can be employed here, e.g., matching grammatical pattern devised to select true keywords, which could be employed, when the knowledge about the part-of-speech classification is available [12,13]. However, the simple stopword method worked well enough for us, especially that we are mostly aiming at labels for further applications in machine learning and hence we can afford having certain fraction of "bogus labels".…”
Section: Processing Methods (mentioning)
confidence: 99%
“…a dependency pattern with semantic (UMLS) class labels for ARG1 and ARG2. To find the appropriate semantic label for a complex argument, we first extract its main term using a linguistic filter adapted from (Justeson and Katz, 1995). The filter extracts a sub-string of the argument that matches the following POS-tag regular expression:…”
Section: Learning Patterns (mentioning)
confidence: 99%