2008
DOI: 10.1007/978-3-540-89197-0_57

Advancing Topic Ontology Learning through Term Extraction

Abstract: This paper presents a novel methodology for topic ontology learning from text documents. The proposed methodology, named OntoTermExtraction, is based on OntoGen, a semi-automated tool for topic ontology construction, upgraded by using an advanced terminology extraction tool in an iterative, semi-automated ontology construction process. This process consists of (a) document clustering to find the nodes in the topic ontology, (b) term extraction from document clusters, (c) populating the term vocabulary and ke…

Cited by 15 publications (12 citation statements)
References 6 publications
“…
Heuristic                  AUC      Interval
(21) outFreqRelSum         95,33%   0,35%
(19) outFreqRelRF          95,24%   0,55%
(20) outFreqRelSVM         95,06%   1,26%
(18) outFreqRelCS          94,96%   1,30%
(17) outFreqSum            94,96%   0,70%
(8)  tfidfAvg              94,87%   0,00%
(15) outFreqRF             94,73%   1,53%
(16) outFreqSVM            94,70%   2,06%
(14) outFreqCS             94,67%   1,80%
(4)  freqDomnRatioMin      94,36%   0,62%
(10) tfidfDomnSum          93,85%   0,35%
(6)  freqDomnProdRel       93,71%   0,40%
(13) simDomnRatioMin       93,58%   0,00%
(7)  tfidfSum              93,58%   0,00%
(9)  tfidfDomnProd         93,47%   0,39%
(5)  freqDomnProd          93,42%   0,44%
(3)  freqRatio             93,35%   5,23%
(23) appearInAllDomn       93,31%   6,69%
(12) simDomnProd           93,27%   0,00%
(1)  freqTerm              93,20%   0,50%
(2)  freqDoc               93,19%   0,50%
(11) simAvgTerm            92,71%   0,00%
(22) random                50,00%   50,00%
…”
Section: Comparison Of the Heuristics (mentioning)
confidence: 99%
“…Other interesting approaches to identifying concepts include methods such as KeyGraph [13], which extract terms and concepts with minimal assumptions or background knowledge, even from individual documents. Other alternatives are using domain ontologies which could be, for example, semi-automatically retrieved by a combination of tools such as OntoGen and TermExtractor [9].…”
Section: Background Knowledge (mentioning)
confidence: 99%
“…If a term is used consistently and with high frequency across domain documents [27], it can be regarded as a domain concept. Domain Consensus can be defined as follows. For a term t in domain documents D = (d_1, d_2, …, d_n):…”
Section: Statistic Filtering (mentioning)
confidence: 99%
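
The citation statement above on Domain Consensus is cut off before its formula, so the exact definition used by the citing paper is not available here. As a rough illustration only, the sketch below assumes one common entropy-based reading: a term scores higher when its occurrences are spread evenly across the domain documents. The function name `domain_consensus`, the use of base-2 logarithms, and the token-list input format are all assumptions made for this example, not details from the quoted text.

```python
import math
from collections import Counter


def domain_consensus(term, documents):
    """Entropy (in bits) of a term's frequency distribution over documents.

    Assumed formulation, not taken from the quoted citation statement:
    P(t, d_i) is the share of the term's occurrences that fall in document
    d_i, and the score is -sum(P * log2(P)) over documents containing it.
    """
    # Count how often the term occurs in each tokenized document.
    counts = [Counter(doc)[term] for doc in documents]
    total = sum(counts)
    if total == 0:
        return 0.0  # the term never occurs, so there is nothing to measure

    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["ontology", "term", "ontology"],
    ["ontology", "cluster"],
    ["term", "ontology", "ontology", "ontology"],
]

for t in ("ontology", "cluster"):
    print(t, round(domain_consensus(t, docs), 3))
```

Under this assumed formulation, a term confined to a single document scores 0, and a term spread evenly over n documents scores log2(n), so a threshold on the score would favor terms that the whole domain uses.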
“…(2) Besides the properties mentioned above, there are others, including Structural Relevance and Miscellaneous [27]; unfortunately, they are not operational, so they were not taken into consideration. Another important property is Lexical Cohesion [27,28,29], which measures terms with length above 2. We have not adopted it in this paper, because the extracted concepts show that almost all terms are short.…”
Section: Statistic Filtering (mentioning)
confidence: 99%