Automatic Keyword Extraction from Individual Documents

Rose, Stuart; Engel, Dave; Cramer, Nick; Cowley, Wendy

doi:10.1002/9780470689646.ch1

Cited by 767 publications

(592 citation statements)

References 9 publications

Mentioning: 575

“…The labels which contain only one word after such filtering are removed. Then we use a simple heuristic observation that good label candidates usually do not contain stopword in the middle, see the study [11] for more details. One notable exception here is the word of.…”

Section: Processing Methods (mentioning)

confidence: 99%
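The filtering heuristic in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stopword list, the helper name `keep_label`, and the sample candidates are hypothetical and do not come from the cited study. Only the two rules (drop one-word labels, reject labels with a stopword in the middle unless it is "of") are taken from the quoted text.

```python
# Hypothetical stopword list for illustration; the list used in the cited
# study is not given in the statement.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to", "with"}

# The one exception named in the statement: "of" may appear inside a label.
ALLOWED_INSIDE = {"of"}


def keep_label(label: str) -> bool:
    """Return True if a candidate label passes both filtering rules."""
    words = label.lower().split()
    # Rule 1: labels consisting of a single word are removed.
    if len(words) < 2:
        return False
    # Rule 2: no stopword strictly between the first and last word,
    # except for the allowed word "of".
    middle = words[1:-1]
    return not any(w in STOPWORDS and w not in ALLOWED_INSIDE for w in middle)


# Hypothetical candidate labels.
candidates = [
    "entropy",
    "density of states",
    "search for the higgs",
    "neural network",
]
kept = [c for c in candidates if keep_label(c)]
print(kept)
```

Here "entropy" is dropped as a one-word label and "search for the higgs" is rejected because of the stopwords in the middle, while "density of states" survives through the "of" exception.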

Tagging Scientific Publications Using Wikipedia and Natural Language Processing Tools

Łopuszyński; Bolikowski

2014

Communications in Computer and Information Science


Abstract. In this work, we compare two simple methods of tagging scientific publications with labels reflecting their content. Wikipedia is employed as the first source of labels; the second label set is constructed from the noun phrases occurring in the analyzed corpus. We examine the statistical properties and the effectiveness of both approaches on a dataset consisting of abstracts from 0.7 million scientific documents deposited in the ArXiv preprint collection. We believe that the obtained tags can later be applied as useful document features in various machine learning tasks (document similarity, clustering, topic modelling, etc.).


“…The co-occurrence of document terms in a graph-based representation is central also in [17], where the relevance of terms is computed on the base of word frequency, word degree, and ratio of degree to frequency. Degree is a measure devised to favor words that occur frequently and in longer candidate keywords.…”

Section: Related Work (mentioning)

confidence: 99%
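The three word-level measures named in the statement above (frequency, degree, and the ratio of degree to frequency) can be illustrated with a short Python sketch. The assumptions here are mine: the document is taken to be already split into candidate keywords (lists of words), a word's degree is computed as the summed length of the candidates it appears in (so a word's co-occurrence with itself is counted), and the function name `word_scores` and the sample candidates are hypothetical. This is not presented as the reference implementation of the cited method.

```python
from collections import defaultdict


def word_scores(phrases):
    """Compute (frequency, degree, degree/frequency) for every word.

    `phrases` is a list of candidate keywords, each a list of words.
    Assumption: degree adds the length of the containing candidate for
    each occurrence, so words in longer candidates get a higher degree.
    """
    freq = defaultdict(int)
    deg = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            deg[word] += len(phrase)
    return {w: (freq[w], deg[w], deg[w] / freq[w]) for w in freq}


# Hypothetical candidate keywords for illustration.
phrases = [
    ["linear", "constraints"],
    ["linear", "diophantine", "equations"],
    ["constraints"],
]
scores = word_scores(phrases)
for word, (f, d, ratio) in sorted(scores.items()):
    print(f"{word}: freq={f} deg={d} deg/freq={ratio}")
```

In this toy input, "diophantine" occurs once inside a three-word candidate and gets the highest ratio, while "constraints" occurs twice but mostly in short candidates and gets a lower one. That matches the statement's remark that degree favors words occurring frequently and in longer candidate keywords.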

“…Additionally, document-oriented methods "scale to vast collections and can be applied in many contexts to enrich IR systems and analysis tools" [17].…”

Section: Semantic Metrics For Keyword Extraction (mentioning)

confidence: 99%

Semantic Measures for Keywords Extraction

Colla; Mensa; Radicioni

2017

AI*IA 2017 Advances in Artificial Intelligence


Abstract. In this paper we introduce a minimalist hypothesis for keyword extraction: keywords can be extracted from text documents by considering the concepts underlying document terms. Furthermore, central concepts are identified as the concepts that are more related to title concepts. Specifically, we propose five metrics, diverse in nature, to compute the centrality of concepts in the document body with respect to those in the title. Finally, we report on an experiment over a popular data set of human-annotated news articles; the results confirm the soundness of our hypothesis.


“…it does not use any external knowledge resources, including Wikipedia. The approach is inspired by RAKE (Rose et al 2010) and KEA (Witten et al 1999).…”

Section: Extraction Of Abstract (mentioning)

confidence: 99%

Intelligent information processing for building university knowledge base

et al. 2016


There are many ready-to-use software solutions for building institutional scientific information platforms, most of which have functionality well suited to repository needs. However, there have already been discussions about various problems with institutional digital libraries. As a remedy, an approach that is researcher-centric (rather than document-centric) has been proposed recently in some systems. This paper is devoted to research aimed at tools for building knowledge bases for university research. We focus on the AI methods that have been elaborated and applied practically within our platform for building such knowledge bases. In particular, we present a novel approach to data acquisition and the semantic enrichment of the acquired data. In addition, we present the algorithms applied in the real-life system for expert profiling and retrieval.

