2018
DOI: 10.1007/978-3-319-76941-7_63

A Text Feature Based Automatic Keyword Extraction Method for Single Documents

Abstract: In this work, we propose a lightweight approach for keyword extraction and ranking based on an unsupervised methodology to select the most important keywords of a single document. To understand the merits of our proposal, we compare it against RAKE, TextRank and SingleRank methods (three well-known unsupervised approaches) and the baseline TF.IDF, over four different collections to illustrate the generality of our approach. The experimental results suggest that extracting keywords from documents using our meth…

Cited by 90 publications (66 citation statements)
References: 5 publications
“…The great importance of using both statistics and contexts info is confirmed by recent methods such as YAKE (Campos et al, 2018b) and the method proposed by Won, Martins, and Raimundo (2019). YAKE, besides the term's position/frequency, also uses new statistical metrics that capture context information and the spread of the terms in the document.…”
Section: Statistics-based Methods (mentioning)
confidence: 99%
“…We adopted the same evaluation procedure as used for the series of results recently introduced by YAKE authors [6]. Five fold cross validation was used to determine the overall performance, for which we measured Precision, Recall and F1 score, with the latter being reported in Table 2.…”
Section: Experimental Setting (mentioning)
confidence: 99%
“…Five fold cross validation was used to determine the overall performance, for which we measured Precision, Recall and F1 score, with the latter being reported in Table 2. Keywords were stemmed prior to evaluation. As the number of keywords in the gold standard document is not equal to the number of extracted keywords (in our experiments k=10), in the recall we divide the correctly extracted keywords by the number of keywords parameter k, if in the gold standard number of keywords is higher than k. Selecting default configuration.…”
Section: Experimental Setting (mentioning)
confidence: 99%
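
The recall adjustment described in the citation statement above can be illustrated with a minimal Python sketch. It is not the citing authors' code: the function name `precision_recall_f1_at_k` is hypothetical, lowercasing stands in for the stemming step they mention, and the handling of gold standards with fewer than k keywords (dividing by the gold count) is an assumption, since the statement only specifies the case where the gold standard has more than k keywords.

```python
def precision_recall_f1_at_k(extracted, gold, k=10):
    """Score the top-k extracted keywords against a gold-standard list.

    Hypothetical helper for illustration only. Lowercasing is a stand-in
    for the stemming mentioned in the quoted statement (assumption).
    """
    # Keep the first k extracted keywords, normalized and de-duplicated in order.
    top_k = list(dict.fromkeys(kw.strip().lower() for kw in extracted[:k]))
    gold_set = {kw.strip().lower() for kw in gold}

    correct = sum(1 for kw in top_k if kw in gold_set)

    precision = correct / len(top_k) if top_k else 0.0

    # Adjusted recall: when the gold standard holds more than k keywords,
    # divide by k instead of the gold count, because at most k can be found.
    denominator = min(len(gold_set), k)
    recall = correct / denominator if denominator else 0.0

    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    extracted = [f"kw{i}" for i in range(10)]
    # 12 gold keywords, 5 of which appear among the 10 extracted ones.
    gold = [f"kw{i}" for i in range(5)] + [f"g{i}" for i in range(7)]
    print(precision_recall_f1_at_k(extracted, gold, k=10))
```

With 12 gold keywords and k=10, the recall denominator is 10 rather than 12, so a method is not penalized for gold keywords it could never have returned within its k slots.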
“…The next approach uses Yet Another Keyword Extractor (YAKE) [13], which is a statistical method for multi-lingual keyphrase extraction. Being an unsupervised method, YAKE avoids the problem of the long training process of other supervised methods and does not depend on any dictionaries for topic extraction.…”
Section: Interest Identification Using Yake (mentioning)
confidence: 99%