Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020
DOI: 10.1145/3383583.3398517

Large-Scale Evaluation of Keyphrase Extraction Models

Abstract: Keyphrase extraction models are usually evaluated under different, not directly comparable, experimental setups. As a result, it remains unclear how well proposed models actually perform, and how they compare to each other. In this work, we address this issue by presenting a systematic large-scale analysis of state-of-the-art keyphrase extraction models involving multiple benchmark datasets from various sources and domains. Our main results reveal that state-of-the-art models are in fact still challenged by sim…

Cited by 11 publications (14 citation statements)
References 30 publications
“…However, this test set still allows for evaluation of how much important information these vocabularies can extract. Our performance evaluation uses a widely used exact match evaluation method [3,21], where only the exact matches with the gold standard are considered as true positives. Specifically, $\mathrm{Precision} = \frac{\text{Number of Matched in Abstract}_i}{\text{Number of Extracted in Abstract}_i}$, $\mathrm{Recall} = \frac{\text{Number of Matched in Abstract}_i}{\text{Number of Annotated in Abstract}_i}$.…”
Section: Evaluation Of Phrase Extraction Based On Human Annotated Data
Citation type: mentioning
confidence: 99%
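The exact-match precision and recall described in the statement above can be sketched for a single abstract as follows. This is a minimal illustration, not the citing authors' code: the function name `exact_match_scores`, the lowercasing and whitespace stripping applied before comparison, and the sample phrases are all assumptions made for the example.

```python
def exact_match_scores(extracted, annotated):
    """Return (precision, recall) for one abstract under exact matching.

    Assumption: phrases are compared after lowercasing and stripping
    surrounding whitespace; the quoted text does not say how phrases
    are normalized.
    """
    extracted_set = {p.strip().lower() for p in extracted}
    annotated_set = {p.strip().lower() for p in annotated}
    matched = extracted_set & annotated_set  # true positives only

    # Guard against empty sets so the ratios are always defined.
    precision = len(matched) / len(extracted_set) if extracted_set else 0.0
    recall = len(matched) / len(annotated_set) if annotated_set else 0.0
    return precision, recall


# Hypothetical example data for one abstract.
extracted = ["keyphrase extraction", "neural network", "benchmark datasets"]
annotated = ["keyphrase extraction", "benchmark datasets",
             "evaluation", "digital libraries"]

precision, recall = exact_match_scores(extracted, annotated)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the three extracted phrases appear in the annotated set of four, so precision is 2/3 and recall is 2/4.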
“…We follow these guidelines strictly when it comes to the use of identical datasets and gold-standard keyword sets, but somewhat deviate from them when it comes to the employment of identical preprocessing techniques and parameter settings for different approaches. Since all unsupervised approaches operate on a set of keyphrase candidates extracted from the input document, Gallina et al. (2020) argue that the extraction of these candidates and other parameters should be identical (e.g., they select the sequences of adjacent nouns with one or more preceding adjectives of length up to five words in order to extract keyword candidates) for a fair comparison between algorithms. On the other hand, we are more interested in comparison between keyword extraction approaches instead of algorithms alone and argue that the distinct keyword candidate extraction techniques are inseparable from the overall approach and should arguably be optimized for each distinct algorithm.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
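The candidate selection rule mentioned in the statement above (adjacent nouns with one or more preceding adjectives, up to five words) might look roughly like the sketch below. It reads the quoted wording literally as the pattern ADJ+ NOUN+, which is an assumption. The function name `adj_noun_candidates`, the `"ADJ"`/`"NOUN"` tag labels, the choice to skip over-long spans rather than truncate them, and the pre-tagged sample sentence are likewise hypothetical. A real pipeline would obtain the tags from a part-of-speech tagger.

```python
def adj_noun_candidates(tagged, max_len=5):
    """Collect spans matching ADJ+ NOUN+ with at most max_len tokens.

    tagged: list of (token, tag) pairs; tags are assumed to be
    coarse labels such as "ADJ" and "NOUN".
    """
    candidates = []
    i, n = 0, len(tagged)
    while i < n:
        start = i
        # Consume a run of adjectives.
        while i < n and tagged[i][1] == "ADJ":
            i += 1
        adj_end = i
        # Consume the run of nouns that follows.
        while i < n and tagged[i][1] == "NOUN":
            i += 1
        has_adj = adj_end > start
        has_noun = i > adj_end
        # Assumption: spans longer than max_len are skipped, not truncated.
        if has_adj and has_noun and i - start <= max_len:
            candidates.append(" ".join(tok for tok, _ in tagged[start:i]))
        if i == start:
            i += 1  # token was neither ADJ nor NOUN; move on
    return candidates


# Hypothetical pre-tagged sentence.
tagged = [
    ("large", "ADJ"), ("scale", "NOUN"), ("evaluation", "NOUN"),
    ("of", "ADP"),
    ("keyphrase", "NOUN"), ("extraction", "NOUN"), ("models", "NOUN"),
    ("on", "ADP"),
    ("multiple", "ADJ"), ("benchmark", "NOUN"), ("datasets", "NOUN"),
]
print(adj_noun_candidates(tagged))
```

Under this literal reading, the noun-only span "keyphrase extraction models" is not selected because no adjective precedes it. Allowing zero adjectives would be a different rule and would change the candidate set.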
“…Many processes in various fields can benefit from the successful extraction of keywords or keyphrases, such as document clustering (Shubankar et al., 2011; Kim and Gil, 2019; Karpagam and Saradha, 2019), text classification (Hulth and Megyesi, 2006; Meng et al., 2019), information retrieval problems (Ji et al., 2019; Boudin et al., 2020), such as query expansion (Song et al., 2006) and faceted search, text summarization (Zhang et al., 2004; Litvak and Last, 2008; Song et al., 2019), entity recognition (Du et al., 2018), and event detection (Hossny et al., 2020). The important role of keyphrases in methods from various fields (such as those above), combined with the growth in the amount of digital textual information on the Internet (online digital libraries, electronic newspapers, online magazines, customer reviews on e-commerce platforms, etc.)…”
Section: Discussion
Citation type: unclassified
“…Several tasks can benefit from accurate keyword or keyphrase extraction outcomes, including document clustering (Shubankar et al., 2011; Kim and Gil, 2019; Karpagam and Saradha, 2019), text classification (Hulth and Megyesi, 2006; Meng et al., 2019), information retrieval tasks (Ji et al., 2019; Boudin et al., 2020), such as query expansion (Song et al., 2006) and faceted search, text summarization (Zhang et al., 2004; Litvak and Last, 2008; Song et al., 2019), entity recognition (Du et al., 2018), and event detection (Hossny et al., 2020). The crucial role of keyphrases in these tasks along with the increasing online digital textual information (e.g., scientific digital libraries, e-newspapers, online magazines, customer reviews on e-commerce platforms, blogs, etc.)…”
Section: The Task Of Keyphrase Extraction
Citation type: mentioning
confidence: 99%