Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management 2022
DOI: 10.5220/0011546600003335

PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction

Abstract: Keyphrase extraction is the process of automatically selecting a small set of most relevant phrases from a given text. Supervised keyphrase extraction approaches need large amounts of labeled training data and perform poorly outside the domain of the training data (Bennani-Smires et al., 2018). In this paper, we present PatternRank, which leverages pretrained language models and part-of-speech for unsupervised keyphrase extraction from single documents. Our experiments show PatternRank achieves higher precisio…

Citations: cited by 14 publications (3 citation statements)
References: 17 publications
“…This analysis of term occurrence used the KeyphraseVectorizer package in Python. KeyphraseVectorizer extracts key phrases matching specific parts of speech (in our case a noun phrase) in a document collection and counts their occurrences per document in the collection (Schopf et al, 2022). The phrases relevant to policy process, policy design, or policy evaluation were then identified and classified based on our knowledge of public policy.…”
Section: Methods (mentioning)
confidence: 99%
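The counting step described in this statement can be illustrated with a minimal sketch. It is not the KeyphraseVectorizer API and does no part-of-speech matching: it assumes the candidate noun phrases have already been extracted, and it only counts their occurrences per document. The function name `count_phrases`, the sample documents, and the phrase list are all made up for illustration.

```python
import re
from collections import Counter


def count_phrases(documents, phrases):
    """Return one Counter per document: occurrences of each given phrase.

    Assumption: `phrases` were extracted beforehand (for example as noun
    phrases); this sketch performs no part-of-speech tagging itself.
    """
    counts = []
    for doc in documents:
        text = doc.lower()
        doc_counts = Counter()
        for phrase in phrases:
            # Whole-word, case-insensitive match of the phrase.
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            n = len(re.findall(pattern, text))
            if n:
                doc_counts[phrase] = n
        counts.append(doc_counts)
    return counts


# Hypothetical documents and phrases, for illustration only.
docs = [
    "Policy design shapes outcomes. Good policy design needs evidence.",
    "Policy evaluation follows the policy process.",
]
phrases = ["policy design", "policy evaluation", "policy process"]
result = count_phrases(docs, phrases)
```

Each entry of `result` corresponds to one document, so the per-document counts can then be reviewed and classified by hand, as the statement describes.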
“…In recent years, KGs have emerged as an approach for semantically representing knowledge about real-world entities in a machine-readable format. In contrast to semantic text representations, which may be used primarily for similarity comparisons in a variety of different NLP downstream tasks (Braun et al 2021; Schopf et al 2021a; Schopf et al 2021b; Schopf et al 2022d; Schopf et al 2022c; Schopf et al 2022a; Schneider et al 2022b), KGs can additionally capture all kinds of semantic relationships between different entities. Despite the rising popularity of KGs, there is still no common understanding of what exactly a KG is.…”
Section: Knowledge Graph Concept (mentioning)
confidence: 99%
“…JI computes the Jaccard index for sample i and represents the annotator agreement for that sample. While calculating the intersection and union of the two sets, we considered the exact string match between the elements of the sets as used in Schopf, Klimek, and Matthes (2022). We used Avg.…”
Section: Validation Of Annotation (mentioning)
confidence: 99%
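The agreement measure mentioned in this statement can be sketched as follows, assuming each annotator's output for a sample is a list of keyphrase strings compared by exact string match. The function name `jaccard_index`, the choice to return 1.0 for two empty sets, and the sample annotations are assumptions for illustration, not details taken from the citing paper.

```python
def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| with exact string matching."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty annotation sets count as full agreement.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical annotations for one sample.
annotator_1 = ["keyphrase extraction", "language model", "part of speech"]
annotator_2 = ["keyphrase extraction", "language models", "part of speech"]
score = jaccard_index(annotator_1, annotator_2)
```

Because matching is exact, "language model" and "language models" count as different elements, so only two of the four distinct phrases are shared.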