PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction

Schopf, Tim; Klimek, Simon; Matthes, Florian

doi:10.5220/0011546600003335

Cited by 14 publications

(3 citation statements)

References 17 publications


“…This analysis of term occurrence used the KeyphraseVectorizer package in Python. KeyphraseVectorizer extracts key phrases matching specific parts of speech (in our case a noun phrase) in a document collection and counts their occurrences per document in the collection (Schopf et al, 2022). The phrases relevant to policy process, policy design, or policy evaluation were then identified and classified based on our knowledge of public policy.…”

Section: Methods (mentioning)

confidence: 99%

Brown-out of policy ideas? A bibliometric review and computational text analysis of research on energy access

Goyal, Howlett

2023

Front. Sustain. Energy Policy


Introduction: The target of universal access to affordable, reliable, and modern energy services, key for individual, social, and economic well-being, is unlikely to be achieved by 2030 based on the current trend. Public policy will likely need to play a key role in accelerating progress in this regard. Although perspectives from the field of policy studies can support this effort, to what extent they have been employed in the literature on energy access remains unclear.

Methods: This study analyzed nearly 7,500 publications on energy access through a combination of bibliometric review and computational text analysis of their titles and abstracts to examine whether and how they have engaged with public policy perspectives, specifically, policy process research, policy design studies, and the literature on policy evaluation.

Results: We discovered 27 themes in the literature on energy access, but public policy was not among them. Subsequently, we identified 23 themes in a new analysis of the 1,751 publications in our original dataset, mentioning “policy” in their title or abstract. However, few of them engaged with public policy, and even those that did comprised a rather small share of the literature. Finally, we extracted phrases pertaining to public policy in this reduced dataset, but found limited mention of terms related to the policy process, policy design, or policy evaluation.

Discussion: While to some extent this might reflect the multidisciplinary nature of the research on energy access, a manual review of the abstracts of select publications corroborated this finding. Also, it shed light on how the literature has engaged with public policy and helped identify opportunities for broadening and deepening policy relevant research on energy access. We conclude that, despite their relevance to energy access, public policy perspectives have infrequently and unevenly informed existing research on the topic, and call on scholars in both communities to address this gap in the future.


“…In recent years, KGs have emerged as an approach for semantically representing knowledge about real-world entities in a machine-readable format. In contrast to semantic text representations, which may be used primarily for similarity comparisons in a variety of different NLP downstream tasks (Braun et al 2021; Schopf et al 2021a; Schopf et al 2021b; Schopf et al 2022d; Schopf et al 2022c; Schopf et al 2022a; Schneider et al 2022b), KGs can additionally capture all kinds of semantic relationships between different entities. Despite the rising popularity of KGs, there is still no common understanding of what exactly a KG is.…”

Section: Knowledge Graph Concept (mentioning)

confidence: 99%

Informing Possible Future Worlds

2024


For more than 35 years, Ulrich Frank has shaped Wirtschaftsinformatik as a scientific discipline through thoughtful and sophisticated research contributions and his numerous contributions to the scientific community.

Informing Possible Future Worlds is the Festschrift in honour of Ulrich Frank on the occasion of his 65th birthday. The Festschrift includes twenty-three essays written by friends, colleagues, and fellow researchers in recognition of Ulrich's contributions to Wirtschaftsinformatik research and the scientific community. Each essay is a personal and unique 'birthday present' to Ulrich Frank written exclusively for the Festschrift. From original research contributions to more personal reflections, the essays cover a wide range of topics, themes, and fields, just like Ulrich Frank's contributions.

For more than 35 years, Ulrich Frank has shaped and promoted Wirtschaftsinformatik as a scientific discipline through his thoughtful and sophisticated research contributions and his numerous contributions to the scientific community. He has initiated, engaged in and promoted scientific discourse nationally and internationally, inspired, encouraged and guided young researchers, and played an outstanding part in the Wirtschaftsinformatik community.

Starting with his Diplomarbeit in 1982 on 'Die Problematisierung von Zielbildungsprozessen in Unternehmungen durch die Betriebswirtschaftslehre unter besonderer Berücksichtigung des Verhältnisses von Wissenschaft und Praxis' (supervised by Erwin Grochla at the Universität zu Köln), Ulrich has cultivated his interest not only in Betriebswirtschaftslehre and Wirtschaftsinformatik, Software Engineering and Conceptual Modelling, but also in the Philosophy of Science, the Philosophy of Language, the Sociology of Knowledge and in Organisational Sociology. For his Diplomarbeit, he read Albert, Feyerabend, Mittelstrass, and Popper, among others, and reflected on organisational goal-setting ('Zielbildungsprozesse') from multiple, complementary perspectives, a lifelong leitmotiv he later incorporated in his Multi-Perspective Enterprise Modelling (MEMO) method.

An avid reader and bibliophile, Ulrich must have learned about Niklas Luhmann, the German sociologist working at Universität Bielefeld, Ulrich's hometown, likely in his teens, and started his own personal discourse with Luhmann's writings, and from there on expanded his readings. At about the same time, Ulrich started to program computers, developed and sold his first software application, and, most importantly, discovered Smalltalk, the programming language created by Alan Kay and others at Xerox PARC's Learning Research Group, which has strongly influenced him to this day and which he has admired ever since.

Following his discovery of Smalltalk, he took a deep dive into Object-Oriented Programming and Object-Oriented Modelling that took him to take a minor in Angewandte Informatik (Applied Informatics) at the Universität zu Köln during his graduate studies. His doctoral research with Alfred Kieser at Universität Mannheim ...


“…JI computes the Jaccard index for sample i and represents the annotator agreement for that sample. While calculating the intersection and union of the two sets, we considered the exact string match between the elements of the sets as used in Schopf, Klimek, and Matthes (2022). We used Avg.…”

Section: Validation Of Annotation (mentioning)

confidence: 99%

Theme-Driven Keyphrase Extraction to Analyze Social Media Discourse

Romano, Sharif, Basak et al.

2024

ICWSM


Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis, a gap remains in applying keyphrase extraction to health-related content. Keyphrase extraction is used to identify salient concepts in social media discourse without being constrained by predefined entity classes. This paper introduces a theme-driven keyphrase extraction framework tailored for social media, a pioneering approach designed to capture clinically relevant keyphrases from user-generated health texts. Themes are defined as broad categories determined by the objectives of the extraction task. We formulate this novel task of theme-driven keyphrase extraction and demonstrate its potential for efficiently mining social media text for the use case of treatment for opioid use disorder. This paper leverages qualitative and quantitative analysis to demonstrate the feasibility of extracting actionable insights from social media data and efficiently extracting keyphrases using minimally supervised NLP models. Our contributions include the development of a novel data collection and curation framework for theme-driven keyphrase extraction and the creation of SuboxoPhrase, the first dataset of its kind comprising human-annotated keyphrases from a Reddit community. We also identify the scope of minimally supervised NLP models to extract keyphrases from social media data efficiently. Lastly, we found that a large language model (ChatGPT) outperforms unsupervised keyphrase extraction models, showcasing its efficacy in this task.
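
The citation statement from this paper's Validation Of Annotation section describes a per-sample Jaccard index over two annotators' keyphrase sets, with set membership decided by exact string match. A minimal sketch of that computation is given below. The function names, the example phrases, and the choice to score two empty annotations as 1.0 are assumptions made for illustration only; they are not taken from the cited paper or from PatternRank.

```python
from typing import Iterable, List, Sequence


def jaccard_index(phrases_a: Iterable[str], phrases_b: Iterable[str]) -> float:
    """Jaccard index of two keyphrase collections using exact string match.

    No lowercasing, stemming, or whitespace normalisation is applied, so
    "Suboxone" and "suboxone" count as different phrases.
    """
    set_a, set_b = set(phrases_a), set(phrases_b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty annotations are treated as full agreement.
        return 1.0
    return len(set_a & set_b) / len(union)


def per_sample_agreement(
    annotator_a: Sequence[Iterable[str]],
    annotator_b: Sequence[Iterable[str]],
) -> List[float]:
    """Return one Jaccard index per sample i (hypothetical helper)."""
    if len(annotator_a) != len(annotator_b):
        raise ValueError("both annotators must label the same samples")
    return [jaccard_index(a, b) for a, b in zip(annotator_a, annotator_b)]


if __name__ == "__main__":
    # Made-up annotations for two samples, purely for demonstration.
    a = [["suboxone", "withdrawal"], ["taper schedule"]]
    b = [["suboxone", "dose"], ["Taper schedule"]]
    print(per_sample_agreement(a, b))
```

Because matching is exact, the second sample above scores zero even though the two phrases differ only in capitalisation; any normalisation step would have to be added before the sets are built.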
