Michael Färber scite author profile

In recent years, DBpedia, Freebase, OpenCyc, Wikidata, and YAGO have been published as noteworthy large, cross-domain, and freely available knowledge graphs. Although extensively in use, these knowledge graphs are hard to compare against each other in a given setting. Thus, it is a challenge for researchers and developers to pick the best knowledge graph for their individual needs. In our recent survey [2], we devised and applied data quality criteria to the above-mentioned knowledge graphs. Furthermore, we proposed a framework for finding the most suitable knowledge graph for a given setting. With this paper, we intend to ease access to our in-depth survey by presenting simplified rules that map individual data quality requirements to specific knowledge graphs. However, this paper does not intend to replace the decision-support framework introduced in [2]. For an informed decision on which knowledge graph (KG) is best for you, we still refer to our in-depth survey.

The 5G candidate waveform race: a comparison of complexity and performance

Gerzaguet

Bartzoudis

Baltar

et al. 2017

J Wireless Com Network

167

115

The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data

Färber

2019

Citation recommendation: approaches and datasets

Färber

Jatowt

2020

Int J Digit Libr

Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing scientific texts on the other hand, citation recommendation has emerged as an important research topic. In recent years, several approaches and evaluation data sets have been presented. However, to the best of our knowledge, no literature survey has been conducted explicitly on citation recommendation. In this article, we give a thorough introduction to automatic citation recommendation research. We then present an overview of the approaches and data sets for citation recommendation and identify differences and commonalities using various dimensions. Last but not least, we shed light on the evaluation methods and outline general challenges in the evaluation and how to meet them. We restrict ourselves to citation recommendation for scientific publications, as this document type has been studied the most in this area. However, many of the observations and discussions included in this survey are also applicable to other types of text, such as news articles and encyclopedic articles.

Future Mobile Communication Networks: Challenges in the Design and Operation

Marsch

Raaf

Szufarska

et al. 2012

IEEE Veh. Technol. Mag.

To Cite, or Not to Cite? Detecting Citation Contexts in Text

Färber

Thiemann

Jatowt

2018

On Emerging Entity Detection

Färber

Rettinger

Asmar

2016

unarXive: a large scholarly data set with publications’ full-text, annotated in-text citations, and links to metadata

Saier

Färber

2020

Scientometrics

In recent years, scholarly data sets have been used for various purposes, such as paper recommendation, citation recommendation, citation context analysis, and citation context-based document summarization. The evaluation of approaches to such tasks and their applicability in real-world scenarios heavily depend on the data set used. However, existing scholarly data sets are limited in several regards. In this paper, we propose a new data set based on all publications from all scientific disciplines available on arXiv.org. Apart from providing the papers' plain text, in-text citations were annotated via global identifiers. Furthermore, citing and cited publications were linked to the Microsoft Academic Graph, providing access to rich metadata. Our data set consists of over one million documents and 29.2 million citation contexts. The data set, which is made freely available for research purposes, can not only enhance the future evaluation of research paper-based and citation context-based approaches, but also serve as a basis for new ways to analyze in-text citations, as we show prototypically in this article.

Michael Färber

Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO
