Marieke van Erp scite author profile

Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition

This shared task focuses on identifying unusual, previously-unseen entities in the context of emerging discussions. Named entities form the basis of many modern approaches to other tasks (like event clustering and summarization), but recall on them is a real problem in noisy text, even among annotators. This drop tends to be due to novel entities and surface forms. Take for example the tweet "so.. kktny in 30 mins?!": even human experts find the entity kktny hard to detect and resolve. The goal of this task is to provide a definition of emerging and of rare entities and, based on that definition, datasets for detecting these entities. The task as described in this paper evaluated the ability of participating entries to detect and classify novel and emerging named entities in noisy text.

Analysis of named entity recognition and linking for tweets

Derczynski

Maynard

Rizzo

et al. 2015

Information Processing & Management

276

186

Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.

Building event-centric knowledge graphs from news

Rospocher

Erp

Vossen

et al. 2016

Journal of Web Semantics

127

109

SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering

Minard

Speranza

Agirre

et al. 2015

This paper describes the outcomes of the TimeLine task (Cross-Document Event Ordering), which was organised within the Time and Space track of SemEval-2015. Given a set of documents and a set of target entities, the task consisted of building a timeline for each entity by detecting the events involving that entity, anchoring them in time and ordering them. The TimeLine task goes a step further than previous evaluation challenges by requiring participant systems to perform both event coreference and temporal relation extraction across documents. Four teams submitted the output of their systems to the four proposed subtracks for a total of 13 runs, the best of which obtained an F1-score of 7.85 in the main track (timeline creation from raw text).

NewsReader: Using knowledge resources in a cross-lingual reading machine to generate more knowledge from massive streams of news

Vossen

Agerri

Aldabe

et al. 2016

Knowledge-Based Systems

Semantic technologies for historical research: A survey

Meroño‐Peñuela

Ashkpour

Erp

et al. 2014

The diversity of sources of information for historical research fills a continuum between individual accounts, transmitted for instance in letters but also in poems and songs, and aggregated statistical information, as in the case of historical censuses. Historiography shares this heterogeneity and complexity of source material with other humanities fields. Methods to order this rich material, and by this ordering also to determine the way history is told, are as old as history writing and vary among the different branches (or subdisciplines) of historical research. In this paper we focus on the work of historians, and even more specifically on economic and social history. At the crossroads of information and historical sciences, so-called Historical Informatics or History and Computing emerged as a specific profession during the 1990s. Together with computer scientists, historians created a research agenda concentrating on questions of how to create, design, enrich, edit, retrieve, analyze and present historical information with the help of information technology. There exist a number of problems and challenges in this field; some of them are closely related to semantics and meaning of knowledge in general. In this context, Semantic Web technologies can be applied in a number of situations, environments and applications of historical computing and historical information science. However, only a small number of contributions have yet considered these technologies. In this survey we present an overview of the past and present problems, challenges and advances of historical science computing, from the perspective of Semantic technology.

Lessons learnt from the Named Entity rEcognition and Linking (NEEL) challenge series

Rizzo

Pereira

Varga

et al. 2017

The large number of tweets generated daily is providing decision makers with means to obtain insights into recent events around the globe in near real-time. The main barrier to extracting such insights is the impossibility of manually inspecting such a diverse and dynamic body of information. This problem has attracted the attention of industry and research communities, resulting in algorithms for the automatic extraction of semantics in tweets and linking them to machine-readable resources. While a tweet is superficially comparable to any other textual content, it hides a complex and challenging structure that requires domain-specific computational approaches for mining semantics from it. The NEEL challenge series, established in 2013, has contributed to the collection of emerging trends in the field and the definition of standardised benchmark corpora for entity recognition and linking in tweets, ensuring high quality labelled data that facilitates comparisons between different approaches. This article reports the findings and lessons learnt through an analysis of specific characteristics of the created corpora, limitations, lessons learnt from the different participants and pointers for furthering the field of entity recognition and linking in tweets.

LOTUS: Adaptive Text Search for Big Linked Data

Ilievski

Beek

Erp

et al. 2016
