Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-long.364
Benchmarking Scalable Methods for Streaming Cross Document Entity Coreference

Abstract: Streaming cross document entity coreference (CDC) systems disambiguate mentions of named entities in a scalable manner via incremental clustering. Unlike other approaches for named entity disambiguation (e.g., entity linking), streaming CDC allows for the disambiguation of entities that are unknown at inference time. Thus, it is well-suited for processing streams of data where new entities are frequently introduced. Despite these benefits, this task is currently difficult to study, as existing approaches are e…
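To make the incremental-clustering setup concrete, here is a minimal sketch of a streaming CDC loop: each incoming mention is embedded, compared against the centroids of existing entity clusters, and either assigned to the closest cluster or used to start a new one. The cosine-similarity affinity, the threshold value, and the centroid-averaging scheme are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

class StreamingCDC:
    """Toy incremental clusterer: one cluster per (possibly unseen) entity.

    `threshold` is an assumed similarity cutoff; the benchmarked systems use
    learned mention embeddings and tuned thresholds.
    """

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.centroids = []   # running mean embedding per cluster
        self.sizes = []       # number of mentions per cluster

    def add_mention(self, embedding):
        """Assign a new mention embedding to a cluster, creating one if needed."""
        embedding = np.asarray(embedding, dtype=float)
        if self.centroids:
            sims = [cosine(embedding, c) for c in self.centroids]
            best = int(np.argmax(sims))
            if sims[best] > self.threshold:
                # Update the running centroid of the matched cluster.
                n = self.sizes[best]
                self.centroids[best] = (self.centroids[best] * n + embedding) / (n + 1)
                self.sizes[best] = n + 1
                return best
        # No sufficiently similar cluster: the mention introduces a new entity.
        self.centroids.append(embedding)
        self.sizes.append(1)
        return len(self.centroids) - 1
```

Because clusters are created on the fly, mentions of entities unknown at inference time simply open new clusters instead of being forced onto an existing knowledge-base entry.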

Cited by 7 publications (14 citation statements) · References 19 publications

Citation statements:
“…NIL clustering aims at grouping together mentions referring to the same entity. Several algorithms have been proposed, based on GNNs ([8]) or hierarchical clustering ([9]). Incremental management to add NIL mentions to a KB has been considered in [10]…”
Section: NIL Mentions Management
confidence: 99%
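As a rough illustration of the hierarchical-clustering route mentioned in this snippet, the sketch below groups NIL mention embeddings with average-linkage agglomerative clustering under a cosine-distance cutoff. The embeddings, the cutoff value, and the function name are assumptions for illustration, not the cited systems' actual pipelines.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_nil_mentions(embeddings, distance_cutoff=0.3):
    """Group NIL mention embeddings into candidate entity clusters.

    embeddings: (n_mentions, dim) array of mention representations
                (assumed to come from some encoder; not specified here).
    distance_cutoff: cosine-distance threshold below which mentions are
                     merged (an illustrative value, not a tuned one).
    Returns an array of cluster ids, one per mention.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    if len(embeddings) == 1:
        return np.array([1])
    # Average-linkage agglomerative clustering over cosine distances.
    tree = linkage(embeddings, method="average", metric="cosine")
    return fcluster(tree, t=distance_cutoff, criterion="distance")
```

Each resulting cluster could then be added to the knowledge base as a candidate new entity, which is the kind of incremental KB management the snippet attributes to [10].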
“…Dutta and Weikum [11] explicitly tackle CDC in combination with EL by applying clustering to bag-of-words representations of entity mentions. More recently, Logan IV et al. [25] evaluate greedy nearest-neighbour and hierarchical clustering strategies for CDC, however without explicitly evaluating them with respect to EL.…”
Section: Related Work
confidence: 99%
“…To produce an initial mention clustering, we follow Logan IV et al. [25] and use a greedy nearest-neighbour clustering. Given the mention affinity threshold τ_m, the mentions M are grouped into clusters C so that two mentions m, m′ ∈ M belong to the same cluster if φ(m, m′) > τ_m.…”
Section: Cluster Initialization
confidence: 99%
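A minimal sketch of that greedy nearest-neighbour rule follows: each mention is compared to the already-clustered mentions and joins the cluster of its most similar neighbour when the affinity exceeds τ_m, otherwise it opens a new cluster. The cosine-based affinity φ and the threshold value are assumptions for illustration; the cited work defines its own affinity function and tuning.

```python
import numpy as np

def greedy_nn_clustering(mentions, tau_m=0.7, affinity=None):
    """Greedy nearest-neighbour clustering of mention embeddings.

    mentions: list of embedding vectors, one per mention.
    tau_m:    affinity threshold; a mention joins an existing cluster only if
              its best affinity phi(m, m') exceeds tau_m (illustrative value).
    affinity: pairwise affinity phi; cosine similarity is assumed here.
    Returns a list of cluster ids aligned with `mentions`.
    """
    if affinity is None:
        affinity = lambda u, v: float(
            np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
        )

    cluster_ids, next_id = [], 0
    for i, m in enumerate(mentions):
        best_j, best_phi = None, float("-inf")
        for j in range(i):  # only previously clustered mentions
            phi = affinity(m, mentions[j])
            if phi > best_phi:
                best_j, best_phi = j, phi
        if best_j is not None and best_phi > tau_m:
            cluster_ids.append(cluster_ids[best_j])   # join nearest neighbour's cluster
        else:
            cluster_ids.append(next_id)               # open a new cluster
            next_id += 1
    return cluster_ids
```

Applying the rule greedily over the mention sequence, as above, is one way to realise "two mentions belong to the same cluster if φ(m, m′) > τ_m"; an offline variant would instead take connected components of the thresholded affinity graph.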
“…Most of the related work on cross-document IE has focused on the coreference resolution task [277][278][279][280][281][282][283][284][285][286][287][288]. This task consists in identifying coreferent mentions across a set of documents given as input.…”
Section: Effect of Pre-training on New Corpora
confidence: 99%