Proceedings of the 6th Conference on Message Understanding (MUC6 '95), 1995
DOI: 10.3115/1072399.1072405
A model-theoretic coreference scoring scheme

Abstract: This note describes a scoring scheme for the coreference task in MUC6. It improves on the original approach by: (1) grounding the scoring scheme in terms of a model; (2) producing more intuitive recall and precision scores; and (3) not requiring explicit computation of the transitive closure of coreference. The principal conceptual difference is that we have moved from a syntactic scoring model based on following coreference links to an approach defined by the model theory of those links.
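As a concrete illustration of claim (3), below is a minimal sketch of the link-based scoring the abstract describes: recall is Σ(|S| − |p(S)|) / Σ(|S| − 1) over key chains S, where p(S) is the partition of S induced by the response chains, and precision swaps the roles of key and response. The representation (chains as sets of mention ids) and all function names are illustrative, not taken from the official MUC-6 scorer.

```python
# Sketch of the model-theoretic MUC scorer. Chains are sets of hashable
# mention ids. No transitive closure is ever materialised: each key chain
# is only intersected with the response chains to count partition blocks.

def _partition_size(chain, other_chains):
    """Number of blocks that `other_chains` splits `chain` into.

    Mentions of `chain` not covered by any other chain each form a
    singleton block.
    """
    covered = 0
    blocks = 0
    for other in other_chains:
        overlap = len(chain & other)
        if overlap:
            blocks += 1
            covered += overlap
    return blocks + (len(chain) - covered)  # uncovered mentions = singletons

def muc_score(key_chains, response_chains):
    """Return (recall, precision, f1) for the MUC link-based metric."""
    def side(gold, system):
        num = sum(len(c) - _partition_size(c, system) for c in gold)
        den = sum(len(c) - 1 for c in gold)
        return num / den if den else 0.0

    recall = side(key_chains, response_chains)
    precision = side(response_chains, key_chains)
    f1 = (2 * recall * precision / (recall + precision)
          if recall + precision else 0.0)
    return recall, precision, f1

# Key links A-B-C-D; response links A-B and C-D:
key = [{"A", "B", "C", "D"}]
response = [{"A", "B"}, {"C", "D"}]
print(muc_score(key, response))  # recall 2/3, precision 1.0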

Cited by 398 publications (299 citation statements).
“…There does not appear to be a single standard evaluation metric in the coreference resolution community. We opted to use the following three: muc-6 [38], ceaf [23], and b-cubed [1], which seem to be the most widely accepted metrics. All three metrics compute Recall, Precision and F-Scores on aligned gold-standard and resolver-tool coreference chains.…”
Section: Automatic Extrinsic Evaluation Of Clarity
confidence: 99%
“…Evaluation Metrics. We compute the three most popular performance metrics for coreference resolution: MUC (Vilain et al., 1995), B-Cubed (Bagga and Baldwin, 1998), and Entity-based CEAF (CEAF φ4) (Luo, 2005). As is commonly done in CoNLL shared tasks (Pradhan et al., 2012), we employ the average F1 score (CoNLL F1) of these three metrics for comparison purposes.…”
Section: Experiments and Results
confidence: 99%
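The CoNLL F1 mentioned in these statements is simply the unweighted mean of the three metrics' F1 scores. The following sketch pairs that average with the mention-level B-Cubed metric (Bagga and Baldwin, 1998); the chain representation, the handling of unaligned mentions as singletons, and all names are assumptions for illustration, not the behaviour of the reference scorers the quoted papers used.

```python
# Sketch of B-Cubed: per-mention precision/recall, averaged over all
# mentions. Chains are sets of mention ids; each mention is assumed to
# occur in at most one chain on each side.

def b_cubed(key_chains, response_chains):
    """Return (recall, precision, f1), averaging per-mention scores."""
    key_of = {m: c for c in key_chains for m in c}
    resp_of = {m: c for c in response_chains for m in c}
    mentions = set(key_of) | set(resp_of)
    if not mentions:
        return 0.0, 0.0, 0.0

    recall = precision = 0.0
    for m in mentions:
        k = key_of.get(m, {m})    # unaligned mentions act as singletons
        r = resp_of.get(m, {m})
        overlap = len(k & r)
        recall += overlap / len(k)
        precision += overlap / len(r)
    n = len(mentions)
    recall, precision = recall / n, precision / n
    f1 = (2 * recall * precision / (recall + precision)
          if recall + precision else 0.0)
    return recall, precision, f1

def conll_f1(muc_f1, b3_f1, ceaf_e_f1):
    """Unweighted average of the three F1 scores, as in CoNLL-2012."""
    return (muc_f1 + b3_f1 + ceaf_e_f1) / 3.0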
“…anaphora. We compared our system to this baseline using the unweighted average of F1-measure over the B-CUBED (Bagga and Baldwin, 1998), MUC (Vilain et al., 1995), and CEAF (Luo, 2005) metrics, the standard evaluation metrics for coreference resolution. We used the scripts provided by the i2b2 shared task organizers for this purpose.…”
Section: Discussion
confidence: 99%