Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems 2021
DOI: 10.18653/v1/2021.eval4nlp-1.5

SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation

Abstract: To address a looming crisis of unreproducible evaluation for named entity recognition, we propose guidelines and introduce SeqScore, a software package to improve reproducibility. The guidelines we propose are extremely simple and center around transparency regarding how chunks are encoded and scored. We demonstrate that despite the apparent simplicity of NER evaluation, unreported differences in the scoring procedure can result in changes to scores that are both of noticeable magnitude and statistically signi…

Cited by 4 publications
(4 citation statements)
References 28 publications
“…This section describes the features of SeqScore, focusing on the newest features that enable it to assist in many NER data workflows. Previous work (Palen-Michel et al., 2021) has described the scoring features of SeqScore, so they are not discussed in detail in this paper. SeqScore is released via PyPI (https://pypi.org/project/seqscore/) and development occurs on GitHub (https://github.com/bltlab/seqscore).…”
Section: SeqScore's Features (mentioning)
confidence: 99%
“…SeqScore supports several options to work with a wide variety of data files: setting the file encoding (older files often use ISO-8859-1), ignoring comment lines (which some files use for sentence provenance information), and automatic detection of field delimiters (older files use spaces, newer ones use tabs). Different strategies can be set regarding how to deal with invalid label transitions like O I-PER in BIO (for more details see Palen-Michel et al., 2021). SeqScore can maintain or discard the document boundaries specified using -DOCSTART- sentences inside CoNLL-format files, which enables scoring a reference with document boundaries against system output without them.…”
Section: Overview (mentioning)
confidence: 99%
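
To make the invalid-transition case concrete, the sketch below shows one possible repair strategy for BIO labels: an I- label that does not continue a chunk of the same type (as in O I-PER) is rewritten as a B- label, so it starts a new chunk. The function name repair_bio is hypothetical and this is not SeqScore's API; whether SeqScore's own repair options behave identically is an assumption.

```python
from typing import List, Optional


def repair_bio(labels: List[str]) -> List[str]:
    """Rewrite invalid I- labels as B- labels (illustrative helper, not SeqScore's API)."""
    repaired: List[str] = []
    prev_type: Optional[str] = None  # entity type of the previous token, None if outside a chunk
    for label in labels:
        if label == "O":
            repaired.append(label)
            prev_type = None
            continue
        prefix, _, ent_type = label.partition("-")
        # An I- label is only valid if it continues a chunk of the same type.
        if prefix == "I" and prev_type != ent_type:
            label = "B-" + ent_type
        repaired.append(label)
        prev_type = ent_type
    return repaired


print(repair_bio(["O", "I-PER", "I-PER", "O"]))
```

Other strategies are possible, for example discarding the invalid chunk entirely, which is why reporting the strategy used matters for reproducibility.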
“…Evaluation for all models required extracted spans to match the annotation exactly in span and type to be correct. Evaluation was performed with SeqScore (Palen-Michel et al., 2021), using conlleval-style repair for invalid label sequences. All models were trained using an AMD 2990WX CPU and a single RTX 2080 Ti GPU.…”
Section: Licensing (mentioning)
confidence: 99%
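
The exact-match criterion described in the statement above can be sketched as follows: a predicted chunk counts as correct only if its start, end, and type all match a reference chunk. The helpers bio_to_spans and exact_match_scores are hypothetical names for illustration, not SeqScore functions, and the sketch assumes the label sequences are already valid BIO.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int, str]  # (start index, exclusive end index, entity type)


def bio_to_spans(labels: List[str]) -> Set[Span]:
    """Extract chunks from a valid BIO sequence (illustrative helper)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    ent_type = ""
    # The trailing "O" is a sentinel that closes a chunk ending at the last token.
    for i, label in enumerate(labels + ["O"]):
        if start is not None and not label.startswith("I-"):
            spans.add((start, i, ent_type))
            start = None
        if label.startswith("B-"):
            start, ent_type = i, label[2:]
    return spans


def exact_match_scores(ref: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching chunks."""
    ref_spans, pred_spans = bio_to_spans(ref), bio_to_spans(pred)
    true_pos = len(ref_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(ref_spans) if ref_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


reference = ["B-PER", "I-PER", "O", "B-LOC"]
predicted = ["B-PER", "I-PER", "O", "B-ORG"]
print(exact_match_scores(reference, predicted))
```

In this example the PER chunk matches exactly, while the final chunk has the right boundaries but the wrong type, so it counts as both a false positive and a false negative.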