2019
DOI: 10.1002/pra2.49
Introducing orbis: An extendable evaluation pipeline for named entity linking performance drill‐down analyses

Abstract: Most current evaluation tools are focused solely on benchmarking and comparative evaluations and thus only provide aggregated statistics such as precision, recall and F1‐measure to assess overall system performance. They do not offer comprehensive analyses up to the level of individual annotations. This paper introduces Orbis, an extendable evaluation pipeline framework developed to allow visual drill‐down analyses of individual entities, computed by annotation services, in the context of the text they appear in, …
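To make the contrast drawn in the abstract concrete, here is a minimal sketch, not Orbis's actual API: the Annotation class, the example DBpedia URIs and the function names are all illustrative assumptions. It shows how aggregated precision/recall/F1 collapse information that a per-annotation drill-down keeps available for inspection.

```python
# Minimal illustrative sketch (not Orbis's actual API).
# Entity annotations are modeled as (start, end, uri) triples.
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int
    end: int
    uri: str


def drill_down(gold: set, predicted: set):
    """Classify every individual annotation instead of only aggregating."""
    tp = gold & predicted   # correctly linked entities
    fp = predicted - gold   # spurious or wrongly linked entities
    fn = gold - predicted   # missed entities
    return tp, fp, fn


def aggregate(tp, fp, fn):
    """The aggregated statistics most evaluation tools report."""
    precision = len(tp) / (len(tp) + len(fp)) if tp or fp else 0.0
    recall = len(tp) / (len(tp) + len(fn)) if tp or fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold standard and annotation-service output.
gold = {Annotation(0, 5, "http://dbpedia.org/resource/Orbis"),
        Annotation(10, 16, "http://dbpedia.org/resource/Vienna")}
pred = {Annotation(0, 5, "http://dbpedia.org/resource/Orbis"),
        Annotation(20, 24, "http://dbpedia.org/resource/Graz")}

tp, fp, fn = drill_down(gold, pred)
print("aggregated (P, R, F1):", aggregate(tp, fp, fn))
print("false positives:", fp)   # drill-down: inspect each individual error
print("false negatives:", fn)
```

The aggregated numbers alone would only say that half the gold entities were found; the drill-down additionally exposes which entity was missed and which prediction was spurious, which is the kind of per-annotation inspection the paper argues for.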

Cited by 4 publications (3 citation statements)
References 3 publications
“…Appropriate benchmarking suites and gold standard data are key towards evaluating content extraction methods, identifying their strengths and weaknesses. We, therefore, have created a gold standard dataset that is used in conjunction with the Open Source Orbis benchmarking framework [23] to evaluate Harvest's performance.…”
Section: Discussion
confidence: 99%
“…GERBIL (Röder et al., 2018) standardizes ED evaluation over multiple datasets in a unifying framework, but does not define the training data and thus only focuses on comparing already-trained models. Similarly, a range of prior works have sought to refine and standardize ED evaluation (Waitelonis et al., 2019; Nait-Hamoud et al., 2021; Noullet et al., 2021; Odoni et al., 2019; van Erp and Groth, 2020; Braşoveanu et al., 2018). In contrast, ZELDA defines the full experimental setup, including training data, the entity vocabulary and other training signals.…”
Section: Related Work
confidence: 99%
“…Future work will focus on: (i) improving the slot filling performance by enhancing page segmentation, increasing the coverage of the proprietary knowledge graph used for entity linking, and fine-tuning the entity recognition component. Given the importance of the created benchmarking framework for the research and development process, we plan on (ii) further increasing its size and coverage; and (iii) integrating the gold standard with explainable benchmarking frameworks such as Orbis [9] to make it more accessible to third-party researchers.…”
Section: Outlook and Conclusion
confidence: 99%