Proceedings of the 4th Conference on Message Understanding - MUC4 '92 1992
DOI: 10.3115/1072064.1072067

MUC-4 evaluation metrics

Abstract: INTRODUCTION The MUC-4 evaluation metrics measure the performance of the message understanding systems. This paper describes the scoring algorithms used to arrive at the metrics as well as the improvements that were made to the MUC-3 methods. MUC-4 evaluation metrics were stricter than those used in MUC-3. Given the differences in scoring between MUC-3 and MUC-4, the MUC-4 systems' scores represent a larger improvement over MUC-3 performance than the numbers themselves suggest. The major improvements in the s…

Cited by 467 publications (278 citation statements)
References 1 publication
“…Different weights can be set for each measure (Precision and Recall), which gives flexibility in defining importance criteria (Chinchor, 1992; Sasaki, 2007). Chinchor (1992) defines the F-measure by Equation 3.…”
Section: F-measure
unclassified
“…Yellow Page style). We have carried out evaluation of this application using traditional IE metrics [8,22]: precision, recall, and f-score. An expert manually annotated 5 documents and we compared the results of the system annotations against this gold standard set.…”
Section: Fig 4 Obie For International Company Intelligence
mentioning
confidence: 99%
“…The F-measure provides a way of combining recall and prediction to get a single measure which falls between recall and precision. Thus, the F-measure is calculated as the harmonic mean of precision and recall and tends towards the lower of the two (Chinchor, 1992):…”
mentioning
confidence: 99%
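
The citation statements above describe the F-measure as a weighted combination of precision and recall that reduces to their harmonic mean when both are weighted equally. The equations the citing papers refer to are truncated on this page, so the following is only a minimal sketch based on the commonly cited form F = (β² + 1)PR / (β²P + R). The function name `f_measure` and the parameter name `beta` are illustrative assumptions, not names taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted F-measure in its commonly cited form (an assumption here,
    since the source equations are truncated on this page).

    beta = 1 gives the harmonic mean of precision and recall;
    beta > 1 weights recall more heavily, beta < 1 weights precision more.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if beta <= 0:
        raise ValueError("beta must be positive")

    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define F as zero by convention.
        return 0.0
    return (beta ** 2 + 1) * precision * recall / denominator


if __name__ == "__main__":
    # Equal weighting: the result sits well below the arithmetic mean (0.5)
    # and close to the lower of the two scores.
    print(f_measure(0.9, 0.1))
    # Recall weighted more heavily than precision.
    print(f_measure(0.5, 1.0, beta=2.0))
```

With equal weighting, a system with precision 0.9 and recall 0.1 scores far below the arithmetic mean of the two, which illustrates why the measure "tends towards the lower of the two".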