2011
DOI: 10.2478/v10108-011-0011-4

Hjerson: An Open Source Tool for Automatic Error Classification of Machine Translation Output

Abstract: We describe Hjerson, a tool for automatic classification of errors in machine translation output. The tool features the detection of five word-level error classes: morphological errors, reordering errors, missing words, extra words and lexical errors. As input, the tool requires original full-form reference translation(s) and hypothesis along with their corresponding base forms. It is also possible to use additional information on the word level (e.g. tags) in order to obtain more details. The tool provides…

Citations: cited by 44 publications (38 citation statements)
References: 7 publications
“…In this experiment we assess the performance of NMT versus PBMT systems on a set of error categories that correspond to five word-level error classes: inflection errors, reordering errors, missing words, extra words and incorrect lexical choices. These errors are detected automatically using the edit distance, word error rate (WER), precision-based and recall-based position-independent error rates (hPER and rPER, respectively) as implemented in Hjerson (Popović, 2011). These error classes are then defined as follows:…”
Section: Error Categories | Citation type: mentioning
confidence: 99%
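
The statement above names three measures: WER, hPER and rPER. A minimal sketch of how they could be computed for one sentence pair follows. It is not Hjerson's code: the function names are hypothetical, and the normalisation (WER and rPER by reference length, hPER by hypothesis length) is an assumption based on the common definitions of these rates.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: edit distance over reference length (assumed)."""
    return edit_distance(ref, hyp) / len(ref)


def rper(ref, hyp):
    """Recall-based PER: reference words with no counterpart in the hypothesis."""
    missing = Counter(ref) - Counter(hyp)
    return sum(missing.values()) / len(ref)


def hper(ref, hyp):
    """Precision-based PER: hypothesis words with no counterpart in the reference."""
    extra = Counter(hyp) - Counter(ref)
    return sum(extra.values()) / len(hyp)


# Illustrative sentence pair (invented for this sketch, not from the paper).
ref = "the cat sat on the mat".split()
hyp = "the cat on the mat sat quietly".split()

print("WER :", wer(ref, hyp))
print("rPER:", rper(ref, hyp))
print("hPER:", hper(ref, hyp))
```

In this pair the word "sat" is moved rather than lost, so it contributes to WER but not to rPER or hPER, while "quietly" appears only in the hypothesis and contributes to hPER. A word that counts as a WER error but not as a position-independent error is a plausible reordering candidate, which is the kind of distinction the five error classes rely on.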
“…A taxonomy which has been popular in MT is (Vilar et al, 2006). To avoid the necessity of calling in human evaluators every time an error analysis is to be performed there have also been work on automatic error classification (Popović and Burchardt, 2011). While simply counting errors seems less relevant for comparing machine translation to human translation, showing what type of errors occur can be useful.…”
Section: MT and Translation Studies | Citation type: mentioning
confidence: 99%
“…Due to the variation of language, ambiguity, etc., checking and evaluating MT output can be almost as difficult as the translation itself. Still, people have tried to automatically classify errors comparing MT output to reference translations or post-edited MT output using tools like Hjerson (Popovic, 2011).…”
Section: Introduction | Citation type: mentioning
confidence: 99%