Abstract. This paper describes principles for evaluation metrics for lexical components, and an implementation of them, based on requirements from practical information systems.
Evaluating information system components
The performance of a component in a complex processing pipeline can influence the function of downstream components, so end-to-end testing must also be performed on entire systems. Such testing uses approaches based on use cases, with target notions that validate the function of the system for the purpose for which it is built, as do many of the evaluation measures formulated in workshops at CLEF. But a task-based evaluation does not reveal the performance of individual components. Evaluation of knowledge-based components in an information system should be done systematically, ideally in ways similar to the unit tests done for other technical components, motivated by the need for a development and maintenance team to: