Abstract. This paper describes the setup of the Book Structure Extraction competition run at ICDAR 2009. The goal of the competition was to evaluate and compare automatic techniques for deriving structure information from digitized books, which could then be used to aid navigation inside the books. More specifically, the task that participants faced was to construct hyperlinked tables of contents for a collection of 1,000 digitized books. This paper describes the setup of the competition and its challenges. It introduces and discusses the book collection used in the task, the collaborative construction of the ground truth, the evaluation measures and the evaluation results. The paper also introduces a data set to be used freely for research evaluation purposes.
This paper introduces the Book Structure Extraction competition run at ICDAR 2009. The goal of the competition is to evaluate and compare automatic techniques for deriving structure information from digitized books, which could then be used to aid navigation inside the books. More specifically, the task that participants are faced with is to construct hyperlinked tables of contents for a collection of 1,000 digitized books. This paper describes the setup of the competition, the book collection used in the task, and the proposed measures for the evaluation. Results of the evaluation will be presented at the ICDAR 2009 conference and will be published in the INEX 2009 proceedings.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.