This paper describes a new system for extracting and classifying bibliography regions from the color image of a book cover. The system consists of three major components: preprocessing, color space segmentation, and text region extraction and classification. Preprocessing extracts the edge lines of the book, geometrically corrects the input image, and segments it into front cover, spine, and back cover. As in all color image processing research, segmentation of the color space is an essential step. The system uses the HSI color space instead of RGB. The color space is first segmented into achromatic and chromatic regions, and each of these is then segmented further to complete the color space segmentation. Text region extraction and classification follow. After fundamental features (stroke width and local label width) are detected, text regions are determined. By comparing the text regions on the front cover with those on the spine, all extracted text regions are classified into suitable bibliography categories (author, title, publisher, and other information) without applying OCR.
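The first step of the color space segmentation, separating achromatic from chromatic pixels in HSI space, can be illustrated with a minimal sketch. It uses the standard textbook RGB-to-HSI conversion; the function names and the threshold values (`sat_threshold`, `low`, `high`) are assumptions chosen for illustration and are not taken from the paper, which does not state its criteria here.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    # For r == g == b the hue is undefined; report hue 0 and saturation 0.
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if intensity == 0.0 or den == 0.0:
        return 0.0, 0.0, intensity
    saturation = max(0.0, 1.0 - min(r, g, b) / intensity)
    num = 0.5 * ((r - g) + (r - b))
    # Clamp to guard against rounding slightly outside acos's domain.
    ratio = max(-1.0, min(1.0, num / den))
    hue = math.degrees(math.acos(ratio))
    if b > g:
        hue = 360.0 - hue
    return hue, saturation, intensity


def is_achromatic(r, g, b, sat_threshold=0.1, low=0.1, high=0.9):
    """Hypothetical rule: a pixel is achromatic when it is weakly saturated,
    very dark, or very bright. The thresholds are illustrative assumptions."""
    _, saturation, intensity = rgb_to_hsi(r, g, b)
    return saturation < sat_threshold or intensity < low or intensity > high


def split_pixels(pixels):
    """Partition an iterable of (r, g, b) tuples into achromatic and chromatic lists."""
    achromatic, chromatic = [], []
    for px in pixels:
        (achromatic if is_achromatic(*px) else chromatic).append(px)
    return achromatic, chromatic
```

Under these assumed thresholds, a mid-gray pixel such as `(0.5, 0.5, 0.5)` falls in the achromatic group and a pure red pixel `(1.0, 0.0, 0.0)` in the chromatic group. The further segmentation of each group described in the abstract is not sketched here.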