Extracting text from image documents remains a challenging problem. This paper addresses text detection in scanned image documents, with the aim of improving on conventional techniques. It uses Optical Character Recognition (OCR) to extract the text present in the input image. Additional techniques, such as Maximally Stable Extremal Regions (MSER) to estimate scale and orientation and connected-component (CC) extraction, are used to enhance the extraction and retrieval process.
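To make the CC extraction step concrete, the following is a minimal sketch of connected-component labeling on a binarized image. It is a generic illustration and not the paper's implementation: the function name `connected_components`, the 4-connectivity choice, and the list-of-rows image representation are all assumptions made here. Each returned bounding box could serve as a candidate character region to pass on to an OCR stage.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes of 4-connected foreground regions.

    `binary` is a list of rows holding 0 (background) or 1 (foreground).
    Each box is (top, left, bottom, right) with inclusive coordinates.
    Hypothetical helper for illustration; not taken from the paper.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []

    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    image = [
        [1, 1, 0, 0, 0],
        [1, 0, 0, 1, 1],
        [0, 0, 0, 1, 1],
        [0, 1, 0, 0, 0],
    ]
    for box in connected_components(image):
        print(box)
```

In practice a library routine would usually replace this loop, and components would be filtered by size or aspect ratio before recognition; those choices depend on details the abstract does not state.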