Abstract. Although the internet offers a widespread platform for information interchange, day-to-day work in large companies still means the processing of tens of thousands of printed documents every day. This paper presents the system smartFIX which is a document analysis and understanding system developed by the DFKI spin-off INSIDERS. It permits the processing of documents ranging from fixed format forms to unstructured letters of any format. Apart from the architecture, the main components and system char…
“…In particular, special operations to edit for both bitmap and vectorial images are supported, such as copy and paste of bitmap images, creation and modification of vectorization components, etc. The interface is implemented in C++ using the Qt framework 4 and is distributed under the QPL license. This makes it easy to be upgraded or customized by anyone interested in its features.…”
Section: The Qgargui User Interface (mentioning)
confidence: 99%
“…For instance, there is good know-how in building flexible and versatile OCR and page reader systems, taking into account segmentation, feature extraction, classification, and various linguistic post-processing steps [3]. Extending the concept to various business documents, Dengel et al have also developed a very mature system, capable of adaptation to different types of documents [4,5]. Other specific application domains have their mature systems, such as bank check processing software [6], table recognition [7], or forms processing [8].…”
Abstract. This paper presents the main design and development issues of the Qgar software environment for graphics recognition applications. We aim at providing stable and robust implementations of state-of-the-art methods and algorithms, within an intuitive and user-friendly environment. The resulting software system is open, so that our applications can be easily interfaced with other systems, and, conversely, that third-party applications can be "plugged" into our environment with little effort. The paper also presents a quick tour of the various components of the Qgar environment, and concentrates on the usefulness of this kind of system for testing and evaluation purposes.
“…(cf. [1,2,5]) However, in IRS it is sensible to simply consider the creditor as the classification. Then, as explained above, the creditor can be used to retrieve from an online ERP-connection the list of possible invoice line items.…”
Section: Classification and Page Collation (mentioning)
confidence: 99%
“…Our consortium of four companies (their specific experiences are each in brackets) with experience in the IRS field, interim2000 (documents analysis marketing and consulting), Pylon (document process consulting), Integra (market analysis), and DFKI (document analysis science, [1,2,3]), detected the emergence of a demand for invoice reading systems (IRS) on the German market. Eleven companies supply systems for invoice reading, see Table 1.…”
Abstract. Companies order, receive, and pay for goods. Hence they continually receive and process invoices. For the most part these are printed on paper and are dealt with manually, so that each invoice after receipt involves processing costs of about 9 Euro on average. Often, human searching and typing of data into computer forms is required to transfer the information from paper into the computer, e.g. into ERP-systems, like SAP, that many companies run. This article presents the main results of our 300-page market survey of 11 suppliers of invoice reading systems (IRS), which automate the transfer of invoice data to ERP-systems. For the scientific IRS community we hope to provide the service of a better visibility of our discipline to potential investors and users.
“…Some REGEX spotting systems have been designed for electronical documents, using Natural Language Processing methods. 1,2 In this case, the REGEX spotting is rather straightforward as it consists in applying exact string matching methods on the ASCII text. When dealing with document images, a recognition step is needed in order to produce the ASCII transcription before processing the input data.…”
In this article, we propose a hybrid model for spotting words and regular expressions (REGEX) in handwritten documents. The model is made of the state-of-the-art BLSTM (Bidirectional Long Short Time Memory) neural network for recognizing and segmenting characters, coupled with a HMM to build line models able to spot the desired sequences. Experiments on the Rimes database show very promising results.