“…The weighted average method is the most commonly used grayscale method [16]. The following is the formula for the weighted average method: In some special cases, the obtained images may have problems such as angular tilt, unclear images, noise, or information loss [14], so before performing character recognition, it is necessary to pre-process the image to improve the accuracy of subsequent recognition. Common pre-processing operations include geometric transformation, image grayscale, binarization, denoising, etc.…”
In the development of web systems, data uploading is a relatively important function. The traditional method of uploading data is to manually fill out forms, but when the data to be uploaded mostly exist in the form of form images, and the form content contains a lot of similar field information and irrelevant edge information, using traditional methods is not only time-consuming and labor-intensive, but also prone to errors. This requires a technology that can automatically fill in complex form images. OCR is an optical character recognition technology that can convert images into digitized text data using computer vision methods. However, using this technology alone cannot complete the tasks of extracting relevant data and filling corresponding fields. To address this issue, this article proposes a method that combines OCR technology and Levenshtein multi-text similarity. This method can effectively solve the problem of data filling after parsing complex form images, and the application results of this method in web systems show that the filling accuracy for complex form images can reach over 90%.
“…The weighted average method is the most commonly used grayscale method [16]. The following is the formula for the weighted average method: In some special cases, the obtained images may have problems such as angular tilt, unclear images, noise, or information loss [14], so before performing character recognition, it is necessary to pre-process the image to improve the accuracy of subsequent recognition. Common pre-processing operations include geometric transformation, image grayscale, binarization, denoising, etc.…”
In the development of web systems, data uploading is a relatively important function. The traditional method of uploading data is to manually fill out forms, but when the data to be uploaded mostly exist in the form of form images, and the form content contains a lot of similar field information and irrelevant edge information, using traditional methods is not only time-consuming and labor-intensive, but also prone to errors. This requires a technology that can automatically fill in complex form images. OCR is an optical character recognition technology that can convert images into digitized text data using computer vision methods. However, using this technology alone cannot complete the tasks of extracting relevant data and filling corresponding fields. To address this issue, this article proposes a method that combines OCR technology and Levenshtein multi-text similarity. This method can effectively solve the problem of data filling after parsing complex form images, and the application results of this method in web systems show that the filling accuracy for complex form images can reach over 90%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.