2019 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2019.00022

DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks

Cited by 55 publications (170 citation statements). References 36 publications.

“…Our implementation takes around 0.67 to 0.72 seconds to process a 1024x960 image. We compare our results with Ma et al. [15] and Das and Ma et al. [6] on real-world document images. Compared with previous methods, our approach can rectify various distortions while removing the background and replacing it with transparency (the visual comparison is shown in Fig.…”
Section: Experimental Setup and Results (mentioning)
confidence: 85%
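As a rough illustration of the rectification step this excerpt describes, here is a minimal sketch assuming a PyTorch-style backward map predicted by some network; the function name `unwarp_with_alpha`, the tensor shapes, and the alpha-channel trick for background transparency are illustrative assumptions, not code from [6] or [15].

```python
import torch
import torch.nn.functional as F

def unwarp_with_alpha(image, backward_map, mask):
    # image: (1, 3, H, W) distorted photo, values in [0, 1]
    # backward_map: (1, H, W, 2) sampling grid in [-1, 1] (hypothetical network output)
    # mask: (1, 1, H, W) foreground probability for the document region
    # Resample the distorted photo at the locations given by the backward map.
    rectified = F.grid_sample(image, backward_map, align_corners=False)
    # Resample the mask the same way and attach it as an alpha channel so the
    # non-document background becomes transparent.
    alpha = F.grid_sample(mask, backward_map, align_corners=False)
    return torch.cat([rectified, alpha], dim=1)  # (1, 4, H, W) RGBA

# Dummy inputs just to show the shapes; a real backward map would come from a trained model.
img = torch.rand(1, 3, 256, 256)
grid = torch.rand(1, 256, 256, 2) * 2 - 1
msk = torch.ones(1, 1, 256, 256)
print(unwarp_with_alpha(img, grid, msk).shape)  # torch.Size([1, 4, 256, 256])
```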
“…Because the generated dataset is quite different from real-world images, [15] trained on its dataset generalizes poorly when tested on real-world images. Das and Ma et al. [6] argue that a dewarping model does not always perform well when trained only on a synthetic dataset built with 2D deformations, so they created the Doc3D dataset, which provides multiple types of pixel-wise document image ground truth by combining real-world documents with rendering software. Meanwhile, [6] proposed a dewarping network and a refinement network to correct the geometry and shading of document images.…”
Section: Related Work (mentioning)
confidence: 99%
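The excerpt mentions a dewarping network followed by a refinement network. Below is a minimal, hypothetical sketch of such a two-stage pipeline; the module names, layer sizes, and the multiplicative shading correction are assumptions made for illustration, not the architecture of [6].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class GeometricNet(nn.Module):
    """Predicts a per-pixel backward map (2 channels in [-1, 1]) for unwarping."""
    def __init__(self):
        super().__init__()
        self.features = conv_block(3, 32)
        self.head = nn.Conv2d(32, 2, 1)
    def forward(self, x):
        return torch.tanh(self.head(self.features(x)))

class RefinementNet(nn.Module):
    """Predicts a multiplicative shading correction for the rectified image."""
    def __init__(self):
        super().__init__()
        self.features = conv_block(3, 32)
        self.head = nn.Conv2d(32, 3, 1)
    def forward(self, x):
        return torch.sigmoid(self.head(self.features(x)))

image = torch.rand(1, 3, 128, 128)
grid = GeometricNet()(image).permute(0, 2, 3, 1)        # (1, H, W, 2) sampling grid
rectified = F.grid_sample(image, grid, align_corners=False)
deshaded = rectified * RefinementNet()(rectified)       # illumination-corrected output
```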
“…Extensive experiments on several datasets, i.e., the Doc3D, DRIC, and DocUNet datasets, demonstrate the effectiveness and superiority of our DocTr over existing state-of-the-art methods on both tasks. Notably, on the DocUNet benchmark [22], we achieve a significant improvement in OCR results (an absolute reduction of 15.32% in Character Error Rate (CER) compared to the state-of-the-art method [7]). Furthermore, our method shows high efficiency in inference time and parameter count.…”
Section: Introduction (mentioning)
confidence: 92%
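For reference, Character Error Rate (CER), the OCR metric quoted above, is typically computed as the edit distance between the recognized text and the reference text divided by the reference length. A small self-contained illustration (the strings are made up for the example):

```python
def edit_distance(a: str, b: str) -> int:
    # Standard dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def cer(recognized: str, reference: str) -> float:
    return edit_distance(recognized, reference) / max(len(reference), 1)

# One substituted character over an 18-character reference: CER ≈ 0.056.
print(cer("documant unwarping", "document unwarping"))
```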
“…Therefore, learning-based methods using only a single distorted image are being pursued [10, 11, 12, 42, 43, 44, 45, 46]. Deep learning methods for correcting documents were proposed recently [12, 44, 45, 46], implementing convolutional neural networks, encoder-decoders, and U-Net-based architectures [47]. Work on correcting portrait images used an encoder-decoder architecture [10].…”
Section: Related Work (mentioning)
confidence: 99%
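For readers unfamiliar with the architectures named in this excerpt, here is a minimal U-Net-style encoder-decoder sketch with a single skip connection; the channel counts and depth are illustrative assumptions, not taken from any of the cited papers.

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    def __init__(self, in_ch=3, out_ch=2):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(inplace=True))
        self.down = nn.MaxPool2d(2)
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(inplace=True))
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        # The decoder sees the upsampled features concatenated with the skip connection.
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.Conv2d(16, out_ch, 3, padding=1))

    def forward(self, x):
        s1 = self.enc1(x)              # full-resolution features (skip connection)
        s2 = self.enc2(self.down(s1))  # half-resolution features
        u = self.up(s2)                # back to full resolution
        return self.dec(torch.cat([u, s1], dim=1))

out = MiniUNet()(torch.rand(1, 3, 64, 64))  # e.g. a 2-channel flow/map output
print(out.shape)  # torch.Size([1, 2, 64, 64])
```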