Recently, very deep convolutional neural networks (CNNs) have attracted considerable attention in image restoration. However, as depth grows, the long-term dependency problem is rarely realized for these very deep models, so the prior states/layers have little influence on the subsequent ones. Motivated by the fact that human thoughts have persistency, we propose a very deep persistent memory network (MemNet) that introduces a memory block, consisting of a recursive unit and a gate unit, to explicitly mine persistent memory through an adaptive learning process. The recursive unit learns multi-level representations of the current state under different receptive fields. These representations and the outputs of the previous memory blocks are concatenated and sent to the gate unit, which adaptively controls how much of the previous states should be reserved and decides how much of the current state should be stored. We apply MemNet to three image restoration tasks, i.e., image denoising, super-resolution and JPEG deblocking. Comprehensive experiments demonstrate the necessity of MemNet and its unanimous superiority over the state of the art on all three tasks. Code is available at https://github.com/tyshiwo/MemNet.
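The data flow through a memory block can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the shared-weight "convolution" is simplified to a per-pixel linear map, the weights are random rather than learned, and all shapes and names (`recursive_unit`, `gate_unit`) are illustrative assumptions. It only shows the wiring: the recursive unit produces multi-level representations, which are concatenated with previous blocks' outputs and fused by the gate unit.

```python
import numpy as np

rng = np.random.default_rng(0)

def recursive_unit(x, weights, num_recursions=3):
    """Apply one shared-weight mapping recursively; each recursion's
    output is kept as one level of the multi-level representation.
    (Per-pixel linear map + ReLU stands in for a shared conv layer.)"""
    levels, state = [], x
    for _ in range(num_recursions):
        state = np.maximum(state @ weights, 0.0)  # same weights every recursion
        levels.append(state)
    return levels

def gate_unit(levels, prev_memories, gate_weights):
    """A 1x1-conv-like map over the concatenation of the current
    multi-level representations and the previous blocks' outputs;
    the learned weights decide how much of each source to keep."""
    concat = np.concatenate(levels + prev_memories, axis=-1)
    return concat @ gate_weights

# Toy feature map: 8x8 spatial grid, 4 channels.
x = rng.standard_normal((8, 8, 4))
w_rec = rng.standard_normal((4, 4)) * 0.1
levels = recursive_unit(x, w_rec)             # 3 levels, 4 channels each
prev = [x]                                    # output of the "previous" block
w_gate = rng.standard_normal((16, 4)) * 0.1   # (3 + 1) * 4 = 16 concatenated channels
out = gate_unit(levels, prev, w_gate)
print(out.shape)  # (8, 8, 4)
```

In the actual network the gate unit's weights are trained end-to-end, which is what makes the retention of long-term versus short-term memory adaptive rather than fixed.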
With this mathematical model, extensive studies have been conducted on many image restoration tasks, e.g., image denoising [2,5,9,37], single-image super-resolution (SISR) [15,38] and JPEG deblocking [18,26]. As three classical image restoration tasks, image denoising aims to recover a clean image from a noisy observation, which commonly assumes additive white Gaussian noise with a standard deviation σ; single-image super-resolution recovers a high-resolution (HR) image from a low-resolution (LR) image; and JPEG deblocking removes the blocking artifacts caused by JPEG compression [7].

Recently, owing to their powerful learning ability, very deep convolutional neural networks (CNNs) have been widely used to tackle image restoration tasks. Kim et al. construct a 20-layer CNN structure named VDSR for SISR [20] and adopt residual learning to ease training difficulty. To control the parameter number of very deep models, the authors further introduce a recursive layer and propose a Deeply-Recursive Convolutional Network (DRCN) [21]. To mitigate training difficulty, Mao et al. [27] introduce symmetric skip connections into a 30-layer convolutional auto-encoder.
[Figure 1: Visual results of different super-resolution methods on scale factor 8. Methods compared include Bicubic, VDSR, URDGN, SRResNet, FSRNet (Ours) and FSRGAN (Ours), with PSNR/SSIM reported for each result.]

Abstract. Face Super-Resolution (SR) is a domain-specific super-resolution problem. Specific facial prior knowledge can be leveraged to better super-resolve face images. We present a novel deep end-to-end trainable Face Super-Resolution Network (FSRNet), which makes full use of geometry priors, i.e., facial landmark heatmaps and parsing maps, to super-resolve very low-resolution (LR) face images without requiring well-aligned inputs. Specifically, we first construct a coarse SR network to recover a coarse high-resolution (HR) image. The coarse HR image is then sent to two branches: a fine SR encoder, which extracts image features, and a prior information estimation network, which estimates landmark heatmaps and parsing maps. Both the image features and the prior information are sent to a fine SR decoder to recover the HR image. To further generate realistic faces, we propose the Face Super-Resolution Generative Adversarial Network (FSRGAN), which incorporates an adversarial loss into FSRNet. Moreover, we introduce two related tasks, face alignment and parsing, as new evaluation metrics for face SR, which address the inconsistency of classic metrics w.r.t. visual perception. Extensive benchmark experiments show that FSRNet and FSRGAN significantly outperform the state of the art for very LR face SR, both quantitatively and qualitatively. Code will be made available upon publication.
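The two-branch pipeline described above can be sketched as a shape-level data-flow in NumPy. This is a toy stand-in, not FSRNet itself: nearest-neighbour upsampling replaces the learned coarse SR network, per-pixel linear maps replace the convolutional encoder/decoder, and the channel counts (8 feature channels, 11 prior channels) are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def coarse_sr_net(lr):
    # Nearest-neighbour x8 upsampling stands in for the coarse SR CNN.
    return lr.repeat(8, axis=0).repeat(8, axis=1)

def fine_sr_encoder(img, w):
    # Extracts image features (per-pixel linear map + ReLU stand-in).
    return np.maximum(img @ w, 0.0)

def prior_net(img, w):
    # Estimates facial priors: landmark heatmaps and parsing maps.
    return np.maximum(img @ w, 0.0)

def fine_sr_decoder(feats, priors, w):
    # Fuses image features with the estimated priors to decode the HR image.
    fused = np.concatenate([feats, priors], axis=-1)
    return fused @ w

lr = rng.standard_normal((16, 16, 3))        # 16x16 LR face
coarse = coarse_sr_net(lr)                   # 128x128 coarse HR estimate
w_enc = rng.standard_normal((3, 8)) * 0.1    # 8 image-feature channels (assumed)
w_pri = rng.standard_normal((3, 11)) * 0.1   # 11 prior channels (assumed)
w_dec = rng.standard_normal((19, 3)) * 0.1   # 8 + 11 fused channels -> RGB
feats = fine_sr_encoder(coarse, w_enc)
priors = prior_net(coarse, w_pri)
hr = fine_sr_decoder(feats, priors, w_dec)
print(coarse.shape, hr.shape)  # (128, 128, 3) (128, 128, 3)
```

The key design point visible even at this level is that the prior branch operates on the coarse HR image rather than the raw LR input, so landmarks and parsing maps are estimated from a cleaner, already-upsampled face.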
In this paper, we propose a novel face detection network with three novel contributions that address three key aspects of face detection: better feature learning, progressive loss design, and anchor-assignment-based data augmentation. First, we propose a Feature Enhance Module (FEM) to enhance the original feature maps, extending the single-shot detector to a dual-shot detector. Second, we adopt a Progressive Anchor Loss (PAL), computed on two different sets of anchors, to effectively facilitate the features. Third, we use Improved Anchor Matching (IAM), which integrates a novel anchor assignment strategy into data augmentation.