Hongwei Yong scite author profile

Most of the existing learning-based single image super-resolution (SISR) methods are trained and evaluated on simulated datasets, where the low-resolution (LR) images are generated by applying a simple and uniform degradation (i.e., bicubic downsampling) to their high-resolution (HR) counterparts. However, the degradations in real-world LR images are far more complicated. As a consequence, the SISR models trained on simulated data become less effective when applied to practical scenarios. In this paper, we build a real-world super-resolution (RealSR) dataset where paired LR-HR images on the same scene are captured by adjusting the focal length of a digital camera. An image registration algorithm is developed to progressively align the image pairs at different resolutions. Considering that the degradation kernels are naturally non-uniform in our dataset, we present a Laplacian pyramid based kernel prediction network (LP-KPN), which efficiently learns per-pixel kernels to recover the HR image. Our extensive experiments demonstrate that SISR models trained on our RealSR dataset deliver better visual quality with sharper edges and finer textures on real-world scenes than those trained on simulated datasets. Though our RealSR dataset is built by using only two cameras (Canon 5D3 and Nikon D810), the trained model generalizes well to other camera devices such as Sony a7II and mobile phones.

show abstract

Waterloo Exploration Database: New Challenges for Image Quality Assessment Models

Duanmu

et al. 2017

IEEE Trans. on Image Process.

569

264

View full text Add to dashboard Cite

The great content diversity of real-world digital images poses a grand challenge to image quality assessment (IQA) models, which are traditionally designed and validated on a handful of commonly used IQA databases with very limited content variation. To test the generalization capability and to facilitate the wide usage of IQA techniques in real-world applications, we establish a large-scale database named the Waterloo Exploration Database, which in its current state contains 4744 pristine natural images and 94 880 distorted images created from them. Instead of collecting the mean opinion score for each image via subjective testing, which is extremely difficult if not impossible, we present three alternative test criteria to evaluate the performance of IQA models, namely, the pristine/distorted image discriminability test, the listwise ranking consistency test, and the pairwise preference consistency test (P-test). We compare 20 well-known IQA models using the proposed criteria, which not only provide a stronger test in a more challenging testing environment for existing models, but also demonstrate the additional benefits of using the proposed database. For example, in the P-test, even for the best performing no-reference IQA model, more than 6 million failure cases against the model are "discovered" automatically out of over 1 billion test pairs. Furthermore, we discuss how the new database may be exploited using innovative approaches in the future, to reveal the weaknesses of existing IQA models, to provide insights on how to improve the models, and to shed light on how the next-generation IQA models may be developed. The database and codes are made publicly available at: https://ece.uwaterloo.ca/~k29ma/exploration/.

show abstract

Robust Multi-Exposure Image Fusion: A Structural Patch Decomposition Approach

Yong

et al. 2017

IEEE Trans. on Image Process.

292

243

View full text Add to dashboard Cite

We propose a simple yet effective structural patch decomposition approach for multi-exposure image fusion (MEF) that is robust to ghosting effect. We decompose an image patch into three conceptually independent components: signal strength, signal structure, and mean intensity. Upon fusing these three components separately, we reconstruct a desired patch and place it back into the fused image. This novel patch decomposition approach benefits MEF in many aspects. First, as opposed to most pixel-wise MEF methods, the proposed algorithm does not require post-processing steps to improve visual quality or to reduce spatial artifacts. Second, it handles RGB color channels jointly, and thus produces fused images with more vivid color appearance. Third and most importantly, the direction of the signal structure component in the patch vector space provides ideal information for ghost removal. It allows us to reliably and efficiently reject inconsistent object motions with respect to a chosen reference image without performing computationally expensive motion estimation. We compare the proposed algorithm with 12 MEF methods on 21 static scenes and 12 deghosting schemes on 19 dynamic scenes (with camera and object motion). Extensive experimental results demonstrate that the proposed algorithm not only outperforms previous MEF algorithms on static scenes but also consistently produces high quality fused images with little ghosting artifacts for dynamic scenes. Moreover, it maintains a lower computational cost compared with the state-of-the-art deghosting schemes.

show abstract

Robust Online Matrix Factorization for Dynamic Background Subtraction

Yong¹,

Meng²,

Zuo

et al. 2018

IEEE Trans. Pattern Anal. Mach. Intell.

143

View full text Add to dashboard Cite

Abstract-We propose an effective online background subtraction method, which can be robustly applied to practical videos that have variations in both foreground and background. Different from previous methods which often model the foreground as Gaussian or Laplacian distributions, we model the foreground for each frame with a specific mixture of Gaussians (MoG) distribution, which is updated online frame by frame. Particularly, our MoG model in each frame is regularized by the learned foreground/background knowledge in previous frames. This makes our online MoG model highly robust, stable and adaptive to practical foreground and background variations. The proposed model can be formulated as a concise probabilistic MAP model, which can be readily solved by EM algorithm. We further embed an affine transformation operator into the proposed model, which can be automatically adjusted to fit a wide range of video background transformations and make the method more robust to camera movements. With using the sub-sampling technique, the proposed method can be accelerated to execute more than 250 frames per second on average, meeting the requirement of real-time background subtraction for practical video processing tasks. The superiority of the proposed method is substantiated by extensive experiments implemented on synthetic and real videos, as compared with state-of-the-art online and offline background subtraction methods.

show abstract

Gradient Centralization: A New Optimization Technique for Deep Neural Networks

et al. 2020

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Hongwei Yong

Toward Real-World Single Image Super-Resolution: A New Benchmark and a New Model

Waterloo Exploration Database: New Challenges for Image Quality Assessment Models

Robust Multi-Exposure Image Fusion: A Structural Patch Decomposition Approach

Robust Online Matrix Factorization for Dynamic Background Subtraction

Gradient Centralization: A New Optimization Technique for Deep Neural Networks

Contact Info

Product

Resources

About