Scene parsing is challenging for unrestricted open vocabulary and diverse scenes. In this paper, we exploit the capability of global context information by different-regionbased context aggregation through our pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). Our global prior representation is effective to produce good quality results on the scene parsing task, while PSPNet provides a superior framework for pixellevel prediction. The proposed approach achieves state-ofthe-art performance on various datasets. It came first in Im-ageNet scene parsing challenge 2016, PASCAL VOC 2012 benchmark and Cityscapes benchmark. A single PSPNet yields the new record of mIoU accuracy 85.4% on PASCAL VOC 2012 and accuracy 80.2% on Cityscapes.
The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet) aiming at boosting information flow in proposal-based instance segmentation framework. Specifically, we enhance the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path between lower layers and topmost feature. We present adaptive feature pooling, which links feature grid and all feature levels to make useful information in each feature level propagate directly to following proposal subnetworks. A complementary branch capturing different views for each proposal is created to further improve mask prediction.These improvements are simple to implement, with subtle extra computational overhead. Our PANet reaches the 1 st place in the COCO 2017 Challenge Instance Segmentation task and the 2 nd place in Object Detection task without large-batch training. It is also state-of-the-art on MVD and Cityscapes. Code is available at https://github. com/ShuLiu1993/PANet.
Abstract.We discuss a few new motion deblurring problems that are significant to kernel estimation and non-blind deconvolution. We found that strong edges do not always profit kernel estimation, but instead under certain circumstance degrade it. This finding leads to a new metric to measure the usefulness of image edges in motion deblurring and a gradient selection process to mitigate their possible adverse effect. We also propose an efficient and high-quality kernel estimation method based on using the spatial prior and the iterative support detection (ISD) kernel refinement, which avoids hard threshold of the kernel elements to enforce sparsity. We employ the TV-1 deconvolution model, solved with a new variable substitution scheme to robustly suppress noise.
Abstract-Complex structures commonly exist in natural images. When an image contains small-scale high-contrast patterns either in the background or foreground, saliency detection could be adversely affected, resulting erroneous and non-uniform saliency assignment. The issue forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. Different from varying patch sizes or downsizing images, we measure region-based scales. The final saliency values are inferred optimally combining all the saliency cues in different scales using hierarchical inference. Through our inference model, single-scale information is selected to obtain a saliency map. Our method improves detection quality on many images that cannot be handled well traditionally. We also construct an extended Complex Scene Saliency Dataset (ECSSD) to include complex but general natural images.
In single image deblurring, the "coarse-to-fine" scheme, i.e. gradually restoring the sharp image on different resolutions in a pyramid, is very successful in both traditional optimization-based methods and recent neural-networkbased approaches. In this paper, we investigate this strategy and propose a Scale-recurrent Network (SRN-DeblurNet) for this deblurring task. Compared with the many recent learning-based approaches in [25], it has a simpler network structure, a smaller number of parameters and is easier to train. We evaluate our method on large-scale deblurring datasets with complex motion. Results show that our method can produce better quality results than state-of-thearts, both quantitatively and qualitatively.
We show in this paper that the success of previous maximum a posterior (MAP) based blur removal methods partly stems from their respective intermediate steps, which implicitly or explicitly create an unnatural representation containing salient image structures. We propose a generalized and mathematically sound L 0 sparse expression, together with a new effective method, for motion deblurring. Our system does not require extra filtering during optimization and demonstrates fast energy decreasing, making a small number of iterations enough for convergence. It also provides a unified framework for both uniform and non-uniform motion deblurring. We extensively validate our method and show comparison with other approaches with respect to convergence speed, running time, and result quality.
Figure 1High quality single image motion-deblurring. The left sub-figure shows one captured image using a hand-held camera under dim light. It is severely blurred by an unknown kernel. The right sub-figure shows our deblurred image result computed by estimating both the blur kernel and the unblurred latent image. We show several close-ups of blurred/unblurred image regions for comparison. AbstractWe present a new algorithm for removing motion blur from a single image. Our method computes a deblurred image using a unified probabilistic model of both blur kernel estimation and unblurred image restoration. We present an analysis of the causes of common artifacts found in current deblurring methods, and then introduce several novel terms within this probabilistic model that are inspired by our analysis. These terms include a model of the spatial randomness of noise in the blurred image, as well a new local smoothness prior that reduces ringing artifacts by constraining contrast in the unblurred image wherever the blurred image exhibits low contrast. Finally, we describe an efficient optimization scheme that alternates between blur kernel estimation and unblurred image restoration until convergence. As a result of these steps, we are able to produce high quality deblurred results in low computation time. We are even able to produce results of comparable quality to techniques that require additional input images beyond a single blurry photograph, and to methods that require additional hardware.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.