Abstract:A new convolution neural network (CNN) architecture for semantic segmentation of high resolution aerial imagery is proposed in this paper. The proposed architecture follows an hourglass-shaped network (HSN) design being structured into encoding and decoding stages. By taking advantage of recent advances in CNN designs, we use the composed inception module to replace common convolutional layers, providing the network with multi-scale receptive areas with rich context. Additionally, in order to reduce spatial ambiguities in the up-sampling stage, skip connections with residual units are also employed to feed forward encoding-stage information directly to the decoder. Moreover, overlap inference is employed to alleviate boundary effects occurring when high resolution images are inferred from small-sized patches. Finally, we also propose a post-processing method based on weighted belief propagation to visually enhance the classification results. Extensive experiments based on the Vaihingen and Potsdam datasets demonstrate that the proposed architectures outperform three reference state-of-the-art network designs both numerically and visually.
Several techniques based on the three-dimensional (3-D) discrete cosine transform (DCT) have been proposed for volumetric data coding. These techniques fail to provide lossless coding coupled with quality and resolution scalability, which is a significant drawback for medical applications. This paper gives an overview of several state-of-the-art 3-D wavelet coders that do meet these requirements and proposes new compression methods exploiting the quadtree and block-based coding concepts, layered zero-coding principles, and context-based arithmetic coding. Additionally, a new 3-D DCT-based coding scheme is designed and used for benchmarking. The proposed wavelet-based coding algorithms produce embedded data streams that can be decoded up to the lossless level and support the desired set of functionality constraints. Moreover, objective and subjective quality evaluation on various medical volumetric datasets shows that the proposed algorithms provide competitive lossy and lossless compression results when compared with the state-of-the-art.
Abstract-In this paper, we describe encryption possibilities for the High Efficiency Video Coding (HEVC) standard under development. Bitstream elements which maintain HEVC compatibility after encryption are listed and their impact on video adaptation is described. From this list, three bitstream elements are selected, namely intra prediction mode difference, motion vector difference sign, and residual sign. These elements provide good protection of the video information and result in 0.0% Bjøntegaard delta bitrate increase because of their equal probability entropy encoding property.
When compared to conventional 2-D video, multiview video can significantly enhance the visual 3-D experience in 3-D applications by offering horizontal parallax. However, when processing images originating from different views, it is common that the colors between the different cameras are not well-calibrated. To solve this problem, a novel energy function-based color correction method for multiview camera setups is proposed to enforce that colors are as close as possible to those in the reference image but also that the overall structural information is well-preserved. The proposed system introduces a spatio-temporal correspondence matching method to ensure that each pixel in the input image gets bijectively mapped to a reference pixel. By combining this mapping with the original structural information, we construct a global optimization algorithm in a Laplacian matrix formulation and solve it using a sparse matrix solver. We further introduce a novel forward-reverse objective evaluation model to overcome the problem of lack of ground truth in this field. The visual comparisons are shown to outperform state-of-the-art multiview color correction methods, while the objective evaluation reports PSNR gains of up to 1.34 dB and SSIM gains of up to 3.2%, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.