Conventional multi-view stereo (MVS) approaches based on photo-consistency measures are generally robust, yet often fail to compute valid depth estimates in low-textured areas of the scene. In this study, a novel approach is proposed to tackle this challenge by integrating semantic priors into a PatchMatch-based MVS pipeline in order to increase confidence and support depth and normal map estimation. Semantic class labels on image pixels are used to impose class-specific geometric constraints during multi-view stereo, optimising depth estimation in weakly supported, textureless areas commonly present in urban scenarios of building facades, indoor scenes, and aerial datasets. By detecting dominant shapes, e.g., planes, with RANSAC, an adjusted cost function is introduced that combines and weighs both photometric and semantic scores, thus propagating more accurate depth estimates. Being adaptive, it fills in apparent information gaps and smooths local roughness in problematic regions while preserving important details. Experiments on benchmark and custom datasets demonstrate the effectiveness of the presented approach.
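The adjusted cost function described above can be sketched as follows. This is a minimal illustration, assuming a simple linear blend of the two scores; the function names, the weighting scheme, and the fallback rule are assumptions for illustration, not the authors' exact formulation:

```python
# Hypothetical sketch: blending a photometric matching cost with a
# semantic prior term, as in a semantically guided PatchMatch MVS.
# All names and the weighting scheme below are illustrative assumptions.

def combined_cost(photo_cost, plane_distance, has_planar_prior, w_sem=0.5):
    """Blend photometric and semantic scores for one depth hypothesis.

    photo_cost       -- photo-consistency cost in [0, 1] (lower is better)
    plane_distance   -- distance of the hypothesised 3D point to the
                        dominant plane fitted (e.g. by RANSAC) for the
                        pixel's semantic class, normalised to [0, 1]
    has_planar_prior -- True if the pixel's label supports a planar prior
    w_sem            -- weight of the semantic term (assumed value)
    """
    if not has_planar_prior:
        # No reliable prior: fall back to pure photo-consistency.
        return photo_cost
    return (1.0 - w_sem) * photo_cost + w_sem * plane_distance


# In a textureless region the photometric cost is uninformative (high),
# but a hypothesis lying on the fitted plane is rewarded:
on_plane = combined_cost(photo_cost=0.9, plane_distance=0.05,
                         has_planar_prior=True)
off_plane = combined_cost(photo_cost=0.9, plane_distance=0.80,
                          has_planar_prior=True)
print(on_plane < off_plane)  # the on-plane hypothesis wins
```

The key design point is that the semantic term only takes effect where a class label licenses a geometric prior, so well-textured regions keep their purely photometric behaviour.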
The paper investigates the novel concept of local-error control in mesh geometry encoding. In contrast to traditional mesh-coding systems that use the mean-square error as target distortion metric, this paper proposes a new L-infinite mesh-coding approach, for which the target distortion metric is the L-infinite distortion. In this context, a novel wavelet-based L-infinite-constrained coding approach for meshes is proposed, which ensures that the maximum error between the vertex positions in the original and decoded meshes is lower than a given upper bound. Furthermore, the proposed system achieves scalability in the L-infinite sense, that is, any decoding of the input stream will correspond to a perfectly predictable L-infinite distortion upper bound. An instantiation of the proposed L-infinite-coding approach is demonstrated for MESHGRID, which is a scalable 3D object encoding system, part of MPEG-4 AFX. In this context, the advantages of scalable L-infinite coding over L-2-oriented coding are experimentally demonstrated. One concludes that the proposed L-infinite mesh-coding approach guarantees an upper bound on the local error in the decoded mesh, enables a fast real-time implementation of the rate allocation, and preserves all the scalability features and animation capabilities of the employed scalable mesh codec.
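The distinction between the two target metrics can be sketched as follows. This is a pure-Python illustration on raw vertex positions, assuming a fixed vertex correspondence; the actual codec operates on wavelet coefficients, and the function names here are not from the paper:

```python
# Sketch of the two distortion metrics contrasted above: the mean-square
# (L-2) error versus the L-infinite error (maximum absolute difference)
# between original and decoded vertex positions.

def l2_distortion(orig, dec):
    """Mean squared error over all vertex coordinates."""
    n = sum(len(v) for v in orig)
    return sum((a - b) ** 2
               for vo, vd in zip(orig, dec)
               for a, b in zip(vo, vd)) / n

def linf_distortion(orig, dec):
    """Maximum absolute coordinate error: the quantity the L-infinite
    coding approach keeps below a given upper bound."""
    return max(abs(a - b)
               for vo, vd in zip(orig, dec)
               for a, b in zip(vo, vd))

orig = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
dec = [(0.0, 0.0, 0.1), (1.0, 1.0, 1.0)]  # one vertex off by 0.1

print(l2_distortion(orig, dec))    # small averaged error
print(linf_distortion(orig, dec))  # 0.1 -- the local error bound
```

Averaging in the L-2 sense can hide a large local displacement on a single vertex, which is exactly what an L-infinite bound prevents.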
This paper proposes a new approach for joint source and channel coding (JSCC) of meshes, simultaneously providing scalability and optimized resilience against transmission errors. An unequal error protection approach is followed, to cope with the different error-sensitivity levels characterizing the various resolution and quality layers produced by the input scalable source codec. The number of layers and the protection levels to be employed for each layer are determined by solving a joint source and channel coding problem. In this context, a novel fast algorithm for solving the optimization problem is conceived, enabling a real-time implementation of the JSCC rate-allocation. An instantiation of the proposed JSCC approach is demonstrated for MeshGrid, which is a scalable 3-D object representation method, part of MPEG-4 AFX. In this context, the L-infinite distortion metric is employed, which is to our knowledge a unique feature in mesh coding. Numerical results show the superiority of the L-infinite norm over the classical L-2 norm in a JSCC setting. One concludes that the proposed joint source and channel coding approach offers resilience against transmission errors, provides graceful degradation, enables a fast real-time implementation, and preserves all the scalability features and animation capabilities of the employed scalable mesh codec.

Index Terms-Error resilient coding, joint source and channel coding of meshes, L-infinite coding, MeshGrid, three-dimensional (3-D) graphics, unequal error protection.
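The unequal-error-protection idea can be sketched as follows. This is a deliberately simplified greedy rule, assuming per-layer source sizes and a fixed menu of channel-code rates; it is not the paper's fast optimization algorithm, which solves a joint rate-allocation problem:

```python
# Minimal illustrative sketch of unequal error protection: earlier
# (more error-sensitive) scalable layers get stronger channel codes.
# Layer sizes, code rates, and the greedy rule are assumptions.

def allocate_protection(layer_bits, code_rates, budget):
    """Greedily pick the strongest affordable code per layer, in order.

    layer_bits -- source bits per scalable layer, base layer first
    code_rates -- available code rates, strongest (lowest rate) first
    budget     -- total channel-bit budget
    Returns the chosen code rate per transmitted layer.
    """
    choice, used = [], 0.0
    for bits in layer_bits:
        picked = None
        for r in code_rates:        # try strongest protection first
            cost = bits / r         # channel bits = source bits / rate
            if used + cost <= budget:
                picked = r
                used += cost
                break
        if picked is None:          # layer does not fit: truncate stream
            break
        choice.append(picked)
    return choice

# The base layer affords the strongest code; the next layer is sent
# with weaker protection, and the last one is dropped entirely:
print(allocate_protection([100, 100, 100], [0.5, 0.75, 1.0], 350))
```

Dropping refinement layers rather than corrupting the base layer is what yields the graceful degradation mentioned above.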
The heterogeneous nature of modern communications stems from the need of transmitting digital information through various types of mediums to a large variety of end-user terminals. In this context, simultaneously providing a scalable source representation and resilience against transmission errors is of primary importance. MESHGRID, which is part of the MPEG-4 AFX standard, is a scalable 3D object representation method especially designed to address the heterogeneous nature of networks and clients in modern communication systems. A MESHGRID object comprises one or several surface layers attached to and located within a volumetric reference-grid. In this paper we focus on the error-resilience aspects of MESHGRID and propose a novel approach for scalable error-resilient coding of MESHGRID's reference-grid. An unequal error protection approach is followed, to account for the different error-sensitivity levels characterizing the various resolution and quality layers produced by the reference-grid coder. The code rates to be employed for each layer are determined by solving a joint source and channel coding problem. The L-infinite distortion metric is employed instead of the classical L-2 norm, typically used in case of images and video. In this context, a novel fast algorithm for solving the optimization problem is proposed. The proposed approach allows for real-time implementations. The experimental results demonstrate the benefits brought by error resilient coding of the reference grid. We conclude that the proposed approach offers resilience against transmission errors while preserving all the scalability features and animation capabilities that characterize MESHGRID.
Abstract-Our recently proposed wavelet-based L-infinite-constrained coding approach for meshes ensures that the maximum error between the vertex positions in the original and decoded meshes is lower than a given upper bound. Instantiations of both L-2 and L-infinite coding approaches are demonstrated for MESHGRID, which is a scalable 3D object encoding system, part of MPEG-4 AFX. In this survey paper, we compare the novel L-infinite distortion estimator against the L-2 distortion estimator which is typically employed in 3D mesh coding systems. In addition, we show that, under certain conditions, the L-infinite estimator can be exploited to approximate the Hausdorff distance in real-time implementations.

Index Terms-Distortion metric, L-infinite, L-2, MAXAD, MSE, Hausdorff, 3D mesh coding

I. INTRODUCTION

The diversification of content and the increasing demand for mobility have led to a proliferation of heterogeneous terminals with diverse capabilities. Efficient storage and transmission of digital data is therefore a critical problem, which can be solved by compressing the original data based on some predefined criteria. There is a broad range of applications (e.g. in the medical area) where compact coding cannot come at the expense of information loss. A viable solution in this case is given by lossless coding, possibly coupled with multi-functionality support, such as scalability and progressive (lossy-to-lossless) reconstruction of the input data. Lossless coding is hampered, however, by the fairly low achievable lossless compression ratios. There are other applications, such as those in the field of remote sensing, where one can accept information loss in favor of higher compression ratios, provided that the distortions incurred are rigorously bounded.
In such applications, lossy or near-lossless compression is suitable, but an appropriate distortion measure needs to be employed in order to accurately quantify and control the distortion incurred by the compression system. The ideal distortion metric in lossy coding of meshes is the Hausdorff distance, as this metric expresses the maximum local error between the original and decoded meshes. However, computing the Hausdorff distance requires considerable processing power and memory, in particular for high-resolution meshes. This becomes even more critical in scalable mesh coding systems, where, in order to allocate rate, one needs to determine the Hausdorff distance for all possible
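The relation between the two quantities can be illustrated as follows. This is a brute-force, pure-Python sketch and not the paper's estimator: with a fixed vertex correspondence, the per-vertex L-infinite error (MAXAD) costs O(n) to evaluate and upper-bounds the Hausdorff distance, which here requires an O(n²) nearest-point search:

```python
# Illustrative contrast between the per-vertex L-infinite error (MAXAD)
# and the (vertex-sampled, Chebyshev-metric) Hausdorff distance.

def maxad(orig, dec):
    """Maximum absolute displacement between corresponding vertices."""
    return max(max(abs(a - b) for a, b in zip(vo, vd))
               for vo, vd in zip(orig, dec))

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two vertex sets, using the
    Chebyshev point distance to match maxad (brute force)."""
    def d(p, q):
        return max(abs(a - b) for a, b in zip(p, q))
    def one_sided(X, Y):
        return max(min(d(p, q) for q in Y) for p in X)
    return max(one_sided(A, B), one_sided(B, A))

# Swapping two vertices leaves the point set (and thus the Hausdorff
# distance) unchanged, while MAXAD reports the full displacement:
orig_v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dec_v = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(maxad(orig_v, dec_v))      # 1.0
print(hausdorff(orig_v, dec_v))  # 0.0
```

Because the minimum over all correspondences can only shrink the error, the cheap MAXAD value is always a safe (if sometimes loose) stand-in for the Hausdorff distance in a rate-allocation loop.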