Dynamic point clouds (DPCs) are a new media storage format that allows end-users to view objects and scenes in three dimensions (3D). A DPC can be displayed from different angles over time. However, the raw size of a DPC is huge: a single point cloud can contain millions of points (each carrying a color triplet and a location triplet), and a DPC consists of multiple point clouds. Video-based point cloud compression (V-PCC) projects a 3D point cloud onto 2D images: attribute, geometry, and occupancy images. After padding, the 2D images are compressed using the well-established High Efficiency Video Coding (HEVC) standard. In this study, we first use the occupancy image to derive a blocky occupancy flag (BOF) that denotes occupancy information on a block basis. For coding attribute and geometry images, we use the BOF to develop a fast coding unit (CU) algorithm that terminates the CU search recursion early. We also use the geometry images to calculate 2D and 3D information for each pixel, and use the resulting 2D/3D spatial homogeneity of the pixels to design a fast CU decision. In addition, we propose a modified rate-distortion optimization for the different color components that takes into account the picture order count (POC) structure in HEVC/V-PCC. Finally, we propose a BOF-based method for modifying the HEVC input pixels, which reduces the unnecessary information coded for attribute images. Compared with the state-of-the-art fast V-PCC encoding method, the proposed work performs better by up to 2.31% in Bjøntegaard delta bit rate (BDBR) (with a very slight loss of at most 0.38%) and improves time saving by up to 7.84% on two different test datasets.
INDEX TERMS Video-based point cloud compression (V-PCC), dynamic point cloud (DPC), High Efficiency Video Coding (HEVC), fast coding unit (CU) decision algorithm, occupancy map.
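The block-level occupancy idea described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact definition of the BOF or of the early-termination rule, so everything here is an assumption made for illustration: the three flag values `EMPTY`, `MIXED`, and `FULL`, the function names `block_occupancy_flags` and `may_skip_split`, and the rule that a fully unoccupied block needs no further CU splitting.

```python
from typing import List

# Hypothetical flag values; the paper's actual BOF encoding is not given here.
EMPTY, MIXED, FULL = 0, 1, 2


def block_occupancy_flags(occupancy: List[List[int]], block: int) -> List[List[int]]:
    """Summarise a binary occupancy image with one flag per block x block tile."""
    height, width = len(occupancy), len(occupancy[0])
    flags = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Collect the pixels of this tile, clipping at the image border.
            tile = [
                occupancy[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            occupied = sum(1 for value in tile if value)
            if occupied == 0:
                row.append(EMPTY)
            elif occupied == len(tile):
                row.append(FULL)
            else:
                row.append(MIXED)
        flags.append(row)
    return flags


def may_skip_split(flag: int) -> bool:
    """Illustrative rule (an assumption): a block with no occupied pixels
    holds only padding, so the CU search is not recursed further."""
    return flag == EMPTY


occupancy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 1, 1],
]
flags = block_occupancy_flags(occupancy_image, block=2)
```

In this toy 4x4 image with 2x2 blocks, the top-left block is unoccupied, the bottom-left block is partially occupied, and both right-hand blocks are fully occupied. An encoder following this sketch would consult the flag for the block covering a CU before deciding whether to evaluate deeper partitions.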
Video compression is an important procedure for digital applications. A commonly used input format for video coders is YUV420, which is subsampled from YUV444, which is itself obtained by transforming and demosaicking a color filter array (CFA) image. Chroma subsampling is therefore a crucial step for the quality of the reconstructed image. A state-of-the-art method uses the solution of a prior method as its starting point and devises a search method to improve the result. However, the search method does not consider the optimality condition of the problem, and the overall computational complexity is high because the prior method must be executed to obtain the starting point. In the proposed work, the cost function of the optimization problem is analyzed and decomposed into four subchannels based on the CFA format. The optimal line of each subterm is studied in 3-dimensional space. The problem is then mathematically reduced to a 3-subchannel problem. Based on the combinations of the optimal lines, a triangular search area is formed and proved to contain the optimal solution. Because the derived triangular search area is small on average, the search for the optimal value is considerably faster. Experimental results demonstrate that the proposed approach produces reconstructed images whose quality is exactly the same as that obtained with the state-of-the-art method. Relative to the state-of-the-art method, the proposed approach reduces the time complexity by 84.08%-84.91% and the number of search points by 40.24%-85.25% on the Kodak and IMAX image datasets with different demosaicking methods.
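For readers unfamiliar with the subsampling step, the following minimal sketch shows the conventional baseline of reducing a YUV444 chroma plane to YUV420 resolution by averaging each 2x2 block. This is not the method proposed in the abstract, which searches for an optimal chroma value inside a derived triangular area. The function name `subsample_420_average` and the plain nested-list representation of a plane are assumptions made for illustration.

```python
from typing import List


def subsample_420_average(plane: List[List[float]]) -> List[List[float]]:
    """Halve a chroma plane in both directions by averaging each 2x2 block.

    This is the plain averaging baseline, assumed here for illustration;
    it does not implement the optimal search described in the abstract.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("plane dimensions must be even for 4:2:0 subsampling")
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A 2x4 U (or V) plane at full 4:4:4 resolution.
u_plane = [
    [100, 102, 50, 50],
    [104, 106, 50, 50],
]
u_420 = subsample_420_average(u_plane)
```

Each output sample replaces four input samples, so the choice of that single value determines how well the 2x2 block can be reconstructed. That choice is the quantity the abstract's optimization targets.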