Because most 2D videos are captured from 3D scenes, developing 3D motion models for video coding is beneficial. This paper introduces a method that extracts 3D geometry data from 2D videos and synthesizes 3D-based virtual reference pictures (RPs). These novel RPs are offered to the Versatile Video Coding (VVC) encoder for use in motion compensation. The proposed method generates the 3D geometry data as 3D meshes at the encoder and transmits it to the decoder as an overhead bitstream. This overhead, however, can cancel the entire coding gain or even leave the result worse than the VVC anchor. The paper addresses this problem with 3D mesh processing techniques such as mesh denoising, mesh decimation, and mesh compression. Simulation results show that the proposed method outperforms VVC by up to ~3.8%.