This paper provides a technical overview of a deep-learning-based encoder method aiming at optimizing next generation hybrid video encoders for driving the block partitioning in intra slices. An encoding approach based on Convolutional Neural Networks is explored to partly substitute classical heuristics-based encoder speed-ups by a systematic and automatic process. The solution allows controlling the trade-off between complexity and coding gains, in intra slices, with one single parameter. This algorithm was proposed at the Call for Proposals of the Joint Video Exploration Team (JVET) on video compression with capability beyond HEVC. In All Intra configuration, for a given allowed topology of splits, a speed-up of ×2 is obtained without BD-rate loss, or a speed-up above ×4 with a loss below 1% in BD-rate.
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L'archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d'enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
This paper deals with video coding of static scenes viewed by a moving camera. We propose an automatic way to encode such video sequences using several 3D models. Contrary to prior art in model-based coding where 3D models have to be known, the 3D models are automatically computed from the original video sequence. We show that several independent 3D models provide the same functionalities as one single 3D model, and avoid some drawbacks of the previous approaches. To achieve this goal we propose a novel algorithm of sliding adjustment, which ensures consistency of successive 3D models. The paper presents a method to automatically extract the set of 3D models and associate camera positions. The obtained representation can be used for reconstructing the original sequence, or virtual ones. It also enables 3D functionalities such as synthetic object insertion, lightning modification, or stereoscopic visualization. Results on real video sequences are presented.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.