The rate-distortion optimization (RDO) framework for video coding achieves a tradeoff between bit-rate and quality. However, objective distortion metrics such as mean squared error traditionally used in this framework are poorly correlated with perceptual quality. We address this issue by proposing an approach that incorporates the structural similarity index as a quality metric into the framework. In particular, we develop a predictive Lagrange multiplier estimation method to resolve the chicken-and-egg dilemma of perceptual-based RDO and apply it to H.264 intra and inter mode decision. Given a perceptual quality level, the resulting video encoder achieves on average a 9% bit-rate reduction for intra-frame coding and 11% for inter-frame coding over the JM reference software. Subjective testing further confirms that, at the same bit-rate, the proposed perceptual RDO preserves image details and prevents blocking artifacts better than traditional RDO.
Index Terms: H.264, Lagrange multiplier, perceptual quality, rate-distortion optimization, structural similarity index, video codec.
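As a rough illustration of the Lagrangian mode decision summarized in the abstract above, the following minimal sketch picks the coding mode that minimizes J = D + λR, taking D = 1 - SSIM as the perceptual distortion. This is a sketch under stated assumptions, not the authors' implementation: the single-window `block_ssim`, the `best_mode` helper, the choice D = 1 - SSIM, and the candidate format are all hypothetical, and λ is simply passed in rather than predicted as the abstract describes.

```python
# Standard SSIM stabilizing constants for 8-bit samples.
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def block_ssim(x, y):
    """Single-window SSIM between two equally sized flat lists of samples."""
    mx, my = _mean(x), _mean(y)
    vx = _mean((a - mx) ** 2 for a in x)
    vy = _mean((b - my) ** 2 for b in y)
    cxy = _mean((a - mx) * (b - my) for a, b in zip(x, y))
    num = (2 * mx * my + C1) * (2 * cxy + C2)
    den = (mx ** 2 + my ** 2 + C1) * (vx + vy + C2)
    return num / den


def best_mode(original, candidates, lam):
    """Return the mode name minimizing J = (1 - SSIM) + lam * rate.

    candidates maps a (hypothetical) mode name to a tuple
    (reconstructed_samples, rate_in_bits). lam is assumed to be given;
    the predictive estimation of lam is not modeled here.
    """
    def cost(item):
        _, (recon, rate) = item
        return (1.0 - block_ssim(original, recon)) + lam * rate

    return min(candidates.items(), key=cost)[0]


original = [10, 20, 30, 40]
candidates = {
    "detailed": ([10, 20, 30, 40], 100),  # perfect reconstruction, expensive
    "flat": ([25, 25, 25, 25], 10),       # structure lost, cheap
}
quality_first = best_mode(original, candidates, lam=0.0)
rate_first = best_mode(original, candidates, lam=1.0)
```

With λ = 0 only the SSIM term matters, so the perfect reconstruction wins; a larger λ shifts the decision toward the cheaper mode, which is the tradeoff the Lagrange multiplier controls.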
The framework of rate-distortion optimization (RDO) has been widely adopted for video coding to achieve a good trade-off between bit-rate and distortion. However, objective distortion metrics such as mean squared error traditionally used in this framework are poorly correlated with perceptual video quality. To address this issue, we incorporate the structural similarity index as a quality metric into the framework and develop a predictive Lagrange multiplier selection technique to resolve the chicken-and-egg dilemma of perceptual-based RDO. The resulting perceptual-based RDO is then applied to H.264 intra mode decision to illustrate the proposed technique. Given a perceptual quality level, the encoder achieves a 5%-10% bit-rate reduction over the JM reference software of H.264. Subjective evaluation further confirms that, at the same bit-rate, the proposed perceptual RDO preserves image details and prevents blocking artifacts better than traditional RDO.
The spatial-domain intra prediction scheme of H.264 has high computational complexity, especially for the High Profile, which incorporates the additional intra 8×8 prediction mode. To address this issue, we explore the hierarchy of the H.264 mode decision process in this paper and adopt an approach that follows this hierarchy. In particular, we propose a variance-based algorithm for block size decision, an improved filter-based algorithm for prediction mode decision using contextual information, and a selection algorithm for intra block decision that exploits the relation between the rate-distortion characteristic and the best coding type. A performance comparison shows the improvement of the proposed algorithms over previous methods.
Index Terms: H.264/advanced video coding (AVC), intra prediction, rate-distortion optimization.
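The idea of a variance-based block size decision can be sketched as follows: a smooth macroblock is likely to be coded well with a large block, so smaller block sizes can be skipped before the costly rate-distortion search. This is only an illustrative sketch; the function names, the two thresholds, and the mapping from variance to candidate sizes are assumptions and are not taken from the paper.

```python
def variance(samples):
    """Population variance of a flat list of samples."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / n


def candidate_block_sizes(macroblock, low=100.0, high=1000.0):
    """Return the intra block sizes worth evaluating for a 16x16 macroblock.

    macroblock is a list of 16 rows of 16 luma samples. The thresholds
    low and high are hypothetical and would need tuning in practice.
    """
    flat = [s for row in macroblock for s in row]
    v = variance(flat)
    if v < low:
        return [16]      # smooth content: large block only
    if v < high:
        return [16, 8]   # moderate detail: skip the 4x4 search
    return [8, 4]        # high detail: skip the 16x16 search


smooth = [[128] * 16 for _ in range(16)]
checker = [[255 if (r + c) % 2 else 0 for c in range(16)] for r in range(16)]
smooth_sizes = candidate_block_sizes(smooth)
checker_sizes = candidate_block_sizes(checker)
```

Pruning candidates this way trades a small risk of missing the best block size for fewer full rate-distortion evaluations per macroblock.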