Deep perceptual preprocessing has recently emerged as a new way to enable further bitrate savings across several generations of video encoders without breaking standards or requiring any changes in client devices. In this paper, we lay the foundations toward a generalized psychovisual preprocessing framework for video encoding and describe one of its promising instantiations that is practically deployable for video-on-demand, live, gaming and user-generated content. Results using state-of-the-art AVC, HEVC and VVC encoders show that average bitrate (BD-rate) gains of 11% to 17% are obtained over three state-of-the-art reference-based quality metrics (Netflix VMAF, SSIM and Apple AVQT), as well as the recently proposed no-reference ITU-T P.1204 metric. On CPU, the proposed framework is shown to run twice as fast as x264 medium-preset encoding. On GPU hardware, our approach achieves 714 fps for 1080p video (below 2 ms/frame), thereby enabling its use in very-low-latency live video or game streaming applications.
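To illustrate where such a preprocessor sits in the pipeline, the sketch below applies a per-frame filter before handing frames to any standards-compliant encoder; the decoder side is untouched, which is why no client changes are needed. This is a minimal illustration, not the paper's method: the real preprocessor is a learned deep network, whereas the `perceptual_preprocess` stand-in here is just a fixed unsharp-mask filter, and `encode_stream` is a hypothetical name for the hand-off point to an encoder such as x264.

```python
import numpy as np

def perceptual_preprocess(frame: np.ndarray) -> np.ndarray:
    """Stand-in for a learned psychovisual filter (here: a mild unsharp mask).

    A real deep preprocessor would be a trained CNN; this placeholder only
    shows the stage's position in the pipeline. `frame` is a 2-D uint8
    luma plane for simplicity.
    """
    # 3x3 box blur built from shifted sums; borders handled by edge padding.
    padded = np.pad(frame.astype(np.float32), 1, mode="edge")
    h, w = frame.shape
    blur = sum(
        padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ) / 9.0
    # Unsharp mask: re-weight detail ahead of the (unmodified) encoder.
    out = frame + 0.5 * (frame - blur)
    return np.clip(out, 0, 255).astype(np.uint8)

def encode_stream(frames):
    """Preprocess each frame, then pass the result to any standard encoder
    (AVC/HEVC/VVC) unchanged -- clients decode a fully compliant bitstream."""
    return [perceptual_preprocess(f) for f in frames]
```

Because the output is an ordinary frame of the same size and bit depth, the downstream encoder and every client device remain oblivious to the preprocessing stage.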