Deep perceptual preprocessing has recently emerged as a way to obtain further bitrate savings across several generations of video encoders without breaking standards or requiring any changes in client devices. In this paper, we lay the foundations for a generalized psychovisual preprocessing framework for video encoding and describe one of its promising instantiations, which is practically deployable for video-on-demand, live, gaming and user-generated content. Results using state-of-the-art AVC, HEVC and VVC encoders show average bitrate savings (BD-rate gains) of 11% to 17% under three state-of-the-art reference-based quality metrics (Netflix VMAF, SSIM and Apple AVQT), as well as under the recently proposed no-reference ITU-T P.1204 metric. On CPU, the proposed framework is shown to run twice as fast as x264 encoding with the medium preset. On GPU hardware, our approach achieves 714 fps for 1080p video (below 2 ms/frame), thereby enabling its use in very-low-latency live video or game streaming applications.
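As a quick check on the GPU figure quoted above, the short sketch below converts the reported throughput into an average per-frame time and compares it with a real-time frame budget. The 60 fps stream rate and the names `frame_time_ms`, `REPORTED_FPS` and `ASSUMED_STREAM_FPS` are illustrative assumptions, not values or code from the paper. The sketch also treats throughput as if it were per-frame latency, which holds only when frames are processed one at a time.

```python
def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput reported in the abstract for 1080p video on GPU.
REPORTED_FPS = 714.0

# Hypothetical live-stream frame rate, chosen only for illustration.
ASSUMED_STREAM_FPS = 60.0

preprocess_ms = frame_time_ms(REPORTED_FPS)
budget_ms = frame_time_ms(ASSUMED_STREAM_FPS)

# Fraction of each frame interval that preprocessing would consume.
budget_share = preprocess_ms / budget_ms

print(f"preprocessing: {preprocess_ms:.2f} ms/frame")
print(f"frame budget at {ASSUMED_STREAM_FPS:.0f} fps: {budget_ms:.2f} ms")
print(f"share of budget: {budget_share:.1%}")
```

Under these assumptions the preprocessing step takes roughly 1.4 ms per frame, consistent with the "below 2 ms/frame" figure, and would use less than a tenth of the frame interval of a 60 fps stream.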