Many MPEG-2 encoding applications are real-time; this implies that the video signal must be encoded with no significant lookahead. However, there exist non-real-time applications that do enable us to first analyze a video sequence entirely and, using the analysis results, to optimize a second encoding pass of the same data. One example of such an application is the digital video disk (DVD), which is designed to facilitate a variable-bit-rate (VBR) output stream. In that case, it is possible to let the MPEG-2 encoder produce a video sequence with a constant visual quality over time. This is in contrast to constant-bit-rate (CBR) systems, where the rate is constant but the visual quality varies with the coding difficulty. This paper describes a two-pass encoding system whose objective is to produce an optimized VBR data stream in a second pass. In a first pass, the video sequence is encoded with CBR, while statistics concerning coding complexity are gathered. Next, the first-pass data is processed to prepare the control parameters for the second pass, which performs the actual VBR compression. In this off-line processing stage, we determine the target number of bits for each picture in the sequence, such that we realize the VBR objective. This means that the available bits are distributed over the different video segments such that constant visual quality is obtained. To be able to quantify the constant visual quality, perceptual experiments are described and a practical model is fitted to them. Exceptional cases such as scene changes and fades are detected and dealt with appropriately. We also ensure that the second-pass compression process does not violate the decoder buffer boundaries. Finally, the encoding is performed again, but now under control of the processed first-pass data. During this second pass, a run-time bit-production control mechanism monitors the accuracy and validity of the first-pass data, correcting errors in prediction and observing the buffer boundaries. Results are compared to CBR operation.
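The off-line stage between the two passes can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the per-picture complexity measure, the power-law weighting with a hypothetical `exponent` standing in for the fitted perceptual model, and the simplified decoder-buffer model in `first_underflow`.

```python
def allocate_vbr_targets(complexities, total_bits, exponent=0.5):
    """Split a total bit budget over pictures according to first-pass complexity.

    `exponent` is a hypothetical stand-in for a perceptual model:
    1.0 gives bits proportional to complexity, 0.0 gives equal bits per picture.
    """
    weights = [c ** exponent for c in complexities]
    total_weight = sum(weights)
    return [total_bits * w / total_weight for w in weights]


def first_underflow(targets, buffer_size, refill_per_picture):
    """Return the index of the first picture that underflows a simplified
    decoder buffer, or None if the targets are safe.

    Assumed model: the buffer starts full, each picture removes its bits
    at once, and the channel then refills a fixed amount per picture period,
    never exceeding the buffer size.
    """
    fullness = buffer_size
    for index, bits in enumerate(targets):
        fullness -= bits
        if fullness < 0:
            return index
        fullness = min(buffer_size, fullness + refill_per_picture)
    return None


if __name__ == "__main__":
    targets = allocate_vbr_targets([100, 400, 100], total_bits=6000)
    print(targets)
    print(first_underflow(targets, buffer_size=4000, refill_per_picture=2000))
```

In a fuller system, a picture flagged by the buffer check would have its target reduced and the freed bits redistributed; that step is omitted here.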
Most real-time MPEG-2 encoders are designed to perform in a constant-bit-rate (CBR) mode, in which buffer constraints are imposed to circumvent large deviations from a desired rate at any instant in time. Although such streams are generally good-quality sequences, certain types of operations or environments call for a more efficient real-time CBR encoder. The first part of the paper describes how a better-quality CBR video stream can be produced by estimating the relative complexity of a picture in comparison with the average complexity of the partially encoded stream and using it to adjust the compression parameters in a single-pass mode of operation. Our CBR encoder is particularly attractive for digital broadcast and editing environments, in which representations of higher-fidelity video objects in both display and freeze modes are constantly pursued. The second part of the paper describes the real-time generation of video streams with a variable-bit-rate (VBR) encoder. This mode of operation is highly desirable for home entertainment and recreational events. We propose a robust single-pass VBR video encoder algorithm which is capable of learning and adapting itself to the complexity of image segments and thereafter creating streams which have constant visual picture quality. The new VBR scheme displays a better performance than the CBR encoder, particularly when special effects such as scene transitions, fades, or luminance changes are to be compressed. Both CBR and VBR encoders are fully compliant with the MPEG-2 standard and are easily implementable with the IBM encoder architecture. Compression results for the new single-pass encoding algorithms and comparisons with previous CBR schemes are provided. The results suggest the suitability of our VBR approach for record/playback in storage media such as digital video disc (DVD) players, disk-based camcorders, and digital videocassette recorders (DVCRs). They further reflect the importance of our single-pass CBR scheme for providers of broadcast services, for which it allows more video programs to be allocated to a selected communication link, and for in-studio applications, for which it greatly facilitates visual analysis of captured streams.
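The idea of comparing a picture's complexity with the running average of the partially encoded stream can be sketched as follows. This is only an illustrative assumption of how such a ratio might scale a per-picture bit target; the class `RunningComplexity`, the function `picture_target`, and the clamp limits `lo` and `hi` are hypothetical and do not describe the authors' encoder.

```python
class RunningComplexity:
    """Tracks the average complexity of the pictures encoded so far."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, complexity):
        self.total += complexity
        self.count += 1

    def average(self):
        # None signals that no picture has been encoded yet.
        return self.total / self.count if self.count else None


def picture_target(base_bits, complexity, tracker, lo=0.5, hi=2.0):
    """Scale the nominal per-picture budget by relative complexity.

    The ratio is clamped to [lo, hi] (hypothetical limits) so that a single
    outlier cannot drain or starve the rate budget in one step.
    """
    average = tracker.average()
    if average is None or average <= 0:
        ratio = 1.0  # no usable history yet: fall back to the nominal budget
    else:
        ratio = complexity / average
    ratio = max(lo, min(hi, ratio))
    tracker.update(complexity)  # single pass: history grows as we encode
    return base_bits * ratio


if __name__ == "__main__":
    tracker = RunningComplexity()
    for c in (50, 100, 10):
        print(picture_target(1000, c, tracker))
```

A real CBR controller would additionally enforce the buffer constraints mentioned above; this sketch covers only the relative-complexity adjustment.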
Temporal relationships (motion fields) have been widely exploited by researchers for video processing. Their primary use has been to group pixels in spatiotemporal neighborhoods. Examples include coding and noise reduction. Typically, video processing is achieved by filtering, modeling, or analyzing pixels in these neighborhoods. In spite of the widespread use of motion information to process video, the fields are rarely treated as signals, i.e., the temporal relationships are seldom considered as a distinct time series. A notable exception is the generalized autoregressive modeling of these relationships in Rajagopalan et al. (1997). In this work, we present a generalization of finite impulse response filtering applicable to temporal relationships and continue the spirit of the work of treating motion fields as a distinct signal (albeit one that is closely tied to the pixel intensities). Applications presented are preprocessing of video for coding and for noise reduction. Instead of filtering pixels in spatiotemporal neighborhoods directly, we argue that it may be more beneficial to filter the temporal relationships first and then synthesize processed video. Simulations show MPEG-1 rate gains of up to 20% for coding processed video compared to unprocessed video, where processing leaves the original perceptually unchanged. Noise reduction experiments demonstrate a gain of 0.5 dB at high signal-to-noise ratios (SNRs) over the best results in the published literature, while at low to moderate SNRs, improvements are 0.3 dB lower.
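To make concrete what treating a motion field as a time series means, the sketch below applies an ordinary FIR filter to the trajectory of one motion vector over successive frames. It shows plain FIR filtering only, not the generalization proposed in the work; the function name `fir_filter_motion`, the edge-replication boundary handling, and the example taps are assumptions for illustration.

```python
def fir_filter_motion(vectors, taps):
    """Filter a sequence of (dx, dy) motion vectors along time.

    `vectors` holds one motion vector per frame for a single block or pixel.
    `taps` is an odd-length list of FIR coefficients applied over a centered
    window; symmetric taps are assumed, so window orientation does not matter.
    Frames beyond either end of the sequence are replaced by the nearest frame.
    """
    half = len(taps) // 2
    last = len(vectors) - 1
    filtered = []
    for t in range(len(vectors)):
        sum_x = 0.0
        sum_y = 0.0
        for k, coefficient in enumerate(taps):
            index = min(max(t + k - half, 0), last)  # replicate edges
            dx, dy = vectors[index]
            sum_x += coefficient * dx
            sum_y += coefficient * dy
        filtered.append((sum_x, sum_y))
    return filtered


if __name__ == "__main__":
    # A single-frame motion spike is spread over its neighbours.
    trajectory = [(0, 0), (0, 0), (4, 0), (0, 0), (0, 0)]
    print(fir_filter_motion(trajectory, [0.25, 0.5, 0.25]))
```

The filtered vectors would then drive the synthesis of the processed video, a step not shown here.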