High Efficiency Video Coding (HEVC) is the latest video coding standard developed by JCT-VC. It employs many efficient coding tools (e.g., highly flexible quad-tree coding block partitioning) and achieves a 35-43% bitrate reduction over H.264/AVC. However, it imposes enormous computational complexity on the encoder because of the optimization performed in these coding tools, especially the rate-distortion (RD) optimization over the coding unit (CU), prediction unit, and transform unit. In this article, we propose a CU splitting early termination algorithm to reduce the heavy computational burden on the encoder. CU splitting is modeled as a binary classification problem, to which a support vector machine (SVM) is applied. To reduce the impact of outliers and to maintain RD performance when a misclassification occurs, the RD loss due to misclassification is introduced as a weight in SVM training. Efficient and representative features are extracted and optimized by a wrapper approach to eliminate dependency on video content as well as on encoding configurations. Experimental results show that, compared with the HEVC reference software, the proposed algorithm achieves about 44.7% complexity reduction on average with only a 1.35% BD-rate increase under the "random access" configuration, and 41.9% time saving with a 1.66% BD-rate increase under the "low delay" setting.
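The idea of weighting SVM training by misclassification cost can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear SVM trained by full-batch subgradient descent on a cost-weighted hinge loss, and the two standardized features, the per-sample costs, and all function names are hypothetical stand-ins for the paper's features and RD-loss weights.

```python
def decision_value(w, b, x):
    """Signed score of the linear classifier: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_weighted_linear_svm(samples, labels, costs, lam=0.01, lr=0.1, epochs=300):
    """Minimize lam/2*||w||^2 + (1/n) * sum_i c_i * max(0, 1 - y_i*(w.x_i + b)).

    costs[i] plays the role of the RD loss incurred if sample i is
    misclassified (an assumption made for this sketch).
    """
    n = len(samples)
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y, c in zip(samples, labels, costs):
            # Only samples inside the margin contribute to the hinge subgradient,
            # and each contribution is scaled by its misclassification cost.
            if y * decision_value(w, b, x) < 1.0:
                for j in range(dim):
                    grad_w[j] -= c * y * x[j] / n
                grad_b -= c * y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def should_split(w, b, x):
    """Predict +1 (split the CU) when the score is positive."""
    return decision_value(w, b, x) > 0.0


# Hypothetical, already standardized features for six CUs.
FEATURES = [(-1.0, -0.8), (-0.9, -1.0), (-0.8, -0.9),
            (0.8, 1.0), (1.0, 0.9), (0.9, 0.8)]
LABELS = [-1, -1, -1, 1, 1, 1]             # -1: do not split, +1: split
COSTS = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]     # assumed RD loss per sample

W, B = train_weighted_linear_svm(FEATURES, LABELS, COSTS)
```

Because each hinge term is scaled by its cost, samples whose misclassification would hurt RD performance more pull the decision boundary harder, while low-cost samples (including outliers with little RD impact) have less influence.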
Conventional 2-D Just-Noticeable-Difference (JND) models measure the perceptible distortion of a visual signal based on monocular vision properties by presenting a single image to both eyes. However, they are not applicable to stereoscopic displays, in which a pair of stereoscopic images is presented to a viewer's left and right eyes, respectively. Some unique binocular vision properties, e.g., binocular combination and rivalry, need to be considered in the development of a JND model for stereoscopic images. In this letter, we propose a binocular JND (BJND) model based on psychophysical experiments conducted to model the basic binocular vision properties in response to asymmetric noise in a pair of stereoscopic images. The first experiment measures the joint visibility thresholds arising from the luminance masking effect and the binocular combination of noise. The second experiment examines the reduction of visual sensitivity in binocular vision due to the contrast masking effect. Based on these experiments, the developed BJND model measures the perceptible distortion of binocular vision for stereoscopic images. Subjective evaluations on stereoscopic images validate the proposed BJND model.
Currently, 3-D Video targets the application of disparity-adjustable stereoscopic video, where view synthesis based on depth-image-based rendering (DIBR) is employed to generate virtual views. Distortions in depth information may introduce geometry changes or occlusion variations in the synthesized views. In practice, depth information is stored in 8-bit grayscale format, whereas the disparity range for a visually comfortable stereo pair is usually much less than 256 levels. Thus, several depth levels may correspond to the same integer (or sub-pixel) disparity value in DIBR-based view synthesis, so some depth distortions do not result in geometry changes in the synthesized view. From this observation, we develop a depth no-synthesis-error (D-NOSE) model to examine the allowable depth distortions in rendering a virtual view without introducing any geometry changes. We further show that the depth distortions prescribed by the proposed D-NOSE profile also do not compromise the occlusion order in view synthesis. Therefore, a virtual view can be synthesized losslessly if depth distortions stay within the thresholds specified by D-NOSE. Our simulations validate the proposed D-NOSE model in lossless view synthesis and demonstrate the gain obtained with the model in depth coding.
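The observation that several 8-bit depth levels map to the same quantized disparity can be sketched as follows. This is an illustration under stated assumptions, not the paper's derivation: it uses the depth-to-disparity mapping commonly used in DIBR for parallel cameras, rounds to the nearest disparity step, and the camera parameters (`FOCAL_PX`, `BASELINE`, `Z_NEAR`, `Z_FAR`) and function names are hypothetical.

```python
import math


def depth_to_disparity(v, focal_px, baseline, z_near, z_far):
    """Map an 8-bit depth level v (0..255) to a disparity in pixels."""
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return focal_px * baseline * inv_z


def quantized_disparity(v, focal_px, baseline, z_near, z_far, steps_per_pixel=1):
    """Round the disparity to the nearest 1/steps_per_pixel of a pixel.

    steps_per_pixel=1 gives integer precision; 4 gives quarter-pixel precision.
    The result is expressed in units of one disparity step.
    """
    d = depth_to_disparity(v, focal_px, baseline, z_near, z_far)
    return math.floor(d * steps_per_pixel + 0.5)


def dnose_profile(focal_px, baseline, z_near, z_far, steps_per_pixel=1):
    """For each depth level v, return the interval (lo, hi) of depth levels
    that produce the same quantized disparity as v.

    Assumes z_near < z_far, so disparity is non-decreasing in v and each
    interval is contiguous.
    """
    q = [quantized_disparity(v, focal_px, baseline, z_near, z_far, steps_per_pixel)
         for v in range(256)]
    profile = [None] * 256
    start = 0
    for v in range(1, 257):
        # Close the current run when the quantized disparity changes
        # or when the last depth level has been passed.
        if v == 256 or q[v] != q[start]:
            for u in range(start, v):
                profile[u] = (start, v - 1)
            start = v
    return profile


# Hypothetical camera setup, chosen only for illustration.
FOCAL_PX = 1000.0   # focal length in pixels
BASELINE = 0.05     # camera baseline (same length unit as z_near / z_far)
Z_NEAR = 1.0
Z_FAR = 10.0

PROFILE = dnose_profile(FOCAL_PX, BASELINE, Z_NEAR, Z_FAR)
```

For a depth level `v` with `PROFILE[v] == (lo, hi)`, any distorted value in `[lo, hi]` yields the same quantized disparity, so the allowable distortion is between `lo - v` and `hi - v`. With these assumed parameters the disparity spans roughly 5 to 50 pixels, far fewer than 256 values, so many neighbouring depth levels share one integer disparity.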
The blind quality evaluation of screen content images (SCIs) and natural scene images (NSIs) has become an important, yet very challenging issue. In this paper, we present an effective blind quality evaluation technique for SCIs and NSIs based on a dictionary of learned local and global quality features. First, a local dictionary is constructed using local normalized image patches and conventional K-means clustering. With this local dictionary, the learned local quality features are obtained using locality-constrained linear coding with max pooling. To extract the learned global quality features, the histogram representations of binary patterns are concatenated to form a global dictionary. The collaborative representation algorithm is used to efficiently code the learned global quality features of the distorted images using this dictionary. Finally, kernel-based support vector regression is used to integrate these features into an overall quality score. Extensive experiments demonstrate that, in comparison with most relevant metrics, the proposed blind metric is significantly more consistent with subjective fidelity ratings.