The blind quality evaluation of screen content images (SCIs) and natural scene images (NSIs) has become an important, yet very challenging issue. In this paper, we present an effective blind quality evaluation technique for SCIs and NSIs based on a dictionary of learned local and global quality features. First, a local dictionary is constructed using local normalized image patches and conventional K-means clustering. With this local dictionary, the learned local quality features can be obtained using locality-constrained linear coding with max pooling. To extract the learned global quality features, the histogram representations of binary patterns are concatenated to form a global dictionary. The collaborative representation algorithm is used to efficiently code the learned global quality features of the distorted images using this dictionary. Finally, kernel-based support vector regression is used to integrate these features into an overall quality score. Extensive experiments involving the proposed evaluation technique demonstrate that in comparison with most relevant metrics, the proposed blind metric yields significantly higher consistency with subjective fidelity ratings.
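The coding steps named in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, patch size, dictionary size, neighbour count `k`, and regularization values are all assumptions chosen for illustration, the locality-constrained linear coding uses the common k-nearest-atom approximation, and collaborative representation is written as ridge-regularized least squares. The final support vector regression stage is omitted.

```python
import numpy as np

def normalize_patches(patches, eps=1e-8):
    """Zero-mean, unit-variance normalization of each flattened patch (one per row)."""
    mean = patches.mean(axis=1, keepdims=True)
    std = patches.std(axis=1, keepdims=True)
    return (patches - mean) / (std + eps)

def kmeans_dictionary(patches, n_atoms, n_iter=20, seed=0):
    """Plain Lloyd's K-means; the cluster centres serve as dictionary atoms."""
    rng = np.random.default_rng(seed)
    centers = patches[rng.choice(len(patches), n_atoms, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every patch to every centre.
        d2 = ((patches[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(n_atoms):
            members = patches[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def llc_code(x, dictionary, k=5, reg=1e-4):
    """Approximate LLC: reconstruct x from its k nearest atoms, weights summing to 1."""
    d2 = ((dictionary - x) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]
    z = dictionary[idx] - x                      # atoms shifted so x is the origin
    cov = z @ z.T
    cov += reg * max(np.trace(cov), 1e-12) * np.eye(k)  # keep the system well posed
    w = np.linalg.solve(cov, np.ones(k))
    w /= w.sum()                                 # enforce the sum-to-one constraint
    code = np.zeros(len(dictionary))
    code[idx] = w
    return code

def local_feature(patches, dictionary, k=5):
    """Max-pool the LLC codes of all patches into one image-level feature vector."""
    codes = np.stack([llc_code(p, dictionary, k) for p in patches])
    return codes.max(axis=0)

def collaborative_code(y, global_dict, lam=1e-2):
    """Collaborative representation: ridge-regularized coding of y over all atoms.

    `global_dict` holds one atom (e.g. a concatenated histogram) per column.
    """
    gram = global_dict.T @ global_dict + lam * np.eye(global_dict.shape[1])
    return np.linalg.solve(gram, global_dict.T @ y)
```

In a full pipeline under these assumptions, the vector from `local_feature` and the coefficients from `collaborative_code` would be concatenated and passed to a kernel regressor trained on subjective scores.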
RGB–thermal scene parsing has recently attracted increasing research interest in the field of computer vision. However, most existing methods fail to perform good boundary extraction for prediction maps and cannot fully use high-level features. In addition, these methods simply fuse the features from RGB and thermal modalities but are unable to obtain comprehensive fused features. To address these problems, we propose an edge-aware guidance fusion network (EGFNet) for RGB–thermal scene parsing. First, we introduce a prior edge map generated using the RGB and thermal images to capture detailed information in the prediction map and then embed the prior edge information in the feature maps. To effectively fuse the RGB and thermal information, we propose a multimodal fusion module that guarantees adequate cross-modal fusion. Considering the importance of high-level semantic information, we propose a global information module and a semantic information module to extract rich semantic information from the high-level features. For decoding, we use simple elementwise addition for cascaded feature fusion. Finally, to improve the parsing accuracy, we apply multitask deep supervision to the semantic and boundary maps. Extensive experiments were performed on benchmark datasets to demonstrate the effectiveness of the proposed EGFNet and its superior performance compared with state-of-the-art methods. The code and results can be found at https://github.com/ShaohuaDong2021/EGFNet.
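The decoding step described above, cascaded feature fusion by simple elementwise addition, can be sketched as follows. This is an illustration under stated assumptions rather than the EGFNet decoder itself: the helper names are hypothetical, nearest-neighbour upsampling is assumed, and every level is assumed to share the same channel count with each deeper level at half the spatial resolution.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def cascaded_add_decode(features):
    """Fuse features from deepest to shallowest by upsampling and adding.

    `features` is ordered shallow -> deep. Each level is assumed to have the
    same channel count and half the spatial size of the previous level.
    """
    fused = features[-1]
    for skip in reversed(features[:-1]):
        fused = upsample2x(fused) + skip  # elementwise addition, no learned weights
    return fused
```

Addition keeps the decoder free of extra parameters, in contrast to concatenation followed by a convolution, at the cost of requiring matching channel counts across levels.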