The capsule network is a novel architecture that encodes the properties and spatial relationships of features in an image, and it has shown encouraging results on image classification. However, the original capsule network is not well suited to classification tasks in which the detected objects have complex internal representations. We therefore propose the Multi-Scale Capsule Network, a variation of the capsule network that improves its computational efficiency and representation capacity. The proposed Multi-Scale Capsule Network consists of two stages. In the first stage, structural and semantic information is obtained by multi-scale feature extraction. In the second stage, the feature hierarchy is encoded into multi-dimensional primary capsules. Moreover, we propose an improved dropout to enhance the robustness of the capsule network. Experimental results show that our method achieves competitive performance on the FashionMNIST and CIFAR10 datasets.
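As background for the capsule encoding described above, the sketch below shows the standard "squash" non-linearity from the original capsule network formulation, which scales each capsule vector so that its length lies in [0, 1) while preserving its orientation; this is an illustration of capsule basics, not of the paper's multi-scale architecture itself.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Squash non-linearity used by capsule networks: shrink the
    capsule vector s to length ||s||^2 / (1 + ||s||^2), keeping
    its direction unchanged."""
    norm_sq = np.sum(s ** 2)
    norm = np.sqrt(norm_sq + eps)  # eps guards against division by zero
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)

v = squash(np.array([3.0, 4.0]))  # input has length 5
```

The squashed vector keeps the direction (0.6, 0.8) of the input, and its length becomes 25/26 ≈ 0.96, so capsule lengths can be read as probabilities.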
Benefiting from multi-view video plus depth and depth-image-based rendering technologies, only a limited number of views of a real 3-D scene need to be captured, compressed, and transmitted. However, quality assessment of synthesized views is very challenging: view synthesis and depth-map compression inevitably produce new types of distortions that are inherently different from texture coding errors, and the corresponding original views (reference views) are usually unavailable, so full-reference quality metrics cannot be applied to synthesized views. In this paper, we propose a novel no-reference image quality assessment method for 3-D synthesized views, called NIQSV+. This blind metric evaluates the quality of a synthesized view by measuring the typical synthesis distortions (blurry regions, black holes, and stretching) with access to neither the reference image nor the depth map. To evaluate the proposed method, we compare it with four full-reference 3-D (synthesized-view-dedicated) metrics, five full-reference 2-D metrics, and three no-reference 2-D metrics. In terms of correlation with subjective scores, our experimental results show that the proposed no-reference metric approaches the best of the state-of-the-art full-reference and no-reference 3-D metrics, and significantly outperforms the widely used no-reference and full-reference 2-D metrics. In terms of approximating human ranking, the proposed metric achieves the best performance in the experimental test.
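To make the "black hole" distortion concrete: holes are dis-occluded regions that the rendering step leaves unfilled, which appear as near-black pixels in the synthesized view. The sketch below is a hypothetical illustration of detecting such pixels by simple thresholding; it is not the actual NIQSV+ formulation, and the threshold value is an assumption.

```python
import numpy as np

def black_hole_ratio(gray, threshold=5):
    """Hypothetical illustration (not the NIQSV+ algorithm): estimate
    the fraction of 'black hole' pixels, i.e. dis-occluded regions
    left unfilled by view synthesis, as near-zero intensities in an
    8-bit grayscale image. The threshold of 5 is an assumed value."""
    holes = gray <= threshold
    return holes.sum() / gray.size

img = np.full((4, 4), 128, dtype=np.uint8)
img[0, :2] = 0                # simulate a small dis-occlusion hole
print(black_hole_ratio(img))  # 2 hole pixels out of 16 = 0.125
```

A real metric would go further, e.g. weighting holes by size and location, but the ratio above captures the basic idea of measuring synthesis-specific artifacts without any reference image.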
One application of Association Rule Mining (ARM) is to identify Classification Association Rules (CARs) that can be used to classify future instances drawn from the same population as the data being mined. Most Classification Association Rule Mining (CARM) methods first mine the data for candidate rules and then prune these using a coverage analysis of the training data. In this paper, we describe a CARM algorithm that avoids the need for coverage analysis, together with a technique for tuning its threshold parameters to obtain more accurate classification. We present results showing that this approach can achieve better accuracy than comparable alternatives at lower cost.
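For readers unfamiliar with CARs, the sketch below shows one common way a mined ruleset is applied at classification time: pick the highest-confidence rule whose antecedent is contained in the instance. This is a generic illustration of rule-based classification, not the specific algorithm proposed in the paper; the rules and confidences are made up.

```python
def classify(instance, rules, default):
    """Classify an instance (a set of items) with a list of CARs,
    each rule = (antecedent item set, class label, confidence).
    Returns the label of the highest-confidence matching rule,
    or a default class if no rule fires."""
    for antecedent, label, _conf in sorted(rules, key=lambda r: -r[2]):
        if antecedent <= instance:  # antecedent is a subset of the instance
            return label
    return default

# Made-up example ruleset for illustration only.
rules = [({"a", "b"}, "yes", 0.9), ({"c"}, "no", 0.7)]
print(classify({"a", "b", "d"}, rules, "no"))  # → yes
```

Coverage analysis, which the paper's algorithm avoids, would normally be used to prune rules from such a list based on how many training instances each rule correctly covers.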
Perceptual image quality assessment (IQA) uses a computational model to assess image quality in a fashion consistent with human opinions. A good IQA model should be both effective and efficient. To meet this need, a new model called multiscale contrast similarity deviation (MCSD) is developed in this paper. Contrast is a distinctive visual attribute closely related to image quality. To further explore contrast features, we resort to a multiscale representation. Although contrast and multiscale representations have already been used by other IQA indices, few have achieved effectiveness and efficiency simultaneously. We compared our method with other state-of-the-art methods on six well-known databases. The experimental results showed that the proposed method yielded the best performance in terms of correlation with human judgments, while also remaining efficient compared with other competing IQA models.
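The sketch below is a simplified, single-scale reading of the idea behind MCSD, under stated assumptions: compute a local contrast map for each image, form an SSIM-style similarity map between the two contrast maps, and pool it by its standard deviation (lower deviation means the distorted image preserves contrast more uniformly). The block size, stability constant, and block-wise RMS contrast are assumptions for illustration, not the paper's exact formulation, and the multiscale part (repeating this over downsampled versions) is omitted.

```python
import numpy as np

def local_contrast(img, block=4):
    """Block-wise RMS contrast (simplified stand-in for a contrast map):
    the standard deviation of each non-overlapping block x block patch."""
    h, w = img.shape
    h, w = h - h % block, w - w % block  # crop to a multiple of block
    patches = img[:h, :w].reshape(h // block, block, w // block, block)
    return patches.std(axis=(1, 3))

def contrast_similarity_deviation(ref, dist, c=1e-3):
    """Single-scale sketch: SSIM-style similarity between the two
    contrast maps, pooled by standard deviation. 0 means identical
    contrast structure; larger values mean less uniform degradation."""
    c1, c2 = local_contrast(ref), local_contrast(dist)
    sim = (2 * c1 * c2 + c) / (c1 ** 2 + c2 ** 2 + c)
    return sim.std()

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
print(contrast_similarity_deviation(ref, ref))  # identical images → 0.0
```

Deviation pooling (taking the standard deviation of the similarity map rather than its mean) is what distinguishes this family of metrics from mean-pooled indices such as SSIM.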
Depth-image-based rendering (DIBR) is a fundamental technology in several 3-D applications, such as free-viewpoint video (FVV), virtual reality (VR), and augmented reality (AR). However, it also raises new challenges for quality assessment, since the synthesis process induces new types of distortions that are inherently different from those caused by video coding. In this paper, we present a new database of DIBR-synthesized images with associated subjective scores, and we test the performance of state-of-the-art objective quality metrics on it. This work focuses on the distortions induced solely by different DIBR synthesis methods. Seven state-of-the-art DIBR algorithms, including inter-view synthesis and single-view-based synthesis methods, are covered by the database. The quality of the synthesized views was assessed subjectively by 41 observers and objectively using 14 state-of-the-art objective metrics. The subjective results show that the inter-view synthesis methods, which have more input information, significantly outperform the single-view-based ones. The correlations between the tested objective metrics and the subjective scores on this database reveal that further study is still needed to obtain a better objective quality metric dedicated to DIBR-synthesized views.
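The "correlation between objective metrics and subjective scores" mentioned above is conventionally measured with the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SROCC). The sketch below computes both with NumPy on made-up numbers; the data are illustrative only, not values from the database.

```python
import numpy as np

# Made-up example data: five images with subjective mean opinion
# scores and the outputs of some hypothetical objective metric.
subjective = np.array([1.2, 2.5, 3.1, 4.0, 4.8])
objective = np.array([0.30, 0.45, 0.55, 0.72, 0.90])

# PLCC: linear agreement between metric outputs and opinion scores.
plcc = np.corrcoef(subjective, objective)[0, 1]

# SROCC: Pearson correlation of the ranks (assumes no ties here),
# which measures monotonic (ordering) agreement.
ranks = lambda x: np.argsort(np.argsort(x))
srocc = np.corrcoef(ranks(subjective), ranks(objective))[0, 1]

print(round(plcc, 3), round(srocc, 3))
```

Because the made-up metric outputs rank the images in exactly the same order as the subjective scores, the SROCC here is 1.0 even though the relationship is not perfectly linear.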