In this paper we propose an approach to jointly estimate the layout of rooms as well as the clutter present in the scene using RGB-D data. Towards this goal, we propose an effective model that is able to exploit both depth and appearance features, which are complementary. Furthermore, our approach is efficient as we exploit the inherent decomposition of additive potentials. We demonstrate the effectiveness of our approach on the challenging NYU v2 dataset and show that employing depth reduces the layout error by 6% and the clutter estimation error by 13%.
Deep neural networks have demonstrated advanced abilities on various visual classification tasks, but they rely heavily on large-scale training samples with annotated ground truth. However, it is unrealistic to always require such annotation in real-world applications. Recently, Few-Shot learning (FS), as an attempt to address the shortage of training samples, has made significant progress in generic classification tasks. Nonetheless, it is still challenging for current FS models to distinguish the subtle differences between fine-grained categories given limited training data. To fill this classification gap, in this paper we address the Few-Shot Fine-Grained (FSFG) classification problem, which focuses on tackling fine-grained classification under the challenging few-shot learning setting. A novel low-rank pairwise bilinear pooling operation is proposed to capture the nuanced differences between the support and query images for learning an effective distance metric. Moreover, a feature alignment layer is designed to match the support image features with the query features before the comparison. We name the proposed model Low-Rank Pairwise Alignment Bilinear Network (LRPABN), which is trained in an end-to-end fashion. Comprehensive experimental results on four widely used fine-grained classification datasets demonstrate that our LRPABN model achieves superior performance compared to state-of-the-art methods.
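The abstract above does not spell out the pooling operation, so the following is only a minimal sketch of one common way to make a pairwise bilinear form low-rank: project the support feature x and the query feature y to r dimensions, multiply elementwise, and project again. The function names and the matrices U, V and P are hypothetical; in a trained model they would be learned, and the authors' actual formulation may differ.

```python
def matvec_t(mat, vec):
    """Return mat^T @ vec, where mat is a list of rows (len(vec) x cols)."""
    cols = len(mat[0])
    return [sum(mat[i][j] * vec[i] for i in range(len(vec))) for j in range(cols)]


def low_rank_pairwise_bilinear(x, y, U, V, P):
    """Assumed factorization: z = P^T ((U^T x) * (V^T y)), * being elementwise.

    x, y: support and query feature vectors of length d
    U, V: d x r projection matrices (hypothetical, normally learned)
    P:    r x o output projection (hypothetical, normally learned)
    """
    projected_x = matvec_t(U, x)
    projected_y = matvec_t(V, y)
    joint = [a * b for a, b in zip(projected_x, projected_y)]
    return matvec_t(P, joint)


def full_bilinear(x, y, U, V, P):
    """Reference: z_o = x^T W_o y with W_o = sum_r P[r][o] * U[:, r] V[:, r]^T."""
    d, r, o = len(x), len(P), len(P[0])
    out = []
    for k in range(o):
        total = 0
        for a in range(d):
            for b in range(d):
                w_ab = sum(P[q][k] * U[a][q] * V[b][q] for q in range(r))
                total += x[a] * w_ab * y[b]
        out.append(total)
    return out


# Toy values chosen by hand for illustration only.
x, y = [1, 2], [3, 4]
U = [[1, 0], [0, 1]]
V = [[1, 1], [0, 1]]
P = [[1], [2]]
z = low_rank_pairwise_bilinear(x, y, U, V, P)
```

The factored form never materializes the d x d weight matrix for each output, which is the usual motivation for a low-rank variant; `full_bilinear` is included only to check that both forms agree on small inputs.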
To organize a wide variety of data sets automatically and obtain accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM uses the global search property of PSO to address the premature convergence of conventional fuzzy clustering, and uses the vagueness balance property of shadowed sets to handle overlap among clusters and to model uncertainty in class boundaries. The method uses the Xie-Beni index as its cluster validity measure and automatically finds the optimal number of clusters within a specified range, yielding partitions with compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect.
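As an illustration of the validity measure named above, here is a minimal sketch of the Xie-Beni index in its commonly used generalized form: fuzzy within-cluster compactness divided by n times the smallest squared distance between two centers. Lower values indicate more compact, better separated partitions. The function names, the fuzzifier default m = 2 and the toy data are assumptions for illustration, not details taken from the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def xie_beni(data, centers, memberships, m=2.0):
    """Xie-Beni index for a fuzzy partition with at least two clusters.

    data:        list of n points
    centers:     list of c cluster centers
    memberships: c x n nested list; memberships[i][k] is the degree to
                 which point k belongs to cluster i
    m:           fuzzifier (assumed default of 2.0)
    """
    n = len(data)
    c = len(centers)
    compactness = sum(
        (memberships[i][k] ** m) * sq_dist(data[k], centers[i])
        for i in range(c)
        for k in range(n)
    )
    separation = min(
        sq_dist(centers[i], centers[j])
        for i in range(c)
        for j in range(i + 1, c)
    )
    return compactness / (n * separation)


# Toy one-dimensional data with two obvious groups.
data = [[0.0], [1.0], [10.0], [11.0]]
good = xie_beni(data, [[0.5], [10.5]], [[1, 1, 0, 0], [0, 0, 1, 1]])
bad = xie_beni(data, [[0.0], [1.0]], [[1, 0, 0, 0], [0, 1, 1, 1]])
```

To choose the number of clusters, one would compute this value for each candidate count in the allowed range and keep the partition with the smallest index.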
Fuzzy c-means (FCM) is one of the best-known clustering methods for organizing a wide variety of datasets automatically and obtaining accurate classification, but it has a tendency to fall into local minima. To overcome this weakness, several methods that hybridize PSO and FCM for clustering have been proposed in the literature, and these hybrid methods have been shown to improve accuracy over traditional partitional clustering approaches. However, PSO-based clustering methods have poor execution time in comparison to partitional clustering techniques, and current PSO algorithms require tuning a range of parameters before they are able to find good solutions. Therefore, this paper introduces a hybrid method for fuzzy clustering, named FCM-ELPSO, which aims to deal with these shortcomings. It combines FCM with an improved version of PSO, called ELPSO, which adopts a new enhanced logarithmic inertia weight strategy to provide a better balance between exploration and exploitation. This new hybrid method uses the PBM(F) index and the objective function value as cluster validity indexes to evaluate the clustering effect. To verify the effectiveness of the algorithm, two types of experiments are performed: PSO clustering and hybrid clustering. Experiments show that the proposed approach significantly improves convergence speed and the clustering effect.
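For context, the baseline FCM that such hybrids build on alternates between a membership update and a center update. The sketch below shows only that standard alternation; it does not implement ELPSO or its inertia weight strategy, which the abstract does not specify. The function names, the fuzzifier m = 2, the stopping tolerance and the seeded random initialization are assumptions for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def update_memberships(data, centers, m=2.0):
    """u[i][k] = 1 / sum_j (d_ik / d_jk)^(1/(m-1)), d being squared distances."""
    c, n = len(centers), len(data)
    u = [[0.0] * n for _ in range(c)]
    for k, x in enumerate(data):
        d = [sq_dist(x, v) for v in centers]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:
            # The point sits exactly on a center: split full membership there.
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        for i in range(c):
            u[i][k] = 1.0 / sum((d[i] / d[j]) ** (1.0 / (m - 1.0)) for j in range(c))
    return u


def update_centers(data, u, m=2.0):
    """Each center is the mean of the data weighted by u[i][k]^m."""
    dim = len(data[0])
    centers = []
    for row in u:
        w = [uik ** m for uik in row]
        total = sum(w)
        centers.append(
            [sum(wk * x[d] for wk, x in zip(w, data)) / total for d in range(dim)]
        )
    return centers


def fcm(data, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain FCM; the result depends on the random initial centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, c)]
    for _ in range(iters):
        u = update_memberships(data, centers, m)
        new_centers = update_centers(data, u, m)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, update_memberships(data, centers, m)


data = [[0.0], [1.0], [10.0], [11.0]]
centers, u = fcm(data, 2)
```

Because the alternation only descends from its starting point, a poor initialization can leave it in a local minimum, which is the weakness that a global search such as PSO is meant to address.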
Object instance segmentation is one of the most fundamental but challenging tasks in computer vision, and it requires pixel-level image understanding. Most existing approaches address this problem by adding a mask prediction branch to a two-stage object detector with a Region Proposal Network (RPN). Although these two-stage approaches produce good segmentation results, their efficiency is far from satisfactory, which restricts their applicability in practice. In this paper, we propose a one-stage framework, SPRNet, which performs efficient instance segmentation by introducing a single pixel reconstruction (SPR) branch to off-the-shelf one-stage detectors. The added SPR branch reconstructs the pixel-level mask directly from every single pixel in the convolutional feature map. Using the same ResNet-50 backbone, SPRNet achieves mask AP comparable to Mask R-CNN at a higher inference speed, and gains all-round improvements in box AP at every scale compared with RetinaNet.