Recent methods for boundary or edge detection built on deep convolutional neural networks (CNNs) typically suffer from thick predicted edges and require post-processing to obtain crisp boundaries. The highly imbalanced distribution of boundary versus background pixels in training data is one of the main causes of this problem. In this work, we aim to make CNNs produce sharp boundaries without post-processing. We introduce a novel loss for boundary detection that is very effective for classifying imbalanced data and allows CNNs to produce crisp boundaries. Moreover, we propose an end-to-end network that adopts a bottom-up/top-down architecture to tackle the task. The proposed network effectively leverages hierarchical features and produces a pixel-accurate boundary mask, which is critical for reconstructing the edge map. Our experiments show that directly making crisp predictions not only improves the visual quality of CNN outputs but also yields better results than the state of the art on the BSDS500 dataset (ODS F-score of .815) and the NYU Depth dataset (ODS F-score of .762).
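The abstract does not give the exact form of the proposed loss, but the imbalance problem it targets is easy to illustrate. Below is a minimal sketch of the classic class-balanced binary cross-entropy used by HED-style edge detectors, where each class is weighted by the frequency of the other so that the rare boundary pixels are not drowned out by background; the function name and weighting scheme are illustrative assumptions, not the paper's method.

```python
import numpy as np

def balanced_bce(pred, target, eps=1e-7):
    """Class-balanced binary cross-entropy for sparse boundary maps.

    Boundary pixels (target == 1) are weighted by the fraction of
    background pixels, and vice versa, so both classes contribute
    comparably to the loss despite the heavy imbalance.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    n_pos = target.sum()
    w_pos = (target.size - n_pos) / target.size  # weight for boundary pixels
    w_neg = n_pos / target.size                  # weight for background pixels
    loss = -(w_pos * target * np.log(pred)
             + w_neg * (1.0 - target) * np.log(1.0 - pred))
    return loss.mean()
```

With plain (unweighted) cross-entropy, a detector can score well by predicting "background" almost everywhere; the reweighting removes that shortcut, which is one reason weighted losses are standard in this line of work.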
Object contour detection is a fundamental preprocessing step for multimedia applications such as icon generation, object segmentation, and tracking. The quality of contour prediction matters greatly in these applications since it affects all subsequent processing. In this work, we aim to develop a high-performance contour detection system. We first propose a novel and very effective loss function for contour detection, which penalizes the contour-structure dissimilarity between each pair of prediction and ground truth. Moreover, to better distinguish object contours from background textures, we introduce a novel convolutional encoder-decoder network. Within the network, we present a hyper module that captures dense connections among high-level features and produces effective semantic information; this information is then progressively propagated and fused with low-level features. We conduct extensive experiments on the BSDS500 and Multi-Cue datasets, and the results show significant improvements over state-of-the-art competitors. We further demonstrate the benefit of our DSCD method for crowd counting.
The whale optimization algorithm (WOA) tends to fall into local optima and fails to converge quickly on complex problems. To address these shortcomings, an improved WOA (QGBWOA) is proposed in this work. First, quasi-opposition-based learning is introduced to enhance WOA's ability to search for optimal solutions. Second, a Gaussian barebone mechanism is embedded to promote diversity and expand the scope of the solution space. To verify the advantages of QGBWOA, comparison experiments against peer algorithms were carried out on the CEC 2014 benchmark with dimensions 10, 30, 50, and 100 and on the CEC 2020 benchmark with dimension 30. The performance results were analyzed statistically using the Wilcoxon signed-rank test, the Friedman test, and post hoc tests. Experimental results show that convergence accuracy and speed are remarkably improved. Finally, feature selection and multi-threshold image segmentation applications demonstrate QGBWOA's ability to solve complex real-world problems, where it proves superior to the compared algorithms on several evaluation metrics. Supplementary Information: the online version contains supplementary material available at 10.1007/s42235-022-00297-8.
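Quasi-opposition-based learning, the first ingredient named above, has a standard generic form: for a candidate solution x in a box [lb, ub], sample a quasi-opposite point uniformly between the domain centre and the opposite point lb + ub − x. The sketch below shows that generic mechanism only; how QGBWOA integrates it into the WOA update is not specified in the abstract, and the function name is an assumption.

```python
import numpy as np

def quasi_opposite(x, lb, ub, rng=None):
    """Sample a quasi-opposite candidate for each coordinate of x.

    The quasi-opposite point lies uniformly between the domain centre
    (lb + ub) / 2 and the opposite point lb + ub - x, which lets the
    search probe the "mirrored" half of the space without jumping all
    the way to the exact opposite.
    """
    rng = np.random.default_rng() if rng is None else rng
    opposite = lb + ub - x
    centre = (lb + ub) / 2.0
    lo = np.minimum(centre, opposite)
    hi = np.maximum(centre, opposite)
    return rng.uniform(lo, hi)
```

In opposition-based variants of metaheuristics, such a candidate is typically generated alongside the normal update and kept only if it has better fitness, which is how it helps escape local optima.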
We study the problem of estimating the relative depth order of point pairs in a monocular image. Recent advances [1], [2] mainly focus on using deep convolutional neural networks (DCNNs) to learn and infer the ordinal information from multiple contextual cues of the point pair, such as the global scene context, local contextual information, and the point locations. However, it remains unclear how much each context contributes to the task. To address this, we first examine the contribution of each context cue [1], [2] to depth order estimation performance. We find that the local context surrounding the point pair contributes the most, while the global scene context helps little. Based on these findings, we propose a simple method that uses a multi-scale densely connected network to tackle the task. Instead of learning the global structure, we explore the local structure by learning to regress from regions of multiple sizes around the point pair. Moreover, we use the recent densely connected network [3] to encourage substantial feature reuse and to deepen our network, boosting performance. Our experiments show that the results of our approach are on par with or better than the state-of-the-art methods while using only a small amount of training data.
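The "regions of multiple sizes around the point pair" idea can be sketched as a simple multi-scale cropping step: cut square windows of increasing size around the pair and resize them to a common resolution before feeding them to the network. The crop sizes, centring on the pair's midpoint, and nearest-neighbour resizing below are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def multi_scale_patches(img, p1, p2, scales=(16, 32, 64), out=16):
    """Crop square regions of several sizes centred on the midpoint of a
    point pair, then resize each crop to out x out (nearest neighbour)."""
    cy, cx = (p1[0] + p2[0]) // 2, (p1[1] + p2[1]) // 2
    h, w = img.shape[:2]
    patches = []
    for s in scales:
        half = s // 2
        y0, x0 = max(0, cy - half), max(0, cx - half)
        y1, x1 = min(h, cy + half), min(w, cx + half)
        crop = img[y0:y1, x0:x1]
        # nearest-neighbour resample to a common output size
        ys = np.arange(out) * crop.shape[0] // out
        xs = np.arange(out) * crop.shape[1] // out
        patches.append(crop[np.ix_(ys, xs)])
    return np.stack(patches)
```

The smallest window captures fine local texture at the pair while the larger ones add surrounding layout, which matches the paper's finding that local context carries most of the ordinal signal.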
Contour detection plays an important role in both academic research and real-world applications. As a basic building block of many applications, its accuracy and efficiency strongly influence subsequent stages. In this work, we propose a novel lightweight contour detection system that achieves state-of-the-art performance while keeping an ultra-slim model size. The proposed method is built on an efficient encoder in a bottom-up/top-down fashion. Specifically, we propose a novel decoder that compresses side features from the encoder and effectively decodes compact contextual information for highly accurate boundary localization. In addition, we propose a novel loss function that helps the model produce crisp object boundaries. We conduct extensive experiments on the widely adopted BSDS500 and Multi-Cue benchmarks. The results show that our system matches the best performance yet consumes only 3.3% of the computational cost (16.45 GFlops vs. 499.15 GFlops) and 2.35% of the model size (1.94M vs. 82.43M) of the SOTA detector RCF-ResNet101. Meanwhile, our method outperforms a large portion of recent top edge detectors by a clear margin.