2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00029

LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation

Cited by 54 publications (40 citation statements) · References 31 publications

“…With the convincing results on the Middlebury dataset, LGC+, like almost all learned confidence estimation procedures, proves to be relatively insensitive to differences between the training and test domains. Overall, these results underline the findings of Kim et al. (2019) and Kim et al. (2020) regarding the higher accuracy of multi-modal approaches. This is because complementary information supports the task of confidence estimation in a broader range of failure cases than a single- or bi-modal input can.…”
Section: Complete Model (supporting)
confidence: 76%
“…Nevertheless, essential issues of CVA arise from the high computational cost of 3D convolutions, the small receptive field, and the exclusive consideration of a single modality, potentially neglecting other valuable information. Kim et al. (2019) and Kim et al. (2020), on the other hand, consider the cost volume in a multi-modal approach, further extending the number of modalities used. For this purpose, features from RGB images, disparity maps, and cost volumes are combined, forming a tri-modal input.…”
Section: Related Work (mentioning)
confidence: 99%
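The tri-modal input described in this statement lends itself to a compact sketch. The following is a minimal PyTorch illustration of the general idea, fusing features from the RGB image, the disparity map, and the cost volume by concatenation before predicting per-pixel confidence; the module names, channel sizes, and disparity range are assumptions for illustration, not the published LAF-Net architecture.

```python
# Minimal sketch of tri-modal feature fusion for stereo confidence
# estimation: one small encoder per modality, concatenation, then a
# per-pixel confidence head. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class TriModalFusion(nn.Module):
    def __init__(self, max_disp=64, feat_ch=32):
        super().__init__()
        # One lightweight encoder per modality.
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.disp_enc = nn.Sequential(
            nn.Conv2d(1, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        # The cost volume is treated as max_disp input channels.
        self.cost_enc = nn.Sequential(
            nn.Conv2d(max_disp, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        # Fuse by concatenation, then predict confidence in [0, 1].
        self.head = nn.Sequential(
            nn.Conv2d(3 * feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 1, 1), nn.Sigmoid())

    def forward(self, rgb, disp, cost):
        fused = torch.cat([self.rgb_enc(rgb),
                           self.disp_enc(disp),
                           self.cost_enc(cost)], dim=1)
        return self.head(fused)

# Shape check with dummy inputs.
net = TriModalFusion()
conf = net(torch.rand(1, 3, 64, 64),    # RGB image
           torch.rand(1, 1, 64, 64),    # disparity map
           torch.rand(1, 64, 64, 64))   # cost volume (D = 64)
print(conf.shape)  # torch.Size([1, 1, 64, 64])
```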
“…With the rise of deep learning, methods based on convolutional neural networks have been developed. Some of these CNN approaches focused on learning from the disparity map (Poggi et al., 2017), while other methods worked directly with the cost volume in order to take more information into account (Mehltretter and Heipke, 2019; Kim et al., 2019).…”
Section: Related Work (mentioning)
confidence: 99%
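To make the contrast concrete, the disparity-only family of methods mentioned here can be sketched as a small patch-based CNN that predicts the confidence of the centre pixel from a disparity patch alone. This is a hedged illustration of the general idea behind such approaches, with layer and patch sizes chosen for convenience rather than taken from any published model.

```python
# Minimal sketch of a disparity-only confidence CNN: four 3x3
# convolutions without padding shrink a 9x9 disparity patch to a
# single spatial position, from which a confidence score is read out.
import torch
import torch.nn as nn

class DispPatchConfidence(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, 3), nn.ReLU(inplace=True),   # 9x9 -> 7x7
            nn.Conv2d(64, 64, 3), nn.ReLU(inplace=True),  # 7x7 -> 5x5
            nn.Conv2d(64, 64, 3), nn.ReLU(inplace=True),  # 5x5 -> 3x3
            nn.Conv2d(64, 64, 3), nn.ReLU(inplace=True))  # 3x3 -> 1x1
        self.classifier = nn.Sequential(
            nn.Conv2d(64, 1, 1), nn.Sigmoid())

    def forward(self, disp_patch):          # (N, 1, 9, 9)
        return self.classifier(self.features(disp_patch))  # (N, 1, 1, 1)

net = DispPatchConfidence()
print(net(torch.rand(8, 1, 9, 9)).shape)  # torch.Size([8, 1, 1, 1])
```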
“…[26] proposed a novel method that combines contextual information with multi-viewpoint depth images to construct a multi-viewpoint context-aware representation for scene classification. Kim et al. [27] exploited tri-modal information to produce a confidence estimate for the disparity in stereo confidence estimation. However, these methods neglect the spatial relationship among features.…”
Section: B. Feature Fusion (mentioning)
confidence: 99%
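The "locally adaptive" part of LAF-Net's fusion, which the statement above contrasts with spatially aware alternatives, can be sketched as per-pixel weighting of the modality features instead of plain concatenation. The following is a simplified illustration of that general idea only; the attention sub-network and its sizes are assumptions, not the published LAF-Net design.

```python
# Minimal sketch of locally adaptive fusion: a small sub-network
# predicts per-pixel weights over the modalities (softmax across the
# modality axis), and the features are merged as a weighted sum.
import torch
import torch.nn as nn

class LocallyAdaptiveFusion(nn.Module):
    def __init__(self, feat_ch=32, n_modalities=3):
        super().__init__()
        # One weight map per modality, normalized per pixel.
        self.attn = nn.Sequential(
            nn.Conv2d(n_modalities * feat_ch, n_modalities, 3, padding=1),
            nn.Softmax(dim=1))

    def forward(self, feats):               # list of (N, C, H, W) tensors
        stacked = torch.cat(feats, dim=1)   # (N, M*C, H, W)
        w = self.attn(stacked)              # (N, M, H, W)
        fused = sum(w[:, i:i + 1] * f for i, f in enumerate(feats))
        return fused                        # (N, C, H, W)

fusion = LocallyAdaptiveFusion()
feats = [torch.rand(1, 32, 64, 64) for _ in range(3)]
print(fusion(feats).shape)  # torch.Size([1, 32, 64, 64])
```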