2017 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2017.176
Learned Multi-patch Similarity

Abstract: Estimating a depth map from multiple views of a scene is a fundamental task in computer vision. As soon as more than two viewpoints are available, one faces the very basic question of how to measure similarity across >2 image patches. Surprisingly, no direct solution exists; instead it is common to fall back to more or less robust averaging of two-view similarities. Encouraged by the success of machine learning, and in particular convolutional neural networks, we propose to learn a matching function which directl…

Cited by 107 publications (93 citation statements)
References 26 publications (43 reference statements)
“…Here, we have tested straight-forward, handcrafted averaging and voting schemes. It may however be interesting to also learn the combination, or even to explore an "early combination" where a multi-way similarity [20] is computed from a test example to a set of multiple exemplars.…”
Section: Results
confidence: 99%
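The fallback the excerpt alludes to, robust averaging of two-view similarities, can be made concrete with a toy numpy sketch. ZNCC as the pairwise score and the patch sizes are assumptions for illustration, not details from the cited work:

```python
import numpy as np

def zncc(a, b, eps=1e-8):
    # Zero-mean normalized cross-correlation between two patches
    # (equals the Pearson correlation of their pixel values).
    a = (a - a.mean()) / (a.std() + eps)
    b = (b - b.mean()) / (b.std() + eps)
    return float((a * b).mean())

def averaged_two_view_similarity(ref, others):
    # The "more or less robust averaging" baseline: score the
    # reference patch against each other view, then average.
    return float(np.mean([zncc(ref, p) for p in others]))

rng = np.random.default_rng(0)
ref = rng.standard_normal((11, 11))
# Simulated matching views: the reference patch plus mild noise.
views = [ref + 0.1 * rng.standard_normal((11, 11)) for _ in range(3)]
score = averaged_two_view_similarity(ref, views)
```

A learned multi-way similarity would instead consume all patches jointly rather than reducing them to pairwise scores first.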
“…After the advent of modern convolutional networks, the same idea was applied to raw images, e.g., [12,13,14,15,16]. Siamese convolutional branches independently transform two (or more) images A and B into high-level representations that are then merged and transformed further into a learned measure F (A, B) of similarity.…”
Section: Related Work
confidence: 99%
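The Siamese pattern the excerpt describes — shared-weight branches producing embeddings that a merge layer turns into a score F(A, B) — can be sketched with untrained random weights. All sizes and the tanh/linear layers are assumptions, not the architecture of any cited paper:

```python
import numpy as np

rng = rng = np.random.default_rng(1)
W = rng.standard_normal((8, 25)) * 0.1   # branch weights (hypothetical sizes)
V = rng.standard_normal(16) * 0.1        # weights of the merge/decision layer

def branch(patch):
    # Shared ("Siamese") branch: the SAME weights W transform every input,
    # so both images are embedded into the same representation space.
    return np.tanh(W @ patch.ravel())

def similarity(A, B):
    # F(A, B): concatenate the two branch embeddings and score them
    # with a learned (here: random) linear layer.
    merged = np.concatenate([branch(A), branch(B)])
    return float(V @ merged)

A = rng.standard_normal((5, 5))
s_same = similarity(A, A)
s_diff = similarity(A, rng.standard_normal((5, 5)))
```

Training would fit W and V so that F is high for corresponding patches and low otherwise; the point here is only the weight sharing between branches.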
“…The simplest representations for 3D reconstruction from one or more images are 2.5D depth maps, as they can be inferred using standard 2D convolutional neural networks [14,18,24,43]. Since depth maps are view-based, these methods require additional post-processing algorithms to fuse information from multiple viewpoints in order to capture the entire object geometry.…”
Section: 3D Reconstruction
confidence: 99%
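The first step of the fusion post-processing mentioned above is lifting each view-based depth map into 3D. A minimal sketch using the standard pinhole back-projection X = Z · K⁻¹ [u, v, 1]ᵀ; the intrinsics and the constant-depth map are toy values, not data from any cited work:

```python
import numpy as np

def backproject(depth, K):
    # Lift a 2.5D depth map to 3D points in the camera frame:
    # each pixel (u, v) with depth Z maps to Z * K^{-1} [u, v, 1]^T.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    rays = np.linalg.inv(K) @ pix          # (3, h*w) viewing rays
    return (rays * depth.reshape(-1)).T    # scale each ray by its depth

K = np.array([[50.0, 0.0, 16.0],           # toy pinhole intrinsics (assumed)
              [0.0, 50.0, 12.0],
              [0.0,  0.0,  1.0]])
depth = np.full((24, 32), 2.0)             # fronto-parallel plane at Z = 2
pts = backproject(depth, K)
```

Fusing several views then amounts to transforming each such point cloud into a common world frame with the camera poses and merging them, which is exactly the extra machinery that voxel or implicit representations avoid.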
“…This allows us to state the following: if the first primitive exists, it will be the one closest to the target point x_i; if the first primitive does not exist and the second does, then the second primitive is closest to x_i, and so forth. More formally, this property can be stated as follows. A 3D vector r(η, ω) defines a closed surface in space as η (the latitude angle) and ω (the longitude angle) vary over the given intervals (Eq. 14). The rigid-body transformation T_m(x) maps a point from the world coordinate system to the local coordinate system of the m-th primitive.…”
Section: B. Derivation of Pointcloud-to-Primitive Loss
confidence: 99%
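For the simplest closed surface of this kind, a sphere, the parameterization r(η, ω) by latitude and longitude can be written out directly. This is an illustrative special case only; the cited work uses a more general primitive family (superquadrics raise the trigonometric terms to shape-dependent exponents):

```python
import numpy as np

def sphere_surface(eta, omega, R=1.0):
    # r(eta, omega) for a sphere of radius R: eta is the latitude angle,
    # omega the longitude angle; sweeping both traces the closed surface.
    return np.array([R * np.cos(eta) * np.cos(omega),
                     R * np.cos(eta) * np.sin(omega),
                     R * np.sin(eta)])

etas = np.linspace(-np.pi / 2, np.pi / 2, 7)     # latitude samples
omegas = np.linspace(-np.pi, np.pi, 13)          # longitude samples
pts = np.array([sphere_surface(e, o) for e in etas for o in omegas])
```

Every sampled point lies at distance R from the origin, confirming that the (η, ω) sweep stays on the closed surface.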