2017
DOI: 10.5194/isprs-annals-iv-2-w3-67-2017

End-to-End Depth From Motion With Stabilized Monocular Videos

Abstract: We propose a depth map inference system for monocular videos based on a novel navigation dataset that mimics aerial footage from a gimbal-stabilized monocular camera in rigid scenes. Unlike most navigation datasets, the lack of rotation implies an easier structure-from-motion problem, which can be leveraged for different kinds of tasks such as depth inference and obstacle avoidance. We also propose an architecture for end-to-end depth inference with a fully convolutional network. Results show that a…
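
As a rough sketch of why the absence of rotation makes the structure-from-motion problem easier (a standard instantaneous-motion argument, not an equation taken from the paper): for a pinhole camera of focal length f translating by t = (t_x, t_y, t_z) in a rigid scene, with no rotation, the optical flow at a pixel (x, y) measured from the principal point is

    (u, v) = (1/Z) * (x*t_z - f*t_x, y*t_z - f*t_y)

so the flow depends on the scene only through the inverse depth 1/Z. With a known displacement magnitude, depth can be recovered pointwise from the apparent motion, and the depth-independent rotational flow terms that usually have to be estimated and compensated never appear.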

Cited by 6 publications (8 citation statements); references 37 publications (40 reference statements).

“…In practice, these labels are expensive to obtain and, thus, limit the data quantity and thereby the application of deep learning methods. To cope with the given data limitations, one possibility is to generate artificial datasets [10,19], but the transfer from synthetic datasets to reality is still accompanied by a significant decrease in performance.…”
Section: Related Work
confidence: 99%
“…Their central aim was, similar to ours, to establish a depth network that can incorporate structure from motion in its prediction, instead of relying only on structure from scene geometry as the single-frame approaches do. In practice, [9] beat the basic SfML framework only on the artificial StillBox dataset [10], which shows random shapes and textures in a 3D space. Despite a superior performance on StillBox, the results on the autonomous driving benchmark KITTI [24] were similar to, and in part worse than, the baseline architecture.…”
Section: Related Work
confidence: 99%
“…Our network, called DepthNet, is broadly inspired by FlowNetS [3] (initially used for flow inference). It is described in detail in [18]; we provide here a summary of its structure (Fig 3) and performance. Each convolution (apart from the depth modules) is followed by a Spatial Batch Normalization and a ReLU activation layer.…”
Section: Depth Inference Training
confidence: 99%
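
As a minimal illustration of the repeated unit described in the quotation above (a sketch in PyTorch, not the authors' actual DepthNet code; the channel sizes, strides, and 6-channel stacked frame-pair input are assumptions):

    import torch
    import torch.nn as nn

    def conv_bn_relu(in_ch, out_ch, stride=1):
        # One unit as quoted: convolution, then spatial batch
        # normalization, then ReLU activation.
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    # Hypothetical FlowNetS-like encoder front end: the two RGB frames are
    # stacked into a 6-channel input and progressively downsampled.
    encoder = nn.Sequential(
        conv_bn_relu(6, 32, stride=2),
        conv_bn_relu(32, 64, stride=2),
        conv_bn_relu(64, 128, stride=2),
    )
    features = encoder(torch.randn(1, 6, 128, 256))  # toy frame pair

The "depth modules" mentioned in the quotation would sit in a FlowNetS-style decoder producing multi-scale depth outputs; they are omitted from this sketch.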
“…No preprocessing, such as optical flow computation or visual odometry, is applied to the input, while the depth is directly provided as an output [18]. We created a dataset of image pairs with random translation movements, with no rotation, and a constant displacement magnitude applied during the whole training.…”
Section: Introduction
confidence: 99%
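
A minimal sketch of the kind of camera motion the quotation describes (an assumed sampling scheme; the displacement magnitude value is a placeholder, not a figure from the paper):

    import numpy as np

    def sample_pair_motion(magnitude=0.3, rng=np.random.default_rng(0)):
        # Random translation direction rescaled to a constant displacement
        # magnitude; rotation is kept at identity, as with a
        # gimbal-stabilized camera.
        direction = rng.normal(size=3)
        direction /= np.linalg.norm(direction)
        rotation = np.eye(3)
        translation = magnitude * direction
        return rotation, translation

    R, t = sample_pair_motion()  # pose change between the two frames of a pair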