2021
DOI: 10.48550/arxiv.2109.02763
Preprint

Binaural SoundNet: Predicting Semantics, Depth and Motion with Binaural Sounds

Abstract: Humans can robustly recognize and localize objects using visual and/or auditory cues. While machines can already do the same with visual data, less work has been done with sounds. This work develops an approach to scene understanding based purely on binaural sounds. The tasks considered are predicting the semantic masks of sound-making objects, the motion of sound-making objects, and the depth map of the scene. To this end, we propose a novel sensor setup and record a new audio-visual dataset o…

Cited by 0 publications
References 61 publications (107 reference statements)