2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.00122

Audio-Visual Floorplan Reconstruction

Cited by 25 publications (15 citation statements)
References 36 publications

“…Audio-visual scene mapping. To our knowledge, the only prior work to translate audio-visual inputs into a general (arbitrarily shaped) floorplan maps is AV-Floorplan [59]. Unlike AV-Floorplan, our method maps from speech in natural human conversations, which avoids emitting intrusive frequency sweep signals to generate echoes.…”
Section: Related Work (mentioning)
confidence: 99%
“…ping (e.g., visual SLAM) are highly effective when extensive exposure to the environment is possible, in many real-world scenarios only a fraction of the space is observed by the camera. Recent work shows the promise of sensing 3D spaces with both sight and sound [8,14,26,28,59]: listening to echoes bounce around the room can reveal the depth and shape of surrounding surfaces, and even help extrapolate a floorplan beyond the camera's field of view or behind occluded objects [59]. While we are inspired by these advances, they also have certain limitations.…”
Section: Introduction (mentioning)
confidence: 99%
“…Parida et al [7] estimated depth maps using multi-modal data (RGB images, echoes, and materials of objects) from indoor scenes. Purushwalkam et al [23] reconstructed the floor plan of the invisible area using echoes. Batvision [5] used both vision and echoes to train, and in the test phase, they estimated depth using echoes only.…”
Section: Related Work (mentioning)
confidence: 99%
“…Sound Simulation using Machine Learning: Many recent deep learning methods have been proposed for sound synthesis [Hawley et al 2020; Ji et al 2020; Jin et al 2020], scattering effect computation, and sound propagation [Fan et al 2020; Pulkki and Svensson 2019;. Deep learning methods have also been used to compute material properties of a room and acoustic characteristics [Schissler et al 2017; Tang et al 2020a] Other applications that have used acoustic datasets include navigation , floorplan reconstruction [Purushwalkam et al 2021] and depth estimation algorithms [Gao et al 2020].…”
Section: Introduction (mentioning)
confidence: 99%