ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp39728.2021.9414042
On the Predictability of HRTFs from Ear Shapes Using Deep Networks

Abstract: Head-Related Transfer Function (HRTF) individualization is critical for immersive and realistic spatial audio rendering in augmented/virtual reality. Neither measurements nor simulations using 3D scans of head/ear are scalable for practical applications. More efficient machine learning approaches are being explored recently, to predict HRTFs from ear images or anthropometric features. However, it is not yet clear whether such models can provide an alternative for direct measurements or high-fidelity simulation…

Cited by 10 publications (3 citation statements)
References 17 publications
“…The first architecture used is a modified version of CNN-Reg [2], representing a CNN with skip connections. This architecture features four consecutive stages and was originally applied to predict the single-frequency gammatone-filtered HRTF log-amplitude spectrum jointly for 360 directions from voxelised ear meshes.…”
Section: Network Architectures
confidence: 99%
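The CNN-Reg description above (four consecutive stages with skip connections, jointly regressing a 360-direction log-amplitude spectrum from voxelised ear meshes) can be sketched as follows. This is an illustrative caricature only: the dense stages, feature dimension, and weight scales are stand-ins, not the published architecture, which operates with convolutions on voxel grids.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def skip_stage(x, W):
    """One stage: linear map + ReLU, with an additive skip connection."""
    return relu(W @ x) + x


def cnn_reg_sketch(x, stage_weights, head):
    """Run four skip-connected stages, then regress one value per direction."""
    h = x
    for W in stage_weights:
        h = skip_stage(h, W)
    return head @ h  # shape (360,): log-amplitude, one value per direction


D = 16  # hypothetical feature dimension of the voxelised-ear encoding
stage_weights = [rng.normal(scale=0.1, size=(D, D)) for _ in range(4)]
head = rng.normal(scale=0.1, size=(360, D))

x = rng.normal(size=D)  # stand-in for features of one voxelised ear mesh
y = cnn_reg_sketch(x, stage_weights, head)
```

The additive skip in `skip_stage` is the key structural point: each stage refines its input rather than replacing it, which eases gradient flow through the four-stage stack.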
“…Personalised audio is more than ever in demand both by industrial and academic research to achieve a high level of plausibility when binaurally reproducing virtual acoustic environments. Taking advantage of the rapid progress in the field of machine learning, deep neural networks (DNNs) have been extensively applied for the personalisation of HRTFs [1,2]. Such DNNs can, for example, be trained to directly predict the log-magnitude spectrum of HRTFs from edge-detected single-view ear images and individual anthropometric features [3].…”
Section: Introduction
confidence: 99%
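The statement above mentions edge-detected single-view ear images as network input [3]. A minimal sketch of such an edge-detection front end, assuming a plain 3×3 Sobel gradient-magnitude filter (the cited paper's exact preprocessing may differ), is:

```python
import numpy as np


def sobel_edges(img):
    """Gradient-magnitude edge map via 3x3 Sobel kernels (valid convolution)."""
    kx = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx = np.sum(patch * kx)  # horizontal gradient
            gy = np.sum(patch * ky)  # vertical gradient
            out[i, j] = np.hypot(gx, gy)
    return out


# Toy "ear image": a vertical intensity step stands in for an ear contour.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
```

The filter responds strongly along the intensity step and is zero in flat regions, which is why edge maps make a compact shape representation for downstream regression.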
“…Recently, data-driven methods for binaural synthesis have been widely studied [16,17,18,19]. These methods inherently require a sufficiently large number of data points to estimate the distribution of interest.…”
Section: Introduction
confidence: 99%