2016
DOI: 10.1016/j.apacoust.2016.03.016

Stereophonic channel decorrelation using a binaural masking model

Cited by 3 publications (6 citation statements)
References 34 publications
“…The RIRs are generated using the image method [23]. Three sizes of near-end and far-end rooms are selected: [4,3,3] m, [6,4,3] m, and [8,7,3] m. The reverberation time is set to 0.3 s, 0.6 s, and 0.9 s. The distances between the loudspeakers and between the microphones are set to 2.0 m and 0.4 m, respectively. The distance between each speaker position and the center of microphones is set to be [0.3, 0.7, 1.1]0.7 m. The near-end speech is mixed with the echo signals at a signal-to-echo ratio (SER) randomly chosen from [0, 5, 10, 15] dB.…”
Section: Methods (mentioning)
confidence: 99%
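The SER mixing step in the statement above can be illustrated with a minimal sketch. It assumes SER is defined as the ratio of near-end speech power to echo power in dB; the function names `mix_at_ser` and `measure_ser_db` are hypothetical and do not come from the cited paper, and the signals are synthetic placeholders rather than simulated room responses.

```python
import math
import random


def measure_ser_db(speech, echo):
    """Signal-to-echo ratio in dB: 10*log10(speech power / echo power)."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_echo = sum(e * e for e in echo) / len(echo)
    return 10.0 * math.log10(p_speech / p_echo)


def mix_at_ser(speech, echo, target_ser_db):
    """Scale `echo` so the speech-to-echo ratio equals `target_ser_db`, then mix.

    Assumption: both signals have the same length and non-zero power.
    Returns (mixture, scaled_echo).
    """
    p_speech = sum(s * s for s in speech) / len(speech)
    p_echo = sum(e * e for e in echo) / len(echo)
    # Solve 10*log10(p_speech / (gain**2 * p_echo)) == target_ser_db for gain.
    gain = math.sqrt(p_speech / (p_echo * 10.0 ** (target_ser_db / 10.0)))
    scaled_echo = [gain * e for e in echo]
    mixture = [s + e for s, e in zip(speech, scaled_echo)]
    return mixture, scaled_echo


# Synthetic stand-ins for near-end speech and the echo picked up by a microphone.
speech = [math.sin(0.1 * n) for n in range(1000)]
echo = [0.5 * math.cos(0.3 * n) for n in range(1000)]

# Draw the SER from the set quoted in the statement.
target = random.choice([0, 5, 10, 15])
mixture, scaled_echo = mix_at_ser(speech, echo, target)
print(target, round(measure_ser_db(speech, scaled_echo), 6))
```

Scaling the echo rather than the speech keeps the near-end level fixed across SER conditions; the cited work may have chosen differently.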
“…Then the 320-point STFT is applied, leading to a 161-dimensional spectral feature in each frame. The encoders of the three modules have five layers, with [8,16,32,64,128] channels for each layer in turn. Accordingly, the numbers of channels of the decoders are [64,32,16,8,1].…”
Section: Methods (mentioning)
confidence: 99%
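The 161-dimensional feature quoted above follows from the one-sided spectrum of a real-valued frame: an N-point transform has N/2 + 1 unique frequency bins. A small sketch of that arithmetic, with the hypothetical helper name `onesided_bins` and the channel lists copied from the statement:

```python
def onesided_bins(n_fft):
    """Number of unique frequency bins in the DFT of a real-valued frame."""
    return n_fft // 2 + 1


# Channel counts as quoted in the citation statement.
encoder_channels = [8, 16, 32, 64, 128]
decoder_channels = [64, 32, 16, 8, 1]

print(onesided_bins(320))  # 161
# The decoder mirrors the encoder: drop the deepest layer, reverse, and end
# in a single output channel.
print(decoder_channels[:-1] == encoder_channels[-2::-1])  # True
```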
“…Such a phenomenon is called acoustic masking [45]. A high-SPL sound, called the masker, makes a low-SPL sound, called the maskee, inaudible.…”
Section: Principles Of Psychoacoustic Masking (mentioning)
confidence: 99%
“…ITD is more useful for low-frequency sounds. On average, maximum ITD is about 690 μs, when the sound source is close to one ear [45]. However, human heads are not the same size.…”
Section: Binaural Algorithms (mentioning)
confidence: 99%
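The dependence of the maximum ITD on head size can be sketched with the Woodworth spherical-head approximation, ITD = (r / c)(θ + sin θ) for a source at azimuth θ between 0 and π/2. This model is an illustration and is not taken from the cited paper; the head radius of 0.0875 m and the sound speed of 343 m/s are assumed typical values, and `woodworth_itd` is a hypothetical name.

```python
import math


def woodworth_itd(azimuth_rad, head_radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference in seconds for a rigid spherical head.

    Woodworth approximation, valid for azimuths between 0 and pi/2.
    The default radius and sound speed are assumptions, not measured values.
    """
    return head_radius_m / speed_of_sound * (azimuth_rad + math.sin(azimuth_rad))


# Source directly to one side (azimuth of 90 degrees) gives the maximum ITD.
max_itd = woodworth_itd(math.pi / 2)
print(round(max_itd * 1e6))  # roughly 656 (microseconds)

# A larger head radius gives a proportionally larger maximum ITD.
print(woodworth_itd(math.pi / 2, head_radius_m=0.095) > max_itd)  # True
```

Under these assumptions the maximum ITD is a few hundred microseconds, and it scales linearly with head radius.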