Kohei Yatabe scite author profile

Deep Griffin–Lim Iteration

This paper presents a novel method for reconstructing phase from only a given amplitude spectrogram by combining a signal-processing-based approach and a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One of the popular phase reconstruction methods is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often requires many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. To address these issues, we propose an architecture that stacks sub-blocks, each consisting of two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, so performance can be traded off against computational load according to the requirements of the application. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speech.
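For orientation, the plain Griffin-Lim iteration that the abstract builds on can be sketched as follows. This is a minimal NumPy illustration of classic GLA only, not the proposed DNN architecture; the helper names `stft`, `istft`, and `griffin_lim`, the Hann window, and the hop size are assumptions chosen for the example.

```python
import numpy as np

def stft(x, win, hop):
    """Windowed frames followed by a real FFT; returns (frames, bins)."""
    n = len(win)
    frames = [x[i:i + n] * win for i in range(0, len(x) - n + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def istft(X, win, hop, length):
    """Least-squares inverse STFT (weighted overlap-add)."""
    n = len(win)
    frames = np.fft.irfft(X, n=n, axis=1) * win
    y = np.zeros(length)
    norm = np.zeros(length)
    for k, frame in enumerate(frames):
        y[k * hop:k * hop + n] += frame
        norm[k * hop:k * hop + n] += win ** 2
    # Samples never covered by a nonzero window weight stay at zero.
    return y / np.maximum(norm, 1e-8)

def griffin_lim(amplitude, win, hop, length, n_iter=50, seed=0):
    """Alternate between the two projections, starting from random phase."""
    rng = np.random.default_rng(seed)
    X = amplitude * np.exp(2j * np.pi * rng.random(amplitude.shape))
    for _ in range(n_iter):
        # Projection onto spectrograms that come from an actual signal.
        Y = stft(istft(X, win, hop, length), win, hop)
        # Projection onto spectrograms with the given amplitude.
        X = amplitude * np.exp(1j * np.angle(Y))
    return istft(X, win, hop, length)

# Usage: reconstruct a sinusoid from its amplitude spectrogram alone.
win, hop = np.hanning(256), 64
x = np.sin(2 * np.pi * 0.05 * np.arange(256 + 63 * hop))
y = griffin_lim(np.abs(stft(x, win, hop)), win, hop, len(x))
```

Each pass only enforces consistency and the given amplitude, which is why no prior knowledge of the target signal enters the iteration.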

Speech Enhancement Using Self-Adaptation and Multi-Head Self-Attention

Koizumi, Yatabe, Delcroix, et al. 2020

106

This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test utterance. Conventional studies of deep neural network (DNN)-based speech enhancement mainly focus on building a speaker-independent model. Meanwhile, in speech applications including speech recognition and synthesis, it is known that model adaptation to the target speaker improves accuracy. Our research question is whether a DNN for speech enhancement can be adapted to unknown speakers without any auxiliary guidance signal in the test phase. To achieve this, we adopt multi-task learning of speech enhancement and speaker identification, and use the output of the final hidden layer of the speaker identification branch as an auxiliary feature. In addition, we use multi-head self-attention to capture long-term dependencies in the speech and noise. Experimental results on a public dataset show that our strategy achieves state-of-the-art performance and also outperforms conventional methods in terms of subjective quality.
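The multi-head self-attention mentioned above can be illustrated with a generic scaled dot-product formulation. The sketch below is not the paper's network: the function name `multi_head_self_attention`, the random weight matrices, and the dimensions are all hypothetical and serve only to show how every time frame attends to every other frame.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """x: (frames, d). Returns the output (frames, d) and the
    attention weights (n_heads, frames, frames)."""
    t, d = x.shape
    d_head = d // n_heads

    def split(m):  # (frames, d) -> (n_heads, frames, d_head)
        return m.reshape(t, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    # Every frame scores every other frame, regardless of distance in time.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    context = weights @ v  # (n_heads, frames, d_head)
    merged = context.transpose(1, 0, 2).reshape(t, d)
    return merged @ w_o, weights

# Usage with hypothetical sizes: 10 frames, 8 features, 2 heads.
rng = np.random.default_rng(0)
feats = rng.standard_normal((10, 8))
w_q, w_k, w_v, w_o = (rng.standard_normal((8, 8)) for _ in range(4))
out, attn = multi_head_self_attention(feats, w_q, w_k, w_v, w_o, n_heads=2)
```

Because the attention weights connect all pairs of frames directly, distant frames can influence each other in a single layer, which is the sense in which long-term dependencies are captured.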

Phase Reconstruction Based On Recurrent Phase Unwrapping With Deep Neural Networks

Masuyama, Yatabe, Oikawa, et al. 2020

Phase reconstruction, which estimates phase from a given amplitude spectrogram, is an active research field in acoustical signal processing with many applications including audio synthesis. To take advantage of rich knowledge from data, several studies have presented deep neural network (DNN)-based phase reconstruction methods. However, training a DNN for phase reconstruction is not an easy task because phase is sensitive to shifts of the waveform. To overcome this problem, we propose a DNN-based two-stage phase reconstruction method. In the proposed method, DNNs estimate phase derivatives instead of the phase itself, which allows us to avoid the sensitivity problem. Then, phase is recursively estimated from the estimated derivatives, a procedure named recurrent phase unwrapping (RPU). The experimental results confirm that the proposed method outperformed direct phase estimation by a DNN.
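The second stage, recovering phase recursively from its derivatives, can be illustrated with a deliberately simplified sketch. It integrates only time-direction phase differences frame by frame, and it takes the derivatives from a known phase rather than from a DNN; the actual RPU procedure is not specified in the abstract, so the function names `wrap` and `integrate_phase` and the whole setup are assumptions.

```python
import numpy as np

def wrap(p):
    """Wrap phase values to the principal interval (-pi, pi]."""
    return np.angle(np.exp(1j * p))

def integrate_phase(dphi, phi0):
    """Recursively accumulate time-direction phase differences.

    dphi: (frames - 1, bins) wrapped differences between adjacent frames
    phi0: (bins,) phase of the first frame
    """
    phase = np.empty((dphi.shape[0] + 1, dphi.shape[1]))
    phase[0] = phi0
    for t in range(dphi.shape[0]):
        # Each frame depends only on the previous frame and the derivative.
        phase[t + 1] = phase[t] + dphi[t]
    return phase

# Usage: stand-in "estimated" derivatives computed from a known phase.
rng = np.random.default_rng(0)
true_phase = rng.uniform(-np.pi, np.pi, size=(20, 5))
dphi = wrap(np.diff(true_phase, axis=0))
recovered = integrate_phase(dphi, true_phase[0])
# The result matches the original phase up to multiples of 2*pi.
same = np.allclose(np.exp(1j * recovered), np.exp(1j * true_phase))
```

Wrapped differences stay bounded even when the absolute phase does not, which hints at why derivatives are an easier regression target than the phase itself.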

Griffin–Lim Like Phase Recovery via Alternating Direction Method of Multipliers

Masuyama, Yatabe, Oikawa 2019, IEEE Signal Process. Lett.

Recovering a signal from its amplitude spectrogram, or phase recovery, has many applications in acoustic signal processing. When only an amplitude spectrogram is available and no explicit information is given for the phases, the Griffin-Lim algorithm (GLA) is one of the most widely used methods for phase recovery. However, GLA often requires many iterations and results in low perceptual quality in some cases. In this letter, we propose two novel algorithms based on GLA and the alternating direction method of multipliers (ADMM) for better recovery with fewer iterations. Some interpretations of the existing methods and their relation to the proposed methods are also provided. Evaluations are performed with both objective measures and a subjective test.

High-speed imaging of sound using parallel phase-shifting interferometry

et al. 2016

Sound-field imaging, the visualization of spatial and temporal distribution of acoustical properties such as sound pressure, is useful for understanding acoustical phenomena. This study investigated the use of parallel phase-shifting interferometry (PPSI) with a high-speed polarization camera for imaging a sound field, particularly high-speed imaging of propagating sound waves. The experimental results showed that the instantaneous sound field, which was generated by ultrasonic transducers driven by a pure tone of 40 kHz, was quantitatively imaged. Hence, PPSI can be used in acoustical applications requiring spatial information of sound pressure.
