Interspeech 2016
DOI: 10.21437/interspeech.2016-823

Phase-Aware Signal Processing for Automatic Speech Recognition

Abstract: Conventional automatic speech recognition (ASR) often neglects the spectral phase information in its front-end and feature extraction stages. The aim of this paper is to show the impact that enhancement of the noisy spectral phase has on ASR accuracy when dealing with speech signals corrupted with additive noise. Apart from proof-of-concept experiments using clean spectral phase, we also present a phase enhancement method as a phase-aware front-end and modified group delay as a phase-aware feature extractor, an…

Citations: cited by 10 publications (12 citation statements)
References: 34 publications (43 reference statements)

“…Speech enhancement methods have traditionally only dealt with filtering the spectral magnitudes, however many approaches have recently been proposed for jointly enhancing the magnitude and phase spectra [1,8,9,10,11,12,13]. The prevalent method for estimating phase spectra from given magnitudes in speech synthesis is the one proposed by Griffin and Lim [14].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…In addition, the strategies discussed in [8], [11], and [12] can be used to handle roots close to the unit circle. Among them, those that use filter banks in MF stand out.…”
Section: B Considerations on the GDF
Citation type: unclassified
“…In [10], complex cepstra are used as features for text-to-speech applications. In [5] and [8], features built from the derivatives of the phase spectrum in the time and frequency domains, represented by the instantaneous frequency and the group delay (GD), respectively, are used in speech enhancement applications and in ASR systems.…”
Section: Introduction
Citation type: unclassified
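
The statement above describes the group delay as the frequency derivative of the phase spectrum. As a rough illustration only (not the method of the cited paper or of any citing work), the sketch below computes the plain group delay of a short frame without phase unwrapping, using the standard identity tau(k) = (X_R Y_R + X_I Y_I) / |X|^2, where X is the DFT of x[n] and Y is the DFT of n*x[n]. The function names `dft` and `group_delay` and the floor `eps` are assumptions made for this example.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n_fft = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_fft) for n in range(n_fft))
        for k in range(n_fft)
    ]


def group_delay(x, eps=1e-12):
    """Group delay (in samples) at each DFT bin, i.e. -d(phase)/d(omega).

    Uses tau = Re(Y / X) = (X_R*Y_R + X_I*Y_I) / |X|^2 with Y = DFT(n * x[n]),
    which avoids explicit phase unwrapping. `eps` is a hypothetical floor on
    |X|^2 that only guards against division by zero.
    """
    X = dft(x)
    Y = dft([n * v for n, v in enumerate(x)])
    delays = []
    for Xk, Yk in zip(X, Y):
        power = abs(Xk) ** 2
        delays.append((Xk.real * Yk.real + Xk.imag * Yk.imag) / max(power, eps))
    return delays


# A unit impulse delayed by 3 samples has a constant group delay of 3 samples.
delayed_impulse = [0.0] * 8
delayed_impulse[3] = 1.0
gd = group_delay(delayed_impulse)
print([round(v, 6) for v in gd])
```

The division by |X|^2 is where the difficulty with roots near the unit circle comes from: a zero of the signal's z-transform close to the unit circle makes |X|^2 very small at nearby bins, so the plain group delay becomes spiky there. The floor `eps` in this sketch does not address that; it only prevents a division by zero.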
“…Automatic speech recognition (ASR) aims to map an audio signal, containing speech, into a text transcription containing a sequence of words. Basically, the goal is to match the transcription as close as possible to the audio message, with no particular understanding of the meaning or scope of what was spoken [2].…”
Section: Introduction
Citation type: mentioning
confidence: 99%