ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2019.8683705
A Highly Adaptive Acoustic Model for Accurate Multi-dialect Speech Recognition

Cited by 28 publications (17 citation statements) · References 11 publications
“…They pre-trained an attention-based encoder-decoder model to disentangle accent-invariant and accent-specific characteristics from acoustic features by adversarial training. Accent-dependent acoustic modeling approaches feed accent-related information into the network architecture through accent embeddings, accent-specific bottleneck features, or i-vectors [20,21]. For a closed set of known accents, accent-dependent models usually outperform accent-independent universal models, while the latter usually provide a better average model when accent labels are unavailable.…”
Section: Related Work
confidence: 99%
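The accent-embedding and i-vector augmentation described in the excerpt above can be illustrated with a minimal sketch: an acoustic model that appends a learned accent embedding to every frame of acoustic features. The class name, layer sizes, and three-layer LSTM below are illustrative assumptions, not details taken from the cited papers.

```python
# Minimal sketch (PyTorch, hypothetical dimensions): an accent-dependent acoustic
# model that concatenates an accent embedding to each acoustic feature frame.
import torch
import torch.nn as nn

class AccentConditionedAM(nn.Module):
    def __init__(self, feat_dim=80, num_accents=8, accent_dim=32,
                 hidden_dim=512, num_targets=4000):
        super().__init__()
        self.accent_emb = nn.Embedding(num_accents, accent_dim)
        self.encoder = nn.LSTM(feat_dim + accent_dim, hidden_dim,
                               num_layers=3, batch_first=True)
        self.output = nn.Linear(hidden_dim, num_targets)

    def forward(self, feats, accent_id):
        # feats: (batch, time, feat_dim); accent_id: (batch,)
        emb = self.accent_emb(accent_id)                      # (batch, accent_dim)
        emb = emb.unsqueeze(1).expand(-1, feats.size(1), -1)  # broadcast over time
        x, _ = self.encoder(torch.cat([feats, emb], dim=-1))
        return self.output(x)                                 # frame-level logits
```

An i-vector variant would simply replace the embedding lookup with a precomputed speaker/accent vector concatenated in the same way.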
“…In both cases, using one-hot dialect codes as an input augmentation (corresponding to bias adaptation) proved to be the best approach, and cluster-adaptive approaches did not result in a consistent gain. These approaches were extended by Yoo et al. [227] and Viglino et al. [223], who both explored the use of dialect embeddings for multi-accent end-to-end speech recognition. Ghorbani et al. [228] used accent-specific teacher-student learning, and Jain et al. [229] explored a mixture-of-experts (MoE) approach, using mixtures of experts at both the phonetic and accent levels.…”
Section: Accent Adaptation
confidence: 99%
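The excerpt's remark that one-hot dialect codes used as input augmentation correspond to bias adaptation follows from simple linear algebra: the weight columns selected by the one-hot code act as a dialect-specific bias added to the shared transform. A small sketch (PyTorch; all dimensions are arbitrary assumptions) makes this concrete.

```python
# Minimal sketch: appending a one-hot dialect code to a linear layer's input is
# equivalent to adding a per-dialect bias, i.e. W [x; d] = W_x x + w_d.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim, num_dialects, hidden = 80, 4, 256
layer = nn.Linear(feat_dim + num_dialects, hidden, bias=False)

x = torch.randn(1, feat_dim)
d = F.one_hot(torch.tensor([2]), num_dialects).float()  # one-hot dialect code

augmented = layer(torch.cat([x, d], dim=-1))

# Same result decomposed: shared transform of x plus a dialect-specific bias.
W_x, W_d = layer.weight[:, :feat_dim], layer.weight[:, feat_dim:]
decomposed = x @ W_x.T + W_d[:, 2]
assert torch.allclose(augmented, decomposed, atol=1e-6)
```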
“…Yoo et al. [227] also applied a method of feature-wise affine transformations on the hidden layers (FiLM), which are dependent on both the network's internal state and the dialect/accent code (discussed in Sec. VI).…”
Section: Accent Adaptation
confidence: 99%
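A FiLM-style conditioning layer of the kind described above can be sketched as follows. This is a generic illustration conditioned only on a dialect embedding (the cited work also conditions on the network's internal state), and all names and sizes are assumptions rather than the authors' exact design.

```python
# Minimal sketch (PyTorch): a dialect code predicts per-channel scale (gamma)
# and shift (beta) that modulate a hidden representation, i.e. a feature-wise
# affine transformation (FiLM) applied to a hidden layer.
import torch
import torch.nn as nn

class DialectFiLM(nn.Module):
    def __init__(self, hidden_dim=512, num_dialects=8, cond_dim=64):
        super().__init__()
        self.dialect_emb = nn.Embedding(num_dialects, cond_dim)
        self.to_gamma = nn.Linear(cond_dim, hidden_dim)
        self.to_beta = nn.Linear(cond_dim, hidden_dim)

    def forward(self, h, dialect_id):
        # h: (batch, time, hidden_dim); dialect_id: (batch,)
        cond = self.dialect_emb(dialect_id)        # (batch, cond_dim)
        gamma = self.to_gamma(cond).unsqueeze(1)   # (batch, 1, hidden_dim)
        beta = self.to_beta(cond).unsqueeze(1)
        return gamma * h + beta                    # feature-wise affine transform
```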
“…Accent-robust ASR systems aim to mitigate the negative effects of non-native speech. A straightforward approach is to build an accent-specific system where accent information, such as i-vectors, accent IDs, or accent embeddings, is explicitly fed into the neural network along with acoustic features [5][6][7][8][9][10]. These approaches typically either adapt a unified model with accent-specific data or build a separate decoder for each accent.…”
Section: Introduction
confidence: 99%
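One of the accent-specific options mentioned above, a shared encoder with a separate decoder or output head per accent, can be sketched as below. The accent names, dimensions, and the class itself are hypothetical and not taken from the cited systems.

```python
# Minimal sketch (PyTorch): a shared acoustic encoder with one output head per
# accent; at inference the head matching the (known or detected) accent is used.
import torch
import torch.nn as nn

class SharedEncoderPerAccentHeads(nn.Module):
    def __init__(self, feat_dim=80, hidden_dim=512, vocab_size=5000,
                 accents=("us", "uk", "in")):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden_dim, num_layers=3, batch_first=True)
        self.heads = nn.ModuleDict(
            {a: nn.Linear(hidden_dim, vocab_size) for a in accents}
        )

    def forward(self, feats, accent):
        x, _ = self.encoder(feats)     # shared acoustic encoding
        return self.heads[accent](x)   # accent-specific output layer
```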