2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
DOI: 10.1109/asru.2015.7404833

The MERL/SRI system for the 3RD CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition

Cited by 35 publications (45 citation statements)
References 16 publications
“…Concerning other parts of the decoder, Hori et al (2015) reported consistent improvements on real and simulated data by replacing the default 3-gram language model used in the baseline by a 5-gram language model with Kneser-Ney (KN) smoothing (Kneser and Ney, 1995), rescoring the lattice using a recurrent neural network language model (RNN-LM) (Mikolov et al, 2010), and fusing the outputs of multiple systems using MBR. This claim also holds true for system combination based on recognizer output voting error reduction (ROVER) (Fiscus, 1997), as reported by Fujita et al (2015).…”
Section: Language Modeling and ROVER Fusion (mentioning)
confidence: 99%
“…In the following, we do not discuss DNN post-filters, which provided a limited improvement or degradation on both real and simulated data (Hori et al, 2015; Sivasankaran et al, 2015), and we focus on multichannel DNN-based enhancement instead. Table 5 illustrates the performance of the DNN-based time-invariant generalized eigenvalue (GEV) beamformer proposed by Heymann et al (2015).…”
Section: DNN-based Beamforming and Separation (mentioning)
confidence: 99%
“…Heymann et al (2015) employ a DNN to perform the necessary speech and noise covariance estimates. Other teams have employed a conventional delay and sum beamformer (e.g., Sivasankaran et al, 2015; Hori et al, 2015; Prudnikov et al, 2015). Of these, several reported that the freely available BeamformIt tool developed by Anguera et al (2007) worked very effectively.…”
Section: Target Enhancement (mentioning)
confidence: 99%