In this work, an acoustic feature set based on a Gammatone filterbank is introduced for large vocabulary speech recognition. The Gammatone features presented here lead to competitive results on the EPPS English task, and considerable improvements were obtained by subsequent combination to a number of standard acoustic features, i.e. MFCC, PLP, MF-PLP, and VTLN plus voicedness. Best results were obtained when combining Gammatone features to all other features using weighted ROVER, resulting in a relative improvement of about 12% in word error rate compared to the best single feature system. We also found that ROVER gives better results for feature combination than both log-linear model combination and LDA.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.