Missing data speech recognition in reverberant conditions

Palomäki, Brown, Barker

doi:10.1109/icassp.2002.1005676

Cited by 4 publications

(4 citation statements)

References 6 publications

“…Is perception governed by the less distorted, 0.32-m bands in these sounds? This might happen if hearing behaves like a "missing data" speech recogniser, and bases its decisions on the less distorted parts of the signal (Palomäki, Brown and Barker 2002). Clearly, this could not be happening on a word by word basis with the listeners in this experiment, as there would be no effects of distance when only 4 of the test-word's bands are given the 10-m reflection patterns.…”

Section: Ish 2009 (mentioning)

confidence: 91%

Room Reflections and Constancy in Speech-Like Sounds: Within-Band Effects

Watkins

Raimond

Makin

2010

The Neurophysiological Bases of Auditory Perception

The experiment asks whether constancy in hearing precedes or follows grouping. Listeners heard speech-like sounds comprising 8 auditory-filter shaped noise-bands that had temporal envelopes corresponding to those arising in these filters when a speech message is played. The "context" words in the message were "next you'll get _ to click on", into which a "sir" or "stir" test word was inserted. These test words were from an 11-step continuum that was formed by amplitude modulation. Listeners identified the test words appropriately and quite consistently, even though they had the "robotic" quality typical of this type of 8-band speech. The speech-like effects of these sounds appear to be a consequence of auditory grouping. Constancy was assessed by comparing the influence of room reflections on the test word across conditions where the context had either the same level of reflections, or where it had a much lower level. Constancy effects were obtained with these 8-band sounds, but only in "matched" conditions, where the room reflections were in the same bands in both the context and the test word. This was not the case in a comparison "mismatched" condition, and here, no constancy effects were found. It would appear that this type of constancy in hearing precedes the across-channel grouping whose effects are so apparent in these sounds. This result is discussed in terms of the ubiquity of grouping across different levels of representation.

“…Therefore, our future work consists of experimenting with multi-channel acoustic models in the MDT framework. Also, we will investigate the effectiveness of measures against reverberation [16].…”

Section: Discussion (mentioning)

confidence: 99%

Application of noise robust MDT speech recognition on the SPEECON and speechdat-car databases

et al. 2009

“…• It is assumed that the speech signal is disturbed by additive possibly nonstationary background noise. Significant reverberation cannot be handled, though mask estimation methods to handle reverberated speech are described in the literature [18,19]. If the user wants robustness against reverberation, he will need to implement his own mask estimation technique.…”

Section: Extension To a Missing Data Theory Based Recogniser (mentioning)

confidence: 99%

SPRAAK: Speech Processing, Recognition and Automatic Annotation Kit

Wambacq

Demuynck

Compernolle

2012

Essential Speech and Language Technology for Dutch

Over the past years several users (in Belgium, the Netherlands and abroad) have adopted the ESAT speech recognition software package (developed for over 15 years at ESAT, K.U.Leuven, [5, 10]) as they found that it satisfied their research needs better than other available packages. However, as is typical of organically grown software, the learning effort was considerable and documentation was lacking. The software needed a major overhaul, and this was accomplished with support from the STEVIN programme and a partnership consisting of Katholieke Universiteit Leuven, Radboud University Nijmegen, Twente University and TNO. At the same time the main weaknesses were addressed and the code base was modernised. This effort also addressed the need of the research community for a Dutch speech recognition system that can be used by non-specialists. We also found this to be an ideal moment to open up the code to the community at large. It is now distributed as open source for academic usage and at moderate cost for commercial exploitation. The toolkit, its documentation and references can be found at http://www.spraak.org. In this article details of the SPRAAK toolkit are given in several sections: Sect. 6.2 discusses the possible uses of the toolkit, Sect. 6.3 explains the features of the different components of the software, some benchmark results are given in Sect. 6.4, system requirements of the software are mentioned in Sect. 6.5, licensing and distribution are covered in Sect. 6.6, the relation to other STEVIN projects is described in Sect. 6.7, and Sect. 6.8 addresses future work. Finally a conclusion is given in Sect. 6.9. This article tries to give as complete information as possible about SPRAAK and is therefore targeted at several audiences:
