Gellért Sárosi scite author profile

This paper summarizes our recent efforts to automatically transcribe real-life call center conversations, including non-verbal acoustic events. Future call centers, as cognitive infocom systems, must respond automatically not only to well-formed utterances but also to spontaneous and non-word speaker manifestations, and must be robust against sudden noises. Conversational telephone speech transcription is itself a major challenge; we address it primarily on real-life (bank and insurance) tasks. In addition, we introduce several non-word acoustic modeling approaches and their integration into LVCSR (Large Vocabulary Continuous Speech Recognition). The experiments investigate one- and two-channel transcription (client and agent speech merged into one audio stream or kept in two separate streams), cross-task results, and the handling of insufficient transcription data, in parallel with non-verbal acoustic event modeling. On the agent side, a word error rate below 15% could be achieved, and the best relative error rate reduction is 20%, due to the inclusion of various written corpora and to acoustic event handling.


Recognition of Multiple Language Voice Navigation Queries in Traffic Situations

Sárosi, Mozsolics, Tarján et al. 2011



Gellért Sárosi

Comparison of feature extraction methods for speech recognition in noise-free and in traffic noise environment

Improved recognition of Hungarian call center conversations

On modeling non-word events in Large Vocabulary Continuous Speech Recognition

Automated transcription of conversational Call Center speech – with respect to non-verbal acoustic events

Recognition of Multiple Language Voice Navigation Queries in Traffic Situations
