AUDIMUS.MEDIA: A Broadcast News Speech Recognition System for the European Portuguese Language

Meinedo, Hugo; Caseiro, Diamantino; Neto, João Paulo; Trancoso, Isabel

doi:10.1007/3-540-45011-4_2

Cited by 58 publications

(28 citation statements)

References 5 publications

“…The baseline LM presents, over the test set (March11-B), an average perplexity (PP) of 122, an OOV word rate of 1.29% and a WER of 28.1%. The WER is higher than the normal evaluation test sets [4].…”

Section: New Datasets (mentioning)

confidence: 73%
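
The three figures quoted above (perplexity, OOV word rate, WER) follow standard definitions, which the minimal sketch below illustrates. It is not the citing authors' scoring tool; the function names `perplexity`, `oov_rate` and `word_error_rate` are hypothetical, and whitespace tokenization is an assumption made for brevity.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-word natural-log probabilities assigned by an LM."""
    if not log_probs:
        raise ValueError("need at least one word probability")
    return math.exp(-sum(log_probs) / len(log_probs))


def oov_rate(tokens, vocabulary):
    """Fraction of running words that are missing from the recognizer vocabulary."""
    if not tokens:
        raise ValueError("need at least one token")
    return sum(1 for t in tokens if t not in vocabulary) / len(tokens)


def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words (row-by-row Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Real evaluations normally apply text normalization before scoring, which this sketch omits.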

“…For the work presented in this paper, we used the system reported in [4]. This European Portuguese broadcast news transcription system features a hybrid HMM/MLP system, using three MLPs, each of them associated with a different feature extraction process, where the MLPs are used to estimate the context independent posterior phone probabilities given the acoustic data at each frame.…”

Section: Broadcast News Transcription System (mentioning)

confidence: 99%
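
In a hybrid HMM/MLP recognizer, the MLP phone posteriors are usually divided by the phone priors to obtain scaled likelihoods for HMM decoding. The sketch below illustrates that step for several feature streams. The quoted passage does not say how the three MLP outputs are merged, so the log-domain averaging in `combine_streams` is an assumption, and both function names are hypothetical.

```python
import math

FLOOR = 1e-10  # assumed floor to avoid log(0); not taken from the source


def combine_streams(posteriors_per_stream):
    """Merge per-stream phone posteriors for one frame.

    Assumption: streams are averaged in the log domain (a geometric mean)
    and renormalized so the result sums to one.
    """
    n_streams = len(posteriors_per_stream)
    n_phones = len(posteriors_per_stream[0])
    log_avg = [
        sum(math.log(max(stream[k], FLOOR)) for stream in posteriors_per_stream)
        / n_streams
        for k in range(n_phones)
    ]
    peak = max(log_avg)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - peak) for v in log_avg]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def scaled_log_likelihoods(posteriors, priors):
    """log P(phone | frame) - log P(phone), the usual hybrid emission score."""
    return [
        math.log(max(p, FLOOR)) - math.log(max(q, FLOOR))
        for p, q in zip(posteriors, priors)
    ]
```

For example, `scaled_log_likelihoods(combine_streams(frame_streams), priors)` would give the emission scores for one frame, where `frame_streams` holds one posterior vector per MLP and `priors` are relative phone frequencies estimated from training alignments.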

Dynamic Vocabulary Adaptation for a daily and real-time Broadcast News Transcription System

Martins

Teixeira

Neto

2006

2006 IEEE Spoken Language Technology Workshop

The daily and real-time transcription of Broadcast News (BN) is a challenging task in both acoustic and language modeling. To achieve optimal performance, several problems have to be overcome. In particular, when transcribing BN data in highly inflected languages, vocabulary growth leads to high OOV word rates. To address this problem, we propose a daily vocabulary and LM adaptation framework which directly extracts new words from contemporary written news available on the Internet, using some linguistic knowledge about the words found in that news. Experiments have been carried out for a European Portuguese BN transcription system. Preliminary results, computed on 7 shows, yield a relative reduction of 61% in OOV word rate and 2.1% in WER.

“…The phone recognizer is part of the AUDIMUS system [11], a hybrid recognizer that combines the temporal modeling capabilities of hidden Markov models with the pattern discriminative classification abilities of multi-layer Perceptrons. This phonetic decoding is applied to all the languages in the training database, resulting in Portuguese-phones sequences which are then modeled for each language by n-grams, using the SRI language modeling toolkit [12].…”

Section: PRLM System (mentioning)

confidence: 99%
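
The phonotactic approach described in the passage above decodes every utterance with a single phone recognizer, trains one n-gram model per language on the resulting phone strings, and then picks the language whose model scores a test string highest. The sketch below illustrates the idea with an add-one smoothed bigram model in plain Python. It stands in for, and does not reproduce, the SRI toolkit models used by the citing authors; all function names and the toy data are hypothetical.

```python
import math
from collections import Counter

START = "<s>"  # sentence-start symbol, an assumption of this sketch


def train_bigram(sequences, phone_set):
    """Count phone bigrams over the decoded training sequences of one language."""
    bigrams, contexts = Counter(), Counter()
    for seq in sequences:
        padded = [START] + list(seq)
        for prev, curr in zip(padded, padded[1:]):
            bigrams[(prev, curr)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(phone_set)


def log_prob(seq, model):
    """Add-one smoothed bigram log-probability of a phone sequence."""
    bigrams, contexts, n_phones = model
    padded = [START] + list(seq)
    return sum(
        math.log((bigrams[(prev, curr)] + 1) / (contexts[prev] + n_phones))
        for prev, curr in zip(padded, padded[1:])
    )


def identify(seq, models):
    """Return the language whose phonotactic model scores the sequence highest."""
    return max(models, key=lambda lang: log_prob(seq, models[lang]))


# Toy usage with two made-up "languages" over a two-phone inventory.
phones = {"a", "b"}
models = {
    "X": train_bigram([["a", "b", "a", "b"]], phones),
    "Y": train_bigram([["a", "a", "b", "b"]], phones),
}
best = identify(["a", "b", "a", "b"], models)
```

Higher-order n-grams and better smoothing would normally be used in practice; the bigram with add-one smoothing is chosen here only to keep the example short.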

Portuguese variety identification on broadcast news

Rouas

Trancoso

Viana

2008

2008 IEEE International Conference on Acoustics, Speech and Signal Processing

Self Cite

This paper describes an accent identification system for Portuguese that explores different types of properties: acoustic, phonotactic and prosodic. The system is designed to be used as a pre-processing module for the Portuguese Automatic Speech Recognition system developed at INESC-ID. In terms of variety identification, the overall rate of correct identification is 69.0% if all 7 varieties are considered, and the best results are obtained for Brazilian Portuguese, which is also the variety that proved easiest to identify in perceptual experiments. When distinguishing between European, Brazilian and African Portuguese, the identification rate goes up to 94.7%. The fact that the prosodic system alone can achieve an identification rate of 77% is also worth investigating.

“…There are 4 main blocks in this diagram: the ASR, the TTS, the FACE and the TM. The ASR is based on Audimus [5], a hybrid speech recognizer that combines the temporal modeling capabilities of Hidden Markov Models (HMMs) with the pattern discriminative classification capabilities of multilayer perceptrons (MLPs). This same recognizer is being used for different complexity tasks based on a common structure but with different components.…”

Section: Our System (mentioning)

confidence: 99%

“…This means that to control the devices the user has to start by the keyword "Ambrósio". The acoustic models of our Audimus [5] system are speaker independent.…”

Section: ASR Configuration (mentioning)

confidence: 99%

Design of a Multimodal Input Interface for a Dialogue System

Neto

Cassaca

Viveiros

et al. 2006

Lecture Notes in Computer Science

Abstract. In this paper we describe our initial work on the development of an embodied conversational agent platform. At the present stage our main focus is on the development of a multimodal input interface to the system. We present an Input and Output Manager block that combines speech, a synthetic talking face, text and graphical interfaces. The system supports speech input through an ASR and speech output through a TTS, synchronized with an animated face. The graphical and text inputs are fed through a Text Manager, which is a constituent component of the Input and Output Manager block. All the blocks are tailored for the European Portuguese language. The system is analyzed in the framework of the project Interactive Home of the Future.
