DOI: 10.1007/978-3-540-68585-2_42

The SRI-ICSI Spring 2007 Meeting and Lecture Recognition System

Abstract: We describe the latest version of the SRI-ICSI meeting and lecture recognition system, as was used in the NIST RT-07 evaluations, highlighting improvements made over the last year. Changes in the acoustic preprocessing include updated beamforming software for processing of multiple distant microphones, and various adjustments to the speech segmenter for close-talking microphones. Acoustic models were improved by the combined use of neuralnet-estimated phone posterior features, discriminative feature transforms…

Cited by 47 publications (28 citation statements)
References 16 publications
“…We also observe the expected results, which have also been earlier observed in the literature [23] [5], that model level adaptation improves performance.…”
Section: Experiments and Results
Citation type: supporting
confidence: 90%
“…In practice, it is common for meeting ASR that a well trained acoustic model is first obtained using clean speech data (conversational telephone speech, broadcast news), which is then adapted by using the meeting speech both from close talking microphone (nearfield) as well as distant microphone speech after enhancing the speech by delay-sum beamforming [5] or superdirective beamforming [7]. This approach has been shown to perform well.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…We used SRI's Decipher (Stolcke et al, 2008) 9 to produce word confusion networks for our 17 meeting sub-corpus and then ran our detectors on the WCNs' best path. Table 6 shows a comparison of F-scores.…”
Section: Robustness To ASR Output
Citation type: mentioning
confidence: 99%
“…These advanced techniques take into account the estimated noise or interfering signal characteristics for superior noise suppression capability [43,44]. In the context of ASR, beamforming techniques have been successfully exploited in the ICSI/SRI [45] and AMIDA [46] systems for transcriptions of meetings [47]. Another research efforts have explored unified multichannel-based speech recognition such as LIMABEAM and multi-channel-based neural networks speech recognizer.…”
Section: Multi-channel Integration In Acoustic Modeling
Citation type: mentioning
confidence: 99%