2010 IEEE International Symposium on Multimedia
DOI: 10.1109/ism.2010.26

Parallelizing Speaker-Attributed Speech Recognition for Meeting Browsing

Abstract: This article presents the Meeting Diarist, an application for browsing meeting recordings by speaker and keyword. The goal of the system is to enable browsing of the content with rich metadata in a graphical user interface shortly after the end of a meeting, even when the application runs on a contemporary laptop. We therefore developed novel parallel methods for speaker diarization and multi-hypothesis speech recognition that are optimized to run on multicore and manycore architectures. T…

Cited by 7 publications (4 citation statements)
References 15 publications
“…Once they are computed all processing tasks take place in the binary domain. Other works in speaker diarization concerned with speed include [28], [29] which achieve faster than real-time processing through the use of several processing tricks applied to a standard bottom-up approach ([28]) or by parallelizing most of the processing in a GPU unit ([29]). The need for efficient diarization systems is emphasized when processing very large databases or when using diarization as a preprocessing step to other speech algorithms.…”
Section: Bottom-up Approach (mentioning)
confidence: 99%
“…(1) where α is the filter coefficient in the range [0.95, 0.98] [17], the pre-emphasized signal is then windowed using Hanning window(s) to improve the spectral representation of the speech vector [18]. Once the speech signal has been windowed and pre-emphasized, the Fast Fourier Transform (FFT) is calculated [15].…”
Section: Speaker Recondition System (mentioning)
confidence: 99%
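
The front-end steps named in the statement above (pre-emphasis, Hanning windowing, Fourier transform) can be sketched as follows. This is an illustrative sketch only, not the citing paper's implementation: Equation (1) is elided in the excerpt, so the common first-order form y[n] = x[n] - α·x[n-1] is assumed, and all function names (`pre_emphasize`, `hann_window`, `dft_magnitudes`, `frame_spectrum`) are hypothetical. A naive DFT stands in for the FFT to keep the example dependency-free; an FFT yields the same magnitudes faster.

```python
import cmath
import math


def pre_emphasize(signal, alpha=0.97):
    """Assumed form of Eq. (1): y[n] = x[n] - alpha * x[n-1].

    alpha is the filter coefficient; the excerpt gives its range as [0.95, 0.98].
    """
    if not signal:
        return []
    out = [signal[0]]  # first sample has no predecessor, so pass it through
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def hann_window(length):
    """Symmetric Hann (Hanning) window of the given length."""
    if length == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def dft_magnitudes(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (stand-in for an FFT)."""
    size = len(frame)
    mags = []
    for k in range(size // 2 + 1):
        acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                  for n in range(size))
        mags.append(abs(acc))
    return mags


def frame_spectrum(frame, alpha=0.97):
    """Pre-emphasize, window, then transform one frame of samples."""
    emphasized = pre_emphasize(frame, alpha)
    window = hann_window(len(emphasized))
    windowed = [s * w for s, w in zip(emphasized, window)]
    return dft_magnitudes(windowed)
```

For a frame of N samples, `frame_spectrum` returns N/2 + 1 magnitude values, one per non-negative frequency bin.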
“…A previous analysis of the diarization engine [28] showed that it is subject to two main computational bottlenecks: the training of the Gaussian Mixture Models, mostly during the merging phase that requires n² comparisons to determine the cluster pair to merge [20], and the Viterbi alignment. In prior work [19], it was shown that for the engine used here, Viterbi alignment can be replaced by a local majority vote without a significant change in accuracy.…”
Section: GMM Training on a GPU (mentioning)
confidence: 99%
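
The two bottlenecks named in the statement above can be illustrated with a small sketch. It is not the cited engine's code: the `score` callback, the non-overlapping segment scheme, and the names `best_merge_pair` and `majority_vote` are assumptions made for illustration. The first function shows why finding the pair to merge costs on the order of n² comparisons (n(n-1)/2 pairs); the second shows one plausible reading of a local majority vote, in which each fixed-length segment takes the label held by most of its frames.

```python
from collections import Counter
from itertools import combinations


def best_merge_pair(clusters, score):
    """Exhaustively score every cluster pair; return the best pair and the
    number of comparisons made (n * (n - 1) / 2 for n clusters).

    `clusters` maps a cluster id to its model; `score` is a hypothetical
    similarity function where a higher value means a better merge candidate.
    """
    best, best_score, comparisons = None, float("-inf"), 0
    for a, b in combinations(sorted(clusters), 2):
        comparisons += 1
        s = score(clusters[a], clusters[b])
        if s > best_score:
            best, best_score = (a, b), s
    return best, comparisons


def majority_vote(frame_labels, segment_len):
    """Assign every frame in each non-overlapping segment the label that
    occurs most often within that segment (assumed voting scheme)."""
    out = []
    for start in range(0, len(frame_labels), segment_len):
        segment = frame_labels[start:start + segment_len]
        winner = Counter(segment).most_common(1)[0][0]
        out.extend([winner] * len(segment))
    return out
```

With four clusters, `best_merge_pair` performs six comparisons; doubling the cluster count roughly quadruples that number, which is why the excerpt singles out the merging phase.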