Abstract—Discriminative language modeling (DLM) is a feature-based approach that is used as an error-correcting step after hypothesis generation in automatic speech recognition (ASR). We formulate this both as a classification and a ranking problem and employ the perceptron, the margin infused relaxed algorithm (MIRA) and the support vector machine (SVM). To decrease training complexity, we try count-based thresholding for feature selection and data sampling from the list of hypotheses. On a Turkish morphology-based feature set we examine the use of first and higher order n-grams and present an extensive analysis of the complexity and accuracy of the models with an emphasis on statistical significance. We find that we can save significantly on computation by feature selection and data sampling, without significant loss in accuracy. Using MIRA or the SVM does not lead to any further improvement over the perceptron, but the use of ranking as opposed to classification leads to a 0.4% reduction in word error rate (WER), which is statistically significant. Index Terms—Discriminative language modeling (DLM), feature selection, data sampling, language modeling, ranking perceptron, ranking support vector machine (SVM), margin infused relaxed algorithm (MIRA), ranking MIRA, speech recognition.
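The reranking setup described above can be sketched as a single perceptron update over an ASR N-best list: score each hypothesis with the current feature weights, and if the top-scoring hypothesis is not the oracle (lowest-WER) one, move the weights toward the oracle's features and away from the prediction's. This is a minimal illustrative sketch, not the paper's implementation; the function names and the toy unigram featurizer are our own (the paper uses Turkish morphology-based n-gram features).

```python
def perceptron_rerank_update(weights, featurize, hyps, oracle_idx, lr=1.0):
    """One update of a reranking perceptron over an N-best list.

    weights: dict mapping feature -> weight (updated in place)
    featurize: function mapping a hypothesis to a dict of feature counts
    hyps: list of candidate hypotheses (e.g. lists of words)
    oracle_idx: index of the lowest-WER hypothesis in hyps
    """
    def score(h):
        return sum(weights.get(f, 0.0) * v for f, v in featurize(h).items())

    # Pick the hypothesis the current model prefers.
    pred_idx = max(range(len(hyps)), key=lambda i: score(hyps[i]))
    if pred_idx != oracle_idx:
        # Promote the oracle's features, demote the prediction's.
        for f, v in featurize(hyps[oracle_idx]).items():
            weights[f] = weights.get(f, 0.0) + lr * v
        for f, v in featurize(hyps[pred_idx]).items():
            weights[f] = weights.get(f, 0.0) - lr * v
    return weights

# Toy unigram featurizer, purely for illustration.
def unigram_feats(words):
    feats = {}
    for w in words:
        feats[w] = feats.get(w, 0) + 1
    return feats

hyps = [["good", "sentence"], ["bad", "sentence"]]
weights = perceptron_rerank_update({}, unigram_feats, hyps, oracle_idx=1)
```

After this single update the model prefers the oracle hypothesis; shared features (here "sentence") cancel out, so only discriminating features receive nonzero weight.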
The objective of this study is to automatically extract annotated sign data from broadcast news recordings for the hearing impaired. These recordings are an excellent source for automatically generating annotated data: in news for the hearing impaired, the speaker also signs with the hands as she talks. In addition, corresponding sliding text is superimposed on the video. The video of the signer can be segmented with the help of either the speech alone or both the speech and the text, generating segmented and annotated sign videos. We call this application Signiary, and aim to use it as a sign dictionary where users enter a word as text and retrieve videos of the related sign. The application can also be used to automatically create annotated sign databases for training recognizers.
We present our work on semi-supervised learning of discriminative language models, where the negative examples for sentences in a text corpus are generated using confusion models for Turkish at various granularities, specifically the word, subword, syllable and phone levels. We experiment with different language models and various sampling strategies to select competing hypotheses for training with a variant of the perceptron algorithm. We find that morph-based confusion models with a sample selection strategy that aims to match the error distribution of the baseline ASR system give the best performance. We also observe that substituting half of the supervised training examples with those obtained in a semi-supervised manner gives similar results.
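The idea of generating negative examples from clean text with a confusion model can be sketched as follows. This is a simplified word-level illustration under our own assumptions (the function name, substitution probability, and confusion table are hypothetical; the paper's best models operate at the morph level and tune sampling to match the baseline ASR error distribution).

```python
import random

def corrupt_sentence(words, confusions, sub_prob=0.2, rng=None):
    """Generate an ASR-like negative example from a clean sentence.

    words: list of words from the text corpus
    confusions: dict mapping a word to a list of confusable alternatives
    sub_prob: probability of substituting a word that has alternatives
    """
    rng = rng or random.Random(0)
    out = []
    for w in words:
        alts = confusions.get(w)
        if alts and rng.random() < sub_prob:
            out.append(rng.choice(alts))  # substitute a confusable word
        else:
            out.append(w)                 # keep the original word
    return out

# With sub_prob=1.0 every confusable word is substituted.
neg = corrupt_sentence(["kedi", "geldi"], {"kedi": ["keti"]}, sub_prob=1.0)
```

Pairs of the clean sentence (positive) and its corrupted variants (negatives) can then be fed to a perceptron-style discriminative trainer without running the recognizer on acoustic data.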
The aim of this paper is to help two people, one hearing-impaired and one visually-impaired, communicate by converting speech to fingerspelling and fingerspelling to speech. Fingerspelling is a subset of sign language that uses finger signs to spell letters of the spoken or written language. We aim to convert fingerspelled words to speech and vice versa. Different spoken and sign languages, such as English, Russian, Turkish and Czech, are considered.
Abstract—GMM supervectors are among the most popular feature sets used in SVM-based text-independent speaker verification systems. Most studies use only a single supervector to represent speaker characteristics against a set of background samples. An alternative is to divide the total training duration into smaller pieces to increase the number of supervectors for training the minority (speaker) class. Similarly, the total test duration can also be partitioned, with the final verification made by majority voting over decisions on the smaller durations. We explore the performance of speaker verification systems in terms of equal error rate (EER) and minimum detection cost function (minDCF) by breaking the input sequence into durations of 4 minutes, 1 minute and 10 seconds. We try different training/test data amounts to investigate the generalizability of this approach. Working on the CSLU Speaker Recognition Dataset, we show that the lowest error rates are obtained when the representative duration of the training supervectors is set equal to that of the test samples.
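The majority-voting step over partitioned test durations can be sketched in a few lines. This is an illustrative sketch only; the function name and the zero decision threshold are our assumptions (in practice each segment's score would come from the trained SVM and the threshold would be calibrated).

```python
def verify_by_voting(segment_scores, threshold=0.0):
    """Accept the claimed speaker if a majority of test segments
    score above the decision threshold.

    segment_scores: one verification score per test-duration segment
    (e.g. SVM decision values for 10-second chunks).
    """
    votes = sum(1 for s in segment_scores if s > threshold)
    return votes > len(segment_scores) / 2

# Three segments, two above threshold: accepted.
accept = verify_by_voting([0.7, -0.1, 0.4])
# Three segments, only one above threshold: rejected.
reject = verify_by_voting([-0.5, -0.9, 0.2])
```

Voting over short segments trades per-segment reliability for more decisions; the abstract's finding is that matching the training supervector duration to the test duration is what keeps the per-segment decisions accurate.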