In this paper, we introduce two new confidence measures for large-vocabulary speech recognition systems. The major feature of these measures is that they can be computed without waiting for the end of the audio stream. We propose two kinds of confidence measures: frame-synchronous and local. The frame-synchronous measures can be computed as soon as a frame is processed by the recognition engine and are based on a likelihood ratio. The local measures estimate a local posterior probability in the vicinity of the word to analyze. We evaluated our confidence measures within the framework of the automatic transcription of French broadcast news, using the equal error rate (EER) criterion. Our local measures achieved results very close to the best state-of-the-art measure (EER of 23% compared to 22.0%). We then conducted a preliminary experiment to assess the contribution of our confidence measures to improving the comprehension of automatic transcriptions for the hearing impaired. We introduced several modalities for highlighting low-confidence words in the transcription and showed that these modalities, used with our local confidence measure, improved comprehension of the automatic transcription.
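As a concrete illustration (not the paper's actual implementation), a frame-synchronous likelihood-ratio confidence can be sketched as the difference between the per-frame log-likelihood of the decoded word's model and that of an unconstrained background model (e.g. a phone loop); all function names below are hypothetical:

```python
import numpy as np

def frame_confidence(frame_loglik, background_loglik):
    """Per-frame log-likelihood-ratio confidence (hypothetical sketch).

    frame_loglik: log-likelihood of each frame under the decoded word's model
    background_loglik: log-likelihood of the same frames under an
    unconstrained background model, used to normalize acoustic scores.
    """
    return np.asarray(frame_loglik) - np.asarray(background_loglik)

def word_confidence(frame_loglik, background_loglik):
    # Average the per-frame ratios over the word's span; the score is
    # available as soon as the word's last frame has been processed,
    # without waiting for the end of the audio stream.
    return float(np.mean(frame_confidence(frame_loglik, background_loglik)))
```

A higher ratio means the word model explains the frames much better than the background model, i.e. higher confidence.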
Bioacoustic event indexing must scale in space (oceans and large forests, multiple sensors) and in species number (thousands). We discuss why time-frequency featuring is inefficient compared to sparse coding (SC) for soundscape analysis. SC is based on the principle that an optimal code should contain enough information to reconstruct the input near regions of high data density, and should not contain enough information to reconstruct inputs in regions of low data density. It has been shown that SC methods can run in real time. We illustrate with an application to humpback whale songs, determining stable components versus evolving ones across seasons and years. By sparse coding at different time scales, the results show that the shortest humpback acoustic codes are the most stable (occurring with similar structure across two consecutive years). Another illustration is given on forest soundscape analysis, where we show that time-frequency atoms allow an easier analysis of forest sound organization, without initial classification of the events. This research is developed within the interdisciplinary CNRS project “Scale Acoustic Biodiversity,” with Univ. of Toulon, Paris Natural History Museum, and Paris 6, and consists of efficient processes for conditioning and representing relevant bioacoustic information, with examples at sabiod.univ-tln.fr.
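A minimal sketch of greedy sparse coding (matching pursuit over a dictionary with unit-norm columns) illustrates how a signal can be represented by a few atoms; this is a generic textbook illustration, not the project's actual codebase:

```python
import numpy as np

def matching_pursuit(x, D, n_atoms):
    """Greedy sparse coding sketch.

    x: input signal (1-D array)
    D: dictionary whose columns are unit-norm atoms
    n_atoms: number of greedy selection steps

    At each step, pick the atom most correlated with the residual and
    subtract its contribution, yielding a sparse code.
    """
    residual = x.astype(float).copy()
    code = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        corr = D.T @ residual            # correlation of atoms with residual
        k = int(np.argmax(np.abs(corr))) # best-matching atom
        code[k] += corr[k]
        residual -= corr[k] * D[:, k]
    return code, residual
```

Signals in high-density regions are reconstructed well with few atoms; signals far from the dictionary's support leave a large residual, matching the density principle stated above.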
This paper is an analysis of adaptation techniques for French acoustic models (hidden Markov models). The LVCSR engine Julius, the Hidden Markov Model Toolkit (HTK), and the K-Fold cross-validation (CV) technique are used together to build three different adaptation methods: Maximum Likelihood a priori (ML), Maximum Likelihood Linear Regression (MLLR), and Maximum a Posteriori (MAP). Experimental results, in terms of word and phoneme error rates, indicate that the best adaptation method depends on the adaptation data, and that acoustic model performance can be improved by the use of phoneme-level alignments and K-Fold CV. The well-known K-Fold CV technique points to the best adaptation technique to follow for each type of data.
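The K-Fold CV selection step can be sketched as follows; `score_fn` is a hypothetical stand-in for adapting models on k-1 folds and scoring the error rate on the held-out fold (the names and interface here are assumptions, not the paper's tooling):

```python
import numpy as np

def kfold_indices(n, k):
    """Split n utterance indices into k interleaved folds (minimal sketch)."""
    idx = np.arange(n)
    return [idx[i::k] for i in range(k)]

def select_adaptation(methods, score_fn, n_utts, k=5):
    """Pick the adaptation method (e.g. 'ML', 'MLLR', 'MAP') with the
    lowest mean error rate across the k held-out folds.

    score_fn(method, test_idx) -> error rate on the held-out utterances,
    after adapting on the remaining folds (hypothetical callback).
    """
    folds = kfold_indices(n_utts, k)
    mean_err = {m: float(np.mean([score_fn(m, f) for f in folds]))
                for m in methods}
    return min(mean_err, key=mean_err.get)
```

Averaging over folds makes the choice less sensitive to any single adaptation subset, which is why the method selected can differ from one data type to another.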