“…The methods for boundary detection can be based on using bidirectional LSTM networks, 39,40 wavelet analysis, 42‐44 graph-based structural analysis, 45 rules describing the power spectrum 46 or formants 47 and various features extracted from the spectrogram, for example, visual features 48,49 or auditory attention features. 50 The methods for boundary detection also have a relevant application in the task of segmentation with orthographic or phonetic transcription provided, where they can be used as additional boundary correction procedures.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
This article describes experiments on speech segmentation using long short‐term memory recurrent neural networks. The main part of the paper deals with multi‐lingual and cross‐lingual segmentation, that is, segmentation performed on a language different from the one on which the model was trained. The experimental data involves large Czech, English, German, and Russian speech corpora designated for speech synthesis. For optimal multi‐lingual modeling, a compact phonetic alphabet was proposed by sharing and clustering phones of particular languages. Many experiments were performed exploring various experimental conditions and data combinations. We proposed a simple procedure that iteratively adapts the inaccurate default model to the new voice/language. The segmentation accuracy was evaluated by comparison with reference segmentation created by a well‐tuned hidden Markov model‐based framework with additional manual corrections. The resulting segmentation was also employed in a unit selection text‐to‐speech system. The quality of the generated speech was compared with that obtained using the reference segmentation in a preference listening test.
“…This model is often called a biometric signature. With a view to propose a new approach aiming to differently extract iris parameters, our efforts gave rise to a contribution that consisted of defining a model of the iris represented by well-selected coefficients of Meyer wavelet transform [34]. Following several analyses, we noticed that multiscale Meyer wavelets presented undeniable results.…”
Section: Iris-texture Analysis and Biometric-signature Extraction Bas…
Citation type: mentioning
Current research in biometrics aims to develop high-performance tools that make it possible to better extract the traits specific to each individual and to grasp their discriminating characteristics. This research is based on high-level analyses of images captured from the candidate to be identified, for a better understanding and interpretation of these signals. Several biometric identification systems exist. Recognition systems based on the iris have many advantages and are among the most reliable. In this paper, we propose a new approach to biometric iris authentication. The proposed scheme calculates a three-dimensional head pose in order to capture a good iris image from a video sequence, since the quality of this image affects the identification results. From this image, we were able to locate the iris and analyse its texture using Meyer wavelets. Our approach was evaluated and validated on two databases, CASIA Iris Distance and MiraclHB. The comparative study showed its effectiveness compared to approaches in the literature.