2012
DOI: 10.2478/v10175-012-0041-6

Characteristics of the use of coupled hidden Markov models for audio-visual Polish speech recognition

Abstract: This paper focuses on combining audio-visual signals for Polish speech recognition in conditions of the highly disturbed audio speech signal. Recognition of audio-visual speech was based on combined hidden Markov models (CHMM). The described methods were developed for a single isolated command, nevertheless their effectiveness indicated that they would also work similarly in continuous audiovisual speech recognition. The problem of a visual speech analysis is very difficult and computationally d…

Cited by 8 publications (8 citation statements)
References 12 publications
“…To encode the sounds using the RGB image, the MFCC coefficients [23,24] for the R component were applied, the time characteristic was used for the G component, and the signal source was used for the B component. Convolution of the three components R, G and B is performed in parallel.…”
Section: Time Frequency Convolution (mentioning)
confidence: 99%
“…It is based on searching face elements which have significant value variations in the level of pixel brightness in small areas (so-called gradient method) [23][24][25]. There may be observed significant value variations in the eye area where the white of the eye is similar to the maximum white colour and the pupil to the maximum black colour, although the iris may contain also white light reflections.…”
Section: Detection Of Eye Area and Eye Characteristic Points (mentioning)
confidence: 99%
“…In case of eyes, the localization takes place by the checked method, based on searching elements on a face which have significant value variation in the level of pixel brightness in small areas (so called gradient method) [11,12]. Significant value variation can be observed in eye area where white of the eye is similar to maximum white colour and the pupil to maximum black colour, although the iris can contain also white light reflections.…”
Section: Detection Of Facial Asymmetry Points (mentioning)
confidence: 99%
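
The "gradient method" quoted in the eye-localization statements (searching for small areas with large variation in pixel brightness) can be illustrated with a minimal sketch. This is not the cited authors' implementation: the function names, the window size `win`, and the use of summed absolute gradients as the variation measure are all assumptions made for illustration.

```python
import numpy as np

def local_gradient_energy(gray, win=5):
    """Sum of absolute brightness gradients inside every win x win window.

    Hypothetical helper: the variation measure is an assumption, not
    taken from the cited papers.
    """
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)            # per-pixel brightness change
    mag = np.abs(gx) + np.abs(gy)
    # Integral image (zero-padded on top/left) gives fast box sums.
    ii = np.pad(mag, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    return ii[win:, win:] - ii[:-win, win:] - ii[win:, :-win] + ii[:-win, :-win]

def strongest_region(gray, win=5):
    """Top-left (row, col) of the window with the largest brightness variation."""
    energy = local_gradient_energy(gray, win)
    r, c = np.unravel_index(np.argmax(energy), energy.shape)
    return int(r), int(c)
```

On a grayscale face crop, the window returned by `strongest_region` would be a candidate for the eye area, since the transition between the white of the eye and the dark pupil produces steep local gradients. A practical detector would need additional checks, for example against reflections in the iris, which the quoted statements also mention.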