16th Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2003)
DOI: 10.1109/sibgra.2003.1241036

Visual speech recognition: a solution from feature extraction to words classification

Abstract: Audio-visual speech recognition has been an active area of research lately. A big, and yet unsolved, part of this problem is visual-only recognition, or lip reading. Considering an image sequence of a person pronouncing a word, a full image analysis solution would have to segment the mouth area, extract relevant features, and use them to classify the word from those visual features. In this paper we approach this problem by proposing a segmentation technique for the lip contours together with a…

citations
Cited by 29 publications
(16 citation statements)
references
References 11 publications
“…This approach is based upon the follow ideas: when a person is speaking, the human face is quiescent relative to the camera; the lip motion in an image sequence presents high frequency in comparison to other parts of the human face [11]. An image sequence of mandarin Chinese words' pronunciations is shown in …”
Section: Lip Features (mentioning)
confidence: 99%
“…In this regard, the feature extraction techniques that have been applied in the development of VSR systems can be divided into two main categories, shape-based and intensity based. In general, the shape-based feature extraction techniques attempt to identify the lips in the image based either on geometrical templates that encode a standard set of mouth shapes [17] or on the application of active contours [3]. Since these approaches require extensive training to sample the spectrum of mouth shapes, recently the feature extraction has been carried out in the intensity domain.…”
Section: Introduction (mentioning)
confidence: 99%
“…The visual information is effective to improve the performance of recognition accuracy in noisy environments. For lip reading, some researchers proposed the method using the frontal face or side face image [9,10,5,3], or combining visual and auditory information [6,7,5,3]. In this paper, we focused to investigate the lip region and feature for lip reading.…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, there are Japanese [7,8], English [6,5], French [9], and Portuguese [10], etc. However, there is no research to refer the language and the recognition method.…”
Section: Introduction (mentioning)
confidence: 99%