2012
DOI: 10.1186/1687-6180-2012-51

Multi-pose lipreading and audio-visual speech recognition

Abstract: In this article, we study the adaptation of visual and audio-visual speech recognition systems to non-ideal visual conditions. We focus on overcoming the effects of a changing pose of the speaker, a problem encountered in natural situations where the speaker moves freely and does not keep a frontal pose with relation to the camera. To handle these situations, we introduce a pose normalization block in a standard system and generate virtual frontal views from non-frontal images. The proposed method is inspired …

Cited by 21 publications (3 citation statements)
References 44 publications
“…Different views of the speaker were used for lip reading using a pose normalization block in a standard system. The effects of pose normalization on the audiovisual integration strategy are analyzed by AV-ASR [16].…”
Section: Literature Survey (mentioning)
confidence: 99%
“…Lucey et al. [12] apply a linear mapping to transform profile view features to frontal view features. This approach has been extended to map other views like 30°, 45° and 60° to the frontal view [13] or to the 30° view [14]. However, the performance is degraded as the number of features to be generated by the linear mapping increases [12].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, there is a lot of interest to address, for example, the problem of head pose, which is a large hindrance in the application to real-world scenarios. Various works have already addressed this problem by taking different view angles into account [18,8]. To this end, several databases have been recorded simultaneously with cameras at different angles [17,11].…”
Section: Introduction (mentioning)
confidence: 99%