2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings
DOI: 10.1109/icassp.2006.1661465

Fusion of Talking Face Biometric Modalities for Personal Identity Verification

Abstract: We describe a personal identity verification system based on a lip dynamics biometric. The lip shape is represented by a B-spline model tracked over time. The coordinates of the 11 control points of the B-spline model are used as features for each frame, so an utterance consisting of N frames produces a sequence of 22-dimensional feature vectors that is matched to the template using dynamic time warping. The verification error rate achieved by the system on the XM2VTS database is about 14%. By fusing the …
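The matching step described in the abstract — comparing a test sequence of 22-dimensional control-point feature vectors against a stored client template with dynamic time warping — can be sketched as below. This is a minimal illustration, not the authors' implementation; the Euclidean local cost, the function name dtw_distance, and the thresholded decision in the usage comment are assumptions for the example.

```python
import numpy as np

def dtw_distance(test, template):
    """Dynamic time warping distance between two sequences of feature vectors.

    test, template: arrays of shape (N, 22) and (M, 22), one 22-dimensional
    lip-shape feature vector (x, y of 11 B-spline control points) per frame.
    The Euclidean local cost between frames is an assumption of this sketch.
    """
    n, m = len(test), len(template)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(test[i - 1] - template[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Hypothetical usage: accept the claimed identity if the warped distance
# to the client's template falls below a threshold tuned on development data.
# score = dtw_distance(test_utterance, client_template)
# accept = score < threshold
```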

Cited by 6 publications (6 citation statements)
References 12 publications
“…As shown in Table I, the most commonly used database and protocol are XM2VTS [25] (used by 3 authors) and the Lausanne Protocols [26], respectively. The best performance obtained using lip features only on this database is by [14] (HTER of 13.35%). Multi-modal fusion with two face detectors and two audio systems [23] yields an HTER of 0.15%.…”
Section: A Summary of Relevant Work
confidence: 92%
“…A final point to note is that this experiment investigates how our system compares with state-of-the-art benchmarks. The best baseline performance obtained using lip features only on this database was by [14] (HTER of 13.35%), as shown in Table I. Multi-modal fusion with two face detectors and two audio systems [23] yielded an HTER of 0.15%, as shown in Table II.…”
Section: Evaluation of the LOCP-TOP Descriptor for Speaker Authentication
confidence: 94%
“…The best performance obtained using lip features only on this database is by [14] (HTER of 13.35%). Multi-modal fusion with two face detectors and two audio systems [23] yields an HTER of 0.15%.…”
Section: A Summary of Relevant Work
confidence: 92%
“…The first case study, which illustrates the merit of both multimodal and intramodal fusion, detailed in [43], involves the fusion of face, voice and lip dynamics biometric modalities. The system, which used off-the-shelf conventional technologies, was evaluated on the XM2VTS database [44], producing results according to the Lausanne Experimental Protocol in Configuration I [44], as shown in Table I.…”
Section: Benefits of Multiple Biometric Expert Fusion
confidence: 99%
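The case study quoted above fuses scores from face, voice and lip-dynamics experts. A common way to realise such fusion is a weighted sum of normalised per-expert scores; the sketch below is purely illustrative and does not reproduce the fusion rule of [43] — the min-max normalisation, the weights, and the function name fuse_scores are assumptions.

```python
import numpy as np

def fuse_scores(scores, weights, ranges):
    """Weighted-sum fusion of per-expert verification scores.

    scores:  dict of raw scores, e.g. {"face": s1, "voice": s2, "lips": s3}
    weights: dict of non-negative weights summing to 1 (assumed values)
    ranges:  dict of (min, max) tuples for min-max normalisation,
             estimated on a development set in practice
    """
    fused = 0.0
    for name, s in scores.items():
        lo, hi = ranges[name]
        s_norm = (s - lo) / (hi - lo)      # min-max normalisation
        fused += weights[name] * s_norm
    return fused

# Hypothetical usage with made-up numbers:
# fused = fuse_scores({"face": 0.8, "voice": 1.3, "lips": 0.4},
#                     {"face": 0.4, "voice": 0.4, "lips": 0.2},
#                     {"face": (0, 1), "voice": (0, 2), "lips": (0, 1)})
# accept = fused > threshold
```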