2021
DOI: 10.1109/access.2021.3063031

Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey

Abstract: Biometric recognition is a trending technology that uses unique characteristics data to identify or verify/authenticate security applications. Amidst the classically used biometrics, voice and face attributes are the most propitious for prevalent applications in day-to-day life because they are easy to obtain through restrained and user-friendly procedures. The pervasiveness of low-cost audio and face capture sensors in smartphones, laptops, and tablets has made the advantage of voice and face biometrics more …

Citations: cited by 21 publications (13 citation statements)
References: 111 publications (205 reference statements)
“…These transformations are one-way functions used for the extracted features that enhance the diversity and unlinkability properties. The same biometric template can be suffered from different transformations for various services to forbidden cross-matching between stored biometrics in various cloud datasets [22][23][24][25][26][27]. Another type of cancelable biometric system is called a hybrid approach.…”
Section: Introduction (mentioning)
confidence: 99%
“…Three modalities were considered due to their particular characteristics: voice, video feed and electroencephalography (EEG) signals. In a similar fashion as discussed in [14] for audiovisual biometric systems, the selection of the aforementioned modalities aims to take advantage "...of complimentary biometric information present between voice and face cues", and goes a step beyond by cross-relating to EEG biometric information present in the process of generating visually-evoked potentials, imagining speech and uttering-articulation. A total of 51 users volunteered, all Spanish-speaking Latinos, 26 males and 25 females, with ages between 16 and 61 years old (x̄ = 29.75, σ = 10.97); 43 claimed to be right-handed, 5 left-handed and 3 declared being ambidextrous.…”
Section: Introduction (mentioning)
confidence: 99%
“…A relevant characteristic of DL models is their ability to extract and process features directly from raw biometric data [32], although more complex information can be extracted using deeper models, as it is the case with deeply learned residual features [33]. In general, DL techniques achieve very high performance in both identification and verification cases [14], but with the associated complexity cost. In the first part of this section, we present two unimodal recognition experiments based on Convolutional Neural Networks (CNN), using the dataset BIOMEX-DB with voice and face information.…”
Section: A Introduction (mentioning)
confidence: 99%
“…For example, some smartphones come with fingerprint and some include face recognition. The captured uni-modal bio-metrics like face or iris comes with several problems like low quality, variations in pose, problem with illuminations, background noise, low spatial and temporal resolutions of video [18]. Therefore, this problem is addressed in multimodal biometrics by taking advantages of default sensors like camera and microphone.…”
Section: Introduction (mentioning)
confidence: 99%
“…The second challenge is from the presentation attacks or also called as spoofing attacks and indirect attacks which are comprehensively explained in [29] for face and in [18] for audio-visual. Presentation attacks are defined as the presentation to a biometric capture subsystem with the goal of interfering with the operation of the biometric system [12].…”
Section: Introduction (mentioning)
confidence: 99%