2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS) 2015
DOI: 10.1109/btas.2015.7358754

A deep neural network for audio-visual person recognition

Cited by 7 publications (10 citation statements)
References 13 publications
“…Some studies have attempted to enhance the quality of person recognition from two data sources (audio-visual data) using DBN and DBM [72] models, which have allowed several types of representation to be combined and coordinated. Some of these works include [48], [73]. According to Salakhutdinov et al [72], a DBM is a generative model that includes several layers of hidden variables.…”
Section: Human Recognition | Type: mentioning
confidence: 99%
“…According to Salakhutdinov et al [72], a DBM is a generative model that includes several layers of hidden variables. In [48], the structure of deep multimodal Boltzmann machines (DMBM) [71] is similar to that of DBM, but it can admit more than one modality. Therefore, each modality will be covered individually using adaptive approaches.…”
Section: Human Recognition | Type: mentioning
confidence: 99%
“…It is observed that MFCCs have performed better than others. Alam et al [4], [5] have explored the usage of MFCCs in deep neural network based methods. Further, MFCCs are also used in creating i-vectors, which performed better with Linear Discriminant Analysis (LDA) and Within Class Covariance Normalisation (WCCN) [105].…”
Section: Cepstral Coefficients | Type: mentioning
confidence: 99%
“…LBP features are extracted on the detected faces for multimodal authentication in [124], [125]. Deep neural network based AV recognition systems [4] employed LBPs as visual features from face images that are photometrically normalized using the Tan-Triggs algorithm [128]. In further research, a joint deep Boltzmann machine (jDBM) model that uses LBPs is introduced with an improved performance [5].…”
Section: Texture Based Features | Type: mentioning
confidence: 99%
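
For readers unfamiliar with the LBP features named in the statement above, here is a minimal sketch of the basic 8-neighbour local binary pattern and its histogram descriptor. It is illustrative only and is not the cited systems' implementation. The function names `lbp_code` and `lbp_histogram` are hypothetical, the bit ordering is an arbitrary choice, and the Tan-Triggs photometric normalization step is not shown.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the pixel at (r, c).

    img is a grayscale image given as a list of rows (assumed layout).
    """
    center = img[r][c]
    # Neighbours visited clockwise from the top-left corner (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

In practice such histograms are usually computed per image block and concatenated into one face descriptor. Block size and layout vary by system and are not specified in the statement above.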
“…The majority of them are intended to detect and recognize targets and few are for cognitive development. For instance, some fusion networks learn visual images and sounds respectively using two branches of deep neural network and integrate them by connecting their vectors in series [19]-[21]. But these computational models have fixed topology and need to be trained with enormous data in an offline way.…”
Section: (B) the Process Mainly Involves Audiovisual Integration And | Type: mentioning
confidence: 99%
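
The integration step described in the statement above, "connecting their vectors in series", can be read as feature-level fusion by concatenation. The sketch below works under that reading. The function names are hypothetical, and the per-branch L2 normalization is an added assumption to keep one modality from dominating by scale. It is not taken from the cited works.

```python
def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_by_concatenation(audio_vec, visual_vec):
    """Join the outputs of the audio and visual branches into one feature vector.

    Each branch output is normalized first (an assumption of this sketch),
    then the two vectors are placed end to end.
    """
    return l2_normalize(audio_vec) + l2_normalize(visual_vec)
```

The fused vector has length equal to the sum of the two branch dimensions and would typically be passed to a shared classifier or further network layers.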