2013 1st International Conference on Communications, Signal Processing, and Their Applications (ICCSPA) 2013
DOI: 10.1109/iccspa.2013.6487281

Linear Regression-based Classifier for audio visual person identification

Cited by 11 publications (8 citation statements)
References 15 publications
“…MFCCs have been widely used AV for person recognition [6], [7], [16], [23], [35], [46], [69], [73], [89], [94], [123], [125], [144], [145]. Classification methods based on Gaussian Mixture Models (GMMs), Vector Quantization (VQ) have displayed a consistent speaker recognition performance using MFCCs.…”
Section: Cepstral Coefficients (mentioning)
confidence: 99%
“…When the face is more occluded, Haar cascade classifiers are used for detecting the eye portion of the image. An integral image representation that reduces time complexity and uses Haar-based features to perform AV person identification in [6]. Further, K-SVD (Single Value Decomposition) algorithm is used to create a dictionary for every video sample [105] by taking advantage of high redundancy between the video frames.…”
Section: Convolution Kernel Based Features (mentioning)
confidence: 99%
“…We used the LRC-GMM-UBM and LRC-ROI-RAW frameworks that we previously used in our works in [23,24] as the matchers of the audio and visual modalities, respectively. The main concept is that the samples from a specific user lie on a linear subspace, and therefore the task of person identification is considered to be a linear regression problem [25].…”
Section: System (mentioning)
confidence: 99%
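The linear-subspace idea in the statement above can be illustrated with a minimal sketch. It assumes each enrolled user is represented by a matrix whose columns are that user's training feature vectors, fits the probe to every user's matrix by least squares, and picks the user with the smallest reconstruction residual. The function name `lrc_predict`, the dictionary layout, and the toy data are hypothetical and are not taken from the cited works.

```python
import numpy as np


def lrc_predict(class_samples, probe):
    """Return (label, residual) of the class whose subspace best fits the probe.

    class_samples: dict mapping label -> array of shape (d, n_i); each column
                   is one training feature vector of that class (assumed layout).
    probe:         array of shape (d,), the feature vector to identify.
    """
    best_label, best_residual = None, np.inf
    for label, X in class_samples.items():
        # Least-squares coefficients that express the probe in this class's subspace.
        beta, *_ = np.linalg.lstsq(X, probe, rcond=None)
        # Distance between the probe and its projection onto the class subspace.
        residual = np.linalg.norm(probe - X @ beta)
        if residual < best_residual:
            best_label, best_residual = label, residual
    return best_label, best_residual


# Toy data: two users whose samples span different 2-D subspaces of R^4.
gallery = {
    "A": np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]),
    "B": np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
}
probe = np.array([2.0, 3.0, 0.0, 0.1])
label, residual = lrc_predict(gallery, probe)
print(label, round(residual, 3))
```

Here the probe lies almost entirely in user A's subspace, so A gives the smaller residual. Real audio or face features would replace the toy vectors.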
“…In the field of visual-audio dual-modality biometrics, the general pipelines follow three stages including raw feature extraction, feature fusion for joint representation, and identity decision. Modality fusion [5] can be categorized as feature-level (early) [6], [7], classifier-level (intermediate) [8], [9], [10], [11] or score/decision-level (late) [12], [13], [14] fusion. Existing visual-audio biometrics algorithms are listed in Table I.…”
Section: A. Related Work (mentioning)
confidence: 99%
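Of the fusion levels listed in the statement above, score-level (late) fusion is the simplest to sketch. The example below assumes each modality's matcher outputs one distance per enrolled user (smaller is better), rescales each set of distances to [0, 1] with min-max normalization, and combines them with a weighted sum. The normalization choice, the weight, and the names `min_max` and `fuse_scores` are illustrative assumptions, not a method described by the cited papers.

```python
import numpy as np


def min_max(scores):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def fuse_scores(audio_dist, visual_dist, audio_weight=0.5):
    """Weighted-sum fusion of per-user distances from two modalities.

    Returns the index of the user with the smallest fused distance and the
    fused distance vector. audio_weight is a hypothetical tuning parameter.
    """
    fused = (audio_weight * min_max(audio_dist)
             + (1.0 - audio_weight) * min_max(visual_dist))
    return int(np.argmin(fused)), fused


# Toy distances for three enrolled users (made-up numbers).
audio = [0.2, 0.9, 0.5]
visual = [0.4, 0.1, 0.8]
winner, fused = fuse_scores(audio, visual)
print(winner, np.round(fused, 3))
```

Normalizing before combining keeps one modality from dominating only because its distances are on a larger scale.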