A novel method for enhancing the performance of elastic graph matching in frontal face authentication is proposed. The starting point is to weigh the local similarity values at the nodes of an elastic graph according to their discriminatory power. Powerful and well-established optimization techniques are used to derive the weights of the linear combination. More specifically, we propose a novel approach that reformulates Fisher's discriminant ratio as a quadratic optimization problem subject to a set of inequality constraints by combining statistical pattern recognition and Support Vector Machines. Both linear and nonlinear Support Vector Machines are then constructed to yield the optimal separating hyperplanes and the optimal polynomial decision surfaces, respectively. The method has been applied to frontal face authentication on the M2VTS database. Experimental results indicate that the performance of morphological elastic graph matching is highly improved by the proposed weighting technique.
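The idea of weighting node similarities by discriminatory power can be illustrated with a minimal sketch in the spirit of Fisher's discriminant ratio. The helper names and the toy matching errors below are illustrative, not the paper's actual formulation (which solves a constrained quadratic program via Support Vector Machines); here each node's weight is simply its between-class separation divided by its within-class scatter.

```python
# Hedged sketch: per-node discriminatory weights in the spirit of
# Fisher's discriminant ratio. Names and data are illustrative only.

def fisher_weight(client_errors, impostor_errors):
    """Between-class separation over within-class scatter for one
    graph node's matching errors (hypothetical helper)."""
    mean = lambda xs: sum(xs) / len(xs)
    var = lambda xs, m: sum((x - m) ** 2 for x in xs) / len(xs)
    mc, mi = mean(client_errors), mean(impostor_errors)
    sw = var(client_errors, mc) + var(impostor_errors, mi)
    return (mc - mi) ** 2 / sw if sw > 0 else 0.0

def weighted_matching_error(node_errors, weights):
    """Linear combination of node matching errors."""
    return sum(w * e for w, e in zip(weights, node_errors))

# Toy data: matching errors at 3 grid nodes, rows are claims.
client = [[0.1, 0.4, 0.2], [0.2, 0.5, 0.1]]
impostor = [[0.8, 0.5, 0.9], [0.7, 0.6, 0.8]]
weights = [fisher_weight([c[n] for c in client],
                         [i[n] for i in impostor]) for n in range(3)]
```

Nodes whose matching errors separate clients from impostors well (here, nodes 0 and 2) receive large weights; an uninformative node (node 1) is suppressed in the combined matching error.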
A novel elastic graph matching procedure based on multiscale morphological operations, the so-called morphological dynamic link architecture, is developed for frontal face authentication. Fast algorithms for implementing mathematical morphology operations are presented. Feature selection by employing linear projection algorithms is proposed. Discriminatory power coefficients that weigh the matching error at each grid node are derived. The performance of morphological dynamic link architecture in frontal face authentication is evaluated in terms of the receiver operating characteristic on the M2VTS face image database. Preliminary results for face recognition using the proposed technique are also presented.
Index Terms: Dynamic link architecture, face authentication, linear discriminant analysis, multiscale mathematical morphology, principal component analysis, receiver operating characteristics.
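A minimal sketch of the multiscale morphological idea: grayscale dilations and erosions with flat structuring elements of growing size, stacked into a feature vector at each sample point. For brevity this is shown in 1-D with toy data; the paper operates on 2-D face images, and the scale set and function names here are assumptions for illustration.

```python
# Hedged sketch of multiscale morphological features (1-D, flat
# structuring elements); illustrative only, not the paper's code.

def dilate(signal, radius):
    """Flat grayscale dilation: local maximum over a window."""
    n = len(signal)
    return [max(signal[max(0, i - radius):i + radius + 1])
            for i in range(n)]

def erode(signal, radius):
    """Flat grayscale erosion: local minimum over a window."""
    n = len(signal)
    return [min(signal[max(0, i - radius):i + radius + 1])
            for i in range(n)]

def multiscale_features(signal, index, scales=(1, 2, 3)):
    """Feature vector at one sample: dilations then erosions per scale."""
    feats = [dilate(signal, s)[index] for s in scales]
    feats += [erode(signal, s)[index] for s in scales]
    return feats

row = [3, 1, 4, 1, 5, 9, 2, 6]
features = multiscale_features(row, 4)
```

Each grid node of the elastic graph would carry such a multiscale feature vector, and matching compares these vectors between a reference and a test image.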
Our purpose is to design a useful tool which can be used in psychology to automatically classify utterances into five emotional states: anger, happiness, neutral, sadness, and surprise. The major contribution of the paper is to rate the discriminating capability of a set of features for emotional speech recognition. A total of 87 features have been calculated over 500 utterances from the Danish Emotional Speech database. The Sequential Forward Selection (SFS) method has been used in order to discover a set of 5 to 10 features which are able to classify the utterances in the best way. The criterion used in SFS is the cross-validated correct classification score of one of the following classifiers: the nearest mean classifier and the Bayes classifier, where class pdfs are approximated via Parzen windows or modelled as Gaussians. After selecting the 5 best features, we reduce the dimensionality to two by applying principal component analysis. The result is a 51.6% ± 3% correct classification rate at a 95% confidence interval for the five aforementioned emotions, whereas a random classification would give a correct classification rate of 20%. Furthermore, we identify those two-class emotion recognition problems whose error rates contribute most heavily to the average error, and we indicate that the error rates reported in this paper could be reduced by employing two-class classifiers and combining them.
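Sequential Forward Selection can be sketched in a few lines: greedily grow a feature subset, at each step adding the feature that most improves a classifier-based criterion. The toy data and leave-one-out nearest-mean criterion below are illustrative stand-ins, not the Danish Emotional Speech features or the paper's exact cross-validation setup.

```python
# Hedged sketch of Sequential Forward Selection (SFS) with a
# leave-one-out nearest-mean criterion; toy data, not the paper's.

def nearest_mean_score(X, y, feats):
    """Leave-one-out correct-classification rate of a nearest-mean
    classifier restricted to the chosen feature subset."""
    correct = 0
    for i in range(len(X)):
        means = {}
        for c in set(y):
            rows = [X[j] for j in range(len(X)) if y[j] == c and j != i]
            means[c] = [sum(r[f] for r in rows) / len(rows) for f in feats]
        dist = lambda c: sum((X[i][f] - m) ** 2
                             for f, m in zip(feats, means[c]))
        correct += min(means, key=dist) == y[i]
    return correct / len(X)

def sfs(X, y, n_select):
    """Greedily add the feature that most improves the criterion."""
    selected, remaining = [], list(range(len(X[0])))
    while len(selected) < n_select:
        best = max(remaining,
                   key=lambda f: nearest_mean_score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: only feature 2 separates the two classes.
X = [[0, 5, 0.0], [1, 4, 0.1], [0, 6, 1.0], [1, 5, 1.1]]
y = [0, 0, 1, 1]
chosen = sfs(X, y, 1)
```

On this toy data SFS picks the genuinely discriminative feature first; in the paper the same greedy loop runs over 87 candidate speech features with cross-validated classifier scores as the criterion.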
The paper presents results of the face verification contest that was organized in conjunction with the International Conference on Pattern Recognition 2000 [14]. Participants had to use identical data sets from a large, publicly available multimodal database, XM2VTSDB. Training and evaluation were carried out according to an a priori known protocol [7]. Verification results of all tested algorithms have been collected and made public on the XM2VTSDB website [15], facilitating large-scale experiments on classifier combination and fusion. Tested methods included, among others, representatives of the most common approaches to face verification: elastic graph matching, Fisher's linear discriminant, and Support Vector Machines.
This survey focuses on two challenging speech processing topics, namely speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker clustering, deterministic and probabilistic algorithms are examined. A comparative assessment of the reviewed algorithms is undertaken, the algorithm advantages and disadvantages are indicated, insight into the algorithms is offered, and deductions as well as recommendations are given. Rich transcription and movie analysis are candidate applications that benefit from combined speaker segmentation and clustering.
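The metric-based family of segmentation algorithms can be illustrated with a minimal sketch: slide two adjacent windows over a feature stream and flag a change point where a distance between their statistics peaks. The single-feature Gaussian distance below is a crude stand-in for the BIC- and divergence-style metrics the survey reviews; all names, window sizes, and thresholds are illustrative assumptions.

```python
# Hedged sketch of metric-based speaker change detection on a 1-D
# feature stream; a simplified stand-in for BIC/divergence metrics.

def gaussian_stats(xs):
    """Mean and (floored) variance of a window of features."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, max(v, 1e-6)

def window_distance(left, right):
    """Distance between Gaussian fits of two adjacent windows."""
    ml, vl = gaussian_stats(left)
    mr, vr = gaussian_stats(right)
    return (ml - mr) ** 2 / (vl + vr)

def change_points(stream, win=4, threshold=2.0):
    """Indices where the windowed metric exceeds the threshold."""
    points = []
    for t in range(win, len(stream) - win):
        d = window_distance(stream[t - win:t], stream[t:t + win])
        if d > threshold:
            points.append(t)
    return points

# Toy stream: speaker A near 0.0, then speaker B near 5.0 (change at 6).
stream = [0.1, -0.1, 0.0, 0.2, 0.1, 0.0, 5.1, 4.9, 5.0, 5.2, 5.0, 4.8]
hits = change_points(stream)
```

Real systems compute such metrics over vectors of cepstral features and post-process the peaks; model-based and hybrid methods covered in the survey instead compare hypothesized speaker models against the audio.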