Automatic speaker recognition (ASR) systems belong to the field of human-machine interaction, where feature extraction and feature matching methods are used to analyze and synthesize speech signals. One of the most commonly used feature-extraction methods is Mel-Frequency Cepstral Coefficients (MFCCs). Recent research shows that MFCCs process voice signals with high accuracy; they represent a sequence of features specific to the voice signal. This experimental analysis aims to distinguish Turkish speakers by extracting MFCC-based features from speech recordings. Since human perception of sound is not linear, after the filterbank step of the MFCC pipeline we converted the log filterbank energies into decibel (dB) spectrograms without applying the Discrete Cosine Transform (DCT). A new dataset was created by converting each spectrogram into a 2-D array. Several learning algorithms were trained with 10-fold cross-validation to identify the speaker. The highest accuracy, 90.2%, was achieved by a Multi-layer Perceptron (MLP) with the tanh activation function. The most important output of this study is the inclusion of the human voice as a new feature set.
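The feature-extraction step described in this abstract (mel filterbank energies converted to dB instead of being passed through the DCT) can be sketched in plain NumPy. This is a minimal illustration under assumed settings, not the paper's implementation: the FFT size, hop length, and number of mel filters here are illustrative choices.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels=26):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):                      # rising slope
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):                      # falling slope
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def log_mel_db(signal, sr, n_fft=512, hop=256, n_mels=26):
    # Frame the signal, window each frame, take the power spectrum.
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * np.hamming(n_fft)
        frames.append(np.abs(np.fft.rfft(frame)) ** 2 / n_fft)
    power_spec = np.array(frames)                  # (n_frames, n_fft//2 + 1)
    mel_energy = power_spec @ mel_filterbank(sr, n_fft, n_mels).T
    # dB conversion in place of the DCT step; floor avoids log(0).
    return 10.0 * np.log10(np.maximum(mel_energy, 1e-10))

# Example: a one-second 440 Hz tone as a stand-in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
spec = log_mel_db(np.sin(2 * np.pi * 440.0 * t), sr)
```

Each row of `spec` is then a frame of dB-scaled mel features; stacking the rows gives the 2-D array the abstract describes feeding to the classifiers.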
As personal computer usage and internet bandwidth increase, e-learning systems are spreading widely. Although e-learning has advantages over formal learning in information accessibility and in flexibility of time and place, it does not provide enough face-to-face interactivity between an educator and learners. In this study, we propose a hybrid information system that combines computer vision and machine learning technologies for visual and interactive e-learning. The proposed system detects the learners' emotional states from their facial expressions and gives the educator feedback about their instant and weighted emotional states. In this way, the educator becomes aware of the general emotional state of the virtual classroom, and the system creates an interactive environment resembling formal learning. Several classification algorithms were applied to learn the instant emotional state, and the best accuracy rates were obtained with the kNN and SVM algorithms.
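One of the two best-performing classifiers this abstract reports, kNN, is simple enough to sketch end to end. The following is a minimal NumPy version with synthetic feature vectors standing in for the facial-expression features (the dimensionality, cluster parameters, and k value are assumptions for illustration, not the paper's):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest (Euclidean) training points."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest_labels = y_train[np.argsort(dists)[:k]]
        labels, counts = np.unique(nearest_labels, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Two synthetic "emotion" clusters standing in for facial-expression features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.4, (40, 6)),
               rng.normal(2.0, 0.4, (40, 6))])
y = np.repeat([0, 1], 40)

# Even-indexed samples as training data, odd-indexed as a held-out test set.
acc = np.mean(knn_predict(X[::2], y[::2], X[1::2]) == y[1::2])
```

With well-separated clusters like these, the vote recovers the labels almost perfectly; real facial-expression features would of course overlap far more.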
The Detection of Emotional Expression towards Computer Users

Abstract: In these days of widespread computer use, innovative studies in human-computer interaction have accelerated. One of these innovations is machine learning of the emotional states of computer users. Detecting the emotional states of people who work with computers in office environments can provide meaningful information, particularly about the relationship between their morale and work performance. Based on this idea, a prototype system that performs instant emotion detection from a computer user's facial expressions has been developed. The system performs, in order: face detection, facial landmark detection, construction of a training data set of landmark-based features, and instant emotional state detection with a rule-based classifier. To understand the discriminative characteristics of the features, which constitute the originality of this study, the training data set was also classified statically with support vector machines. As a result, the accuracy of the system was measured as 96.1% with 10-fold cross-validation.

Keywords: human-computer interaction, emotional expression detection, data mining
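The static SVM evaluation with 10-fold cross-validation described above can be sketched with a minimal linear SVM trained by hinge-loss SGD and a hand-rolled fold loop. This is pure NumPy under stated assumptions: the data is synthetic (standing in for the landmark features), and the regularization strength, learning rate, and epoch count are illustrative, not the paper's settings.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.001, lr=0.1, epochs=100, seed=0):
    """Linear SVM via SGD on the regularized hinge loss; y in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (X[i] @ w + b) < 1:          # inside the margin
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                                   # only shrink w
                w -= lr * lam * w
    return w, b

def ten_fold_accuracy(X, y, seed=0):
    """Shuffle, split into 10 folds, average the held-out accuracies."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), 10)
    accs = []
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w, b = train_linear_svm(X[train], y[train])
        accs.append(np.mean(np.sign(X[test] @ w + b) == y[test]))
    return float(np.mean(accs))

# Synthetic, linearly separable stand-in for the landmark-based features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.5, 0.5, (60, 4)),
               rng.normal(1.5, 0.5, (60, 4))])
y = np.repeat([-1, 1], 60)
acc = ten_fold_accuracy(X, y)
```

Averaging the per-fold accuracies, as `ten_fold_accuracy` does, is the same protocol that yields the single headline figure (96.1% in the study).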