2016
DOI: 10.14257/ijsh.2016.10.8.14
Smart Home Entertainment System with Personalized Recommendation and Speech Emotion Recognition Support


Cited by 6 publications (4 citation statements)
References 9 publications
“…Emotion detection using neural networks. Voice can be characterized by parameters such as pitch (the perceived highness or lowness of a tone) and frequency (the variation in pitch), which are useful for determining a speaker's emotion. Building on earlier research on voice recognition (e.g., Pan et al., 2012; Gao et al., 2017; Likitha et al., 2017; Bhavan et al., 2019), and noting that MFCCs can be derived from mel spectrogram frequencies, we find that using both types of features helps to improve the accuracy of the model. Note that the number of mel spectrogram coefficients, MFCCs, and chroma coefficients can be adjusted to achieve more accurate predictions.…”
Section: A Voice Tone
confidence: 59%
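The feature pipeline this excerpt describes (mel filterbank energies, from which MFCCs are derived via a DCT) can be sketched in plain NumPy. This is a minimal illustration, not the cited authors' implementation; the frame size, hop length, and filter counts below are common illustrative defaults, not values taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_filters=26, n_ceps=13):
    # Frame the signal, apply a Hamming window, take the magnitude spectrum.
    frames = [signal[s:s + n_fft]
              for s in range(0, len(signal) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames) * np.hamming(n_fft), axis=1))
    # Log mel filterbank energies, then DCT-II -> cepstral coefficients.
    energies = np.log(spec @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = energies.shape[1]
    basis = np.cos(np.pi / n * (np.arange(n)[None, :] + 0.5)
                   * np.arange(n_ceps)[:, None])
    return energies @ basis.T  # shape: (n_frames, n_ceps)

# Example: MFCCs for one second of a 220 Hz tone at 16 kHz.
sr = 16000
tone = np.sin(2 * np.pi * 220 * np.linspace(0, 1, sr, endpoint=False))
feats = mfcc(tone, sr)  # one 13-dim vector per 256-sample hop
```

Chroma coefficients, which the excerpt mentions alongside MFCCs, would be computed from the same magnitude spectrum by folding frequency bins onto the twelve pitch classes; adjusting `n_filters` and `n_ceps` corresponds to the tuning the excerpt describes.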
“…A combination of MFCC, LPCC (Linear Predictive Cepstral Coefficients), RASTA-PLP (Relative Spectral Transform - Perceptual Linear Prediction), and log-power-versus-frequency coefficients has been used as the feature set to classify the emotions angry, bored, neutral, happy, and sad in Mandarin Chinese [11]. SVMs have also been used to recognize three emotions (happy, sad, neutral) in Chinese speech [16], using parameters such as energy, fundamental frequency, LPCC, MFCC, and MEDC (Mel-Energy spectrum Dynamic Coefficients). [17] used LPC and MFCC parameters with the OSALPC algorithm (linear prediction of the causal part of the autocorrelation sequence) in a GMM (Gaussian Mixture Model) on the German Emo-DB corpus, achieving an average accuracy of 89% across seven emotions.…”
Section: Speech Emotion Parameters
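As a minimal illustration of the GMM-style classification described in [17], the sketch below fits one diagonal Gaussian per emotion class (the one-component special case of a GMM) to feature vectors and predicts by maximum log-likelihood. The class names and synthetic feature data are hypothetical, stand-ins for MFCC/LPCC vectors extracted from real speech.

```python
import numpy as np

class GaussianEmotionClassifier:
    """One diagonal Gaussian per emotion class, fit by maximum
    likelihood -- a degenerate one-component GMM."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Per-class mean and variance of each feature dimension.
        self.means_ = {c: X[y == c].mean(axis=0) for c in self.classes_}
        self.vars_ = {c: X[y == c].var(axis=0) + 1e-6 for c in self.classes_}
        return self

    def _log_likelihood(self, X, c):
        m, v = self.means_[c], self.vars_[c]
        return -0.5 * np.sum(np.log(2 * np.pi * v) + (X - m) ** 2 / v, axis=1)

    def predict(self, X):
        ll = np.stack([self._log_likelihood(X, c) for c in self.classes_])
        return self.classes_[np.argmax(ll, axis=0)]

# Synthetic 13-dim "MFCC" vectors for two well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 13)), rng.normal(5, 1, (50, 13))])
y = np.array(["happy"] * 50 + ["sad"] * 50)
clf = GaussianEmotionClassifier().fit(X, y)
```

A full multi-component GMM (as in [17]) would add EM-estimated mixture weights per class, and an SVM (as in [16]) would replace the likelihood comparison with a margin-based decision; the overall fit-then-predict pipeline is the same.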
“…Speech signals have been analyzed on their own or combined with facial expressions, gestures, and/or physiological signals to convey information about emotional states. More specifically, emotion recognition through speech has found increasing applications in various fields, including, but not limited to, healthcare [8], [9], [10], Human-Computer Interaction (HCI) [11], businesses [12], and entertainment [13].…”
Section: Introduction
confidence: 99%