Research on speech and emotion is moving from a period of exploratory research into one where there is a prospect of substantial applications, notably in human-computer interaction. Progress in the area relies heavily on the development of appropriate databases. This paper addresses four main issues that need to be considered in developing databases of emotional speech: scope, naturalness, context and descriptors. The state of the art is reviewed. A good deal has been done to address the key issues, but there is still a long way to go. The paper shows how the challenge of developing appropriate databases is being addressed in three major recent projects: the Reading-Leeds project, the Belfast project and the CREST-ESP project. From these and other studies the paper draws together the tools and methods that have been developed, addresses the problems that arise and indicates the future directions for the development of emotional speech databases. © 2002 Elsevier Science B.V. All rights reserved.
Abstract. The HUMAINE project is concerned with developing interfaces that will register and respond to emotion, particularly pervasive emotion (forms of feeling, expression and action that colour most of human life). The HUMAINE Database provides naturalistic clips which record that kind of material, in multiple modalities, and labelling techniques that are suited to describing it.
For many applications of emotion recognition, such as virtual agents, the system must select responses while the user is speaking. This requires reliable on-line recognition of the user's affect. However, most emotion recognition systems are based on turn-wise processing. We present a novel approach to on-line emotion recognition from speech using Long Short-Term Memory Recurrent Neural Networks. Emotion is recognised frame-wise in a two-dimensional valence-activation continuum. In contrast to current state-of-the-art approaches, recognition is performed on low-level signal frames, similar to those used for speech recognition. No statistical functionals are applied to low-level feature contours. Framing at a higher level is therefore unnecessary, and regression outputs can be produced in real time for every low-level input frame. We also investigate the benefits of including linguistic features on the signal frame level obtained by a keyword spotter.
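The frame-wise regression idea can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `FramewiseLSTMRegressor`, the 13 input features, the 8 hidden units, the random untrained weights and the tanh read-out are all assumptions made for illustration. The sketch only shows the general shape of the approach, in which a single LSTM layer consumes one low-level frame at a time and emits a (valence, activation) estimate immediately, without waiting for the end of a turn.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vec):
    """Return weights @ vec + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


class FramewiseLSTMRegressor:
    """Hypothetical, untrained single-layer LSTM with a 2-D read-out."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        n_in = n_features + n_hidden

        def init(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # One (weights, bias) pair per gate: input, forget, candidate, output.
        self.gates = {name: (init(n_hidden, n_in), [0.0] * n_hidden)
                      for name in "ifgo"}
        # Linear read-out to two dimensions (assumed, not from the source).
        self.readout = (init(2, n_hidden), [0.0, 0.0])
        self.n_hidden = n_hidden
        self.reset()

    def reset(self):
        """Clear the recurrent state, e.g. at the start of a new recording."""
        self.h = [0.0] * self.n_hidden
        self.c = [0.0] * self.n_hidden

    def step(self, frame):
        """Consume one low-level frame and return (valence, activation)."""
        x = list(frame) + self.h
        i = [sigmoid(v) for v in affine(*self.gates["i"], x)]
        f = [sigmoid(v) for v in affine(*self.gates["f"], x)]
        g = [math.tanh(v) for v in affine(*self.gates["g"], x)]
        o = [sigmoid(v) for v in affine(*self.gates["o"], x)]
        # The cell state carries context across frames.
        self.c = [fv * cv + iv * gv
                  for fv, cv, iv, gv in zip(f, self.c, i, g)]
        self.h = [ov * math.tanh(cv) for ov, cv in zip(o, self.c)]
        # tanh keeps both outputs in [-1, 1] (an assumed scaling).
        valence, activation = (math.tanh(v)
                               for v in affine(*self.readout, self.h))
        return valence, activation


# Synthetic stand-in for 50 frames of 13 low-level descriptors.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(50)]

model = FramewiseLSTMRegressor(n_features=13, n_hidden=8)
outputs = [model.step(frame) for frame in frames]
print(len(outputs), len(outputs[0]))  # 50 2
```

Because `step` depends only on the current frame and the stored state, an output is available for every input frame as it arrives. A trained system would replace the random weights with learned ones, and keyword-spotter features could, under the same assumptions, be appended to each frame vector before calling `step`.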