The proliferation of MP3 players and the exploding amount of digital music content call for novel ways of music organization and retrieval to meet the ever-increasing demand for easy and effective information access. As almost every music piece is created to convey emotion, music organization and retrieval by emotion is a reasonable way of accessing music information. A good deal of effort has been made in the music information retrieval community to train a machine to automatically recognize the emotion of a music signal. A central issue of machine recognition of music emotion is the conceptualization of emotion and the associated emotion taxonomy. Different viewpoints on this issue have led to the proposal of different ways of emotion annotation, model training, and result visualization. This article provides a comprehensive review of the methods that have been proposed for music emotion recognition. Moreover, as music emotion recognition is still in its infancy, there are many open issues. We review the solutions that have been proposed to address these issues and conclude with suggestions for further research.
The rate-distortion optimization (RDO) framework for video coding achieves a tradeoff between bit-rate and quality. However, objective distortion metrics such as mean squared error traditionally used in this framework are poorly correlated with perceptual quality. We address this issue by proposing an approach that incorporates the structural similarity index as a quality metric into the framework. In particular, we develop a predictive Lagrange multiplier estimation method to resolve the chicken and egg dilemma of perceptual-based RDO and apply it to H.264 intra and inter mode decision. Given a perceptual quality level, the resulting video encoder achieves on the average 9% bit-rate reduction for intra-frame coding and 11% for inter-frame coding over the JM reference software. Subjective test further confirms that, at the same bit-rate, the proposed perceptual RDO indeed preserves image details and prevents block artifact better than traditional RDO.Index Terms-H.264, Lagrange multiplier, perceptual quality, rate-distortion optimization, structural similarity index, video codec.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.