The digitization of how people acquire music calls for better music information retrieval techniques, and dimensional emotion tracking is increasingly seen as an attractive approach. Unfortunately, most of the models still in use are borrowed from other problems and do not suit emotion prediction well, as they tend to ignore the temporal dynamics present in music and/or the continuous nature of the Arousal-Valence space. In this paper we propose the use of Continuous Conditional Random Fields for dimensional emotion tracking, together with a novel feature vector representation technique. Both approaches substantially improve root-mean-squared error and correlation, for both short-term and long-term measurements. In addition, both can be easily extended to multimodal approaches to music emotion recognition.