Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413776

Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space

Abstract: Both images and music can convey rich semantics and are widely used to induce specific emotions. Matching images and music with similar emotions might help to make emotion perceptions more vivid and stronger. Existing emotion-based image and music matching methods either employ limited categorical emotion states, which cannot well reflect the complexity and subtlety of emotions, or train the matching model using an impractical multi-stage pipeline. In this paper, we study end-to-end matching between image and mu…
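The abstract is truncated above. Purely as an illustrative sketch of what end-to-end matching in valence-arousal (VA) space could look like, the snippet below uses two small encoders that map each modality to a 2D (valence, arousal) point and ranks candidates by distance. All module names, feature dimensions, and the ranking rule are assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of emotion-based image-music matching in
# valence-arousal (VA) space; names and shapes are assumptions.
import torch
import torch.nn as nn

class VAEncoder(nn.Module):
    """Maps a feature vector (image or music) to a 2D (valence, arousal) point."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 2), nn.Tanh(),  # VA values scaled to [-1, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

image_encoder = VAEncoder(in_dim=2048)   # e.g. pooled CNN features
music_encoder = VAEncoder(in_dim=128)    # e.g. pooled spectrogram features

# Rank candidate music clips for one image by VA-space distance.
image_feat = torch.randn(1, 2048)
music_feats = torch.randn(10, 128)
img_va = image_encoder(image_feat)             # (1, 2)
mus_va = music_encoder(music_feats)            # (10, 2)
dist = torch.cdist(img_va, mus_va).squeeze(0)  # (10,)
best = torch.argsort(dist)[:3]                 # top-3 emotionally closest clips
print(best.tolist())
```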

Cited by 21 publications (8 citation statements)
References 77 publications
“…Similarly, in [4] the music mood is denoted as a 2D Valence-Arousal vector, which is then mapped to a specific RGB value on a color wheel. In [5], [6], and [7], the moods of both music and images are extracted and compared. Therefore, they allow for the selection of the most relevant music-image pairs within a finite library of music and images.…”
Section: Music Mood Visualization
Mentioning confidence: 99%
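As a hedged sketch of the color-wheel idea attributed to [4] above: the VA vector's angle could select a hue and its magnitude a saturation. The exact angle-to-hue convention and the fixed brightness are assumptions for illustration only.

```python
# Hypothetical mapping of a 2D valence-arousal vector to an RGB color
# via a color wheel, in the spirit of [4]; the angle-to-hue convention
# and saturation/value choices are assumptions.
import colorsys
import math

def va_to_rgb(valence: float, arousal: float) -> tuple[int, int, int]:
    # Angle of the VA vector selects the hue; magnitude drives saturation.
    hue = (math.atan2(arousal, valence) % (2 * math.pi)) / (2 * math.pi)
    sat = min(1.0, math.hypot(valence, arousal))
    r, g, b = colorsys.hsv_to_rgb(hue, sat, 1.0)
    return int(r * 255), int(g * 255), int(b * 255)

print(va_to_rgb(0.8, 0.3))   # positive valence, low arousal
print(va_to_rgb(-0.6, 0.7))  # negative valence, high arousal
```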
“…Inspired by this encoder-decoder-based structure, we also provided a similar solution in [9]. We re-implemented the image-music mapping model proposed in [6] to construct a dataset with corresponding music and landscape pairs. Following that, we implemented a similar encoder-decoder structure with ResNet50 [13] and StyleGAN3 [14] pre-trained on a landscape dataset [15].…”
Section: Music Mood Visualization
Mentioning confidence: 99%
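A rough sketch of the encoder-decoder idea described in this statement: a ResNet50 backbone encodes an image into a latent code intended for a pre-trained generator (StyleGAN3 in the cited work). The latent dimensionality and the commented-out generator call are placeholders, not the cited implementation.

```python
# Hypothetical encoder half of a ResNet50 + generator pipeline; the
# generator is a stub, since StyleGAN3 loading depends on its own repo.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class LatentEncoder(nn.Module):
    def __init__(self, latent_dim: int = 512):
        super().__init__()
        backbone = resnet50(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, latent_dim)
        self.backbone = backbone

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.backbone(images)  # (N, latent_dim)

encoder = LatentEncoder()
latent = encoder(torch.randn(2, 3, 224, 224))
# generator = load_pretrained_landscape_generator(...)  # placeholder, not a real API
# fake_images = generator(latent)
print(latent.shape)  # torch.Size([2, 512])
```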
“…The system can retrieve between Chinese folk music and Chinese folk images based on the emotions they involve. Chen et al. [172] and Zhao et al. [173] designed a system that computes the emotional similarity between music and images. With this system, users can generate mood-aware music slide shows from their personal album photos.…”
Section: Entertainment Assistant
Mentioning confidence: 99%
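As an illustrative sketch of the mood-aware slideshow idea: given precomputed valence-arousal coordinates for photos and music tracks, each photo could be paired with the emotionally closest track. All numbers and the nearest-neighbour rule below are assumptions.

```python
# Hypothetical mood-aware slideshow pairing: each photo gets the music
# track whose valence-arousal point is closest; values are made up.
import numpy as np

photo_va = np.array([[0.7, 0.2], [-0.5, 0.6]])               # (num_photos, 2)
track_va = np.array([[0.6, 0.1], [-0.4, 0.7], [0.0, 0.0]])   # (num_tracks, 2)

# Pairwise Euclidean distance in VA space, then nearest track per photo.
dist = np.linalg.norm(photo_va[:, None, :] - track_va[None, :, :], axis=-1)
assignment = dist.argmin(axis=1)
print(assignment)  # track index chosen for each photo, e.g. [0 1]
```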
“…Matching images and music with similar emotions might help to make emotion perceptions more vivid and stronger [28]. One team proposes to musicalize images based on their emotions: they extract visual features inspired by the concept of principles-of-art to recognize image emotions.…”
Section: Photo and Music With Emotion
Mentioning confidence: 99%