Proceedings of the Ninth ACM International Conference on Multimedia 2001
DOI: 10.1145/500141.500201

Hierarchical filtering method for content-based music retrieval via acoustic input

citations
Cited by 63 publications
(13 citation statements)
references
References 12 publications
“…, n. These two vectors are not necessarily of the same size, and we can apply DTW to match each point of the test vector to that of the reference vector in an optimal way. That is, we want to construct an m × n DTW table D(i, j) according to dynamic programming and then identify the optimal mapping (or path) from each point of the test vector t to that of the reference vector r. The exact formula of DTW for our MIRAI engine can be found in Jang and Kao (2000) and Jang and Lee (2001a) and will not be repeated here. Note that when we construct a DTW table, we can force the first point of t to match the first point of r. This case of "match beginning" represents the situation that the user sings/hums from the beginning of a song.…”
Section: Query By Singing/humming
mentioning
confidence: 99%
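
The table construction described in the statement above can be sketched as follows. This is a generic textbook DTW, not the MIRAI engine's formula, which the quoted passage explicitly leaves to Jang and Kao (2000) and Jang and Lee (2001a). The absolute-difference local cost, the three-way step pattern, the free end point, and the names `dtw_table` and `dtw_distance` are all assumptions made for illustration.

```python
import math


def dtw_table(t, r, match_beginning=True):
    """Fill an m x n table D by dynamic programming.

    D[i][j] is the minimum accumulated cost of a warping path that
    ends by matching t[i] with r[j]. Assumed local cost: |t[i] - r[j]|.
    """
    m, n = len(t), len(r)
    D = [[math.inf] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            cost = abs(t[i] - r[j])
            if i == 0 and j == 0:
                D[i][j] = cost
            elif i == 0:
                # "Match beginning": the path must start at r[0], so reaching
                # r[j] costs everything accumulated along the first row.
                # Otherwise t[0] may start at any point of r for free.
                D[i][j] = cost + D[0][j - 1] if match_beginning else cost
            else:
                best_prev = D[i - 1][j]
                if j > 0:
                    best_prev = min(best_prev, D[i - 1][j - 1], D[i][j - 1])
                D[i][j] = cost + best_prev
    return D


def dtw_distance(t, r, match_beginning=True):
    """Assumed free end point: the query may stop anywhere in the reference."""
    return min(dtw_table(t, r, match_beginning)[-1])


# Pitch values in semitones (made-up data for illustration).
reference = [60, 62, 64, 65, 67]
from_start = [60, 62, 64]   # query sung from the beginning of the song
from_middle = [64, 65, 67]  # query sung from the middle of the song
```

With the anchor enabled, `from_start` aligns at zero cost while `from_middle` is penalized; releasing the anchor lets `from_middle` align at zero cost as well.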
“…We have tried the difference operator on the pitch vector and found that the operator tends to amplify noises and deteriorate the system's performance. Thus we employ a heuristic to shift the input pitch vector 5 times to achieve a minimum DTW distance when comparing with a candidate song (Jang & Lee, 2001a). The system then returns a ranked song list according to the computed similarity scores.…”
Section: Query By Singing/humming
mentioning
confidence: 99%
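
The shifting heuristic mentioned in the statement above can be illustrated with a short sketch. The quoted passage does not say which five shifts are tried, so the candidates here are hypothetical: the offset that aligns the mean pitch of the query with the opening of the candidate song, plus deviations of -2 to +2 semitones around it. The DTW routine is a generic textbook version with an anchored start and a free end, not the cited system's formula, and the name `best_shifted_distance` is made up.

```python
def dtw_distance(t, r):
    """Generic DTW with an anchored start and a free end (assumed setup)."""
    prev = None
    for i, ti in enumerate(t):
        cur = []
        for j, rj in enumerate(r):
            cost = abs(ti - rj)
            if i == 0:
                # First row: the path must begin at r[0].
                best = 0.0 if j == 0 else cur[j - 1]
            else:
                best = prev[j]
                if j > 0:
                    best = min(best, prev[j - 1], cur[j - 1])
            cur.append(cost + best)
        prev = cur
    return min(prev)


def best_shifted_distance(t, r, deltas=(-2, -1, 0, 1, 2)):
    """Try five key shifts of the query and keep the smallest DTW distance.

    Returns (distance, shift). The choice of candidate shifts is an
    assumption, not taken from the cited work.
    """
    head = r[:len(t)]
    base = sum(head) / len(head) - sum(t) / len(t)  # mean-aligning offset
    return min(
        (dtw_distance([x + base + d for x in t], r), base + d)
        for d in deltas
    )


# Made-up data: the query is the song's opening sung 5 semitones too low.
song = [60, 62, 64, 65, 67]
query = [55, 57, 59]
distance, shift = best_shifted_distance(query, song)
```

A ranked list then follows from computing `best_shifted_distance` for every candidate song and sorting by the returned distance.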
“…In this task, note-based and frame-based similarity measures are two commonly used methods. Jang proposed a frame-based template matching strategy by calculating time series similarity with high precision [2], but this method is very time-consuming when the template database growing larger. Typke used transportation distance (EMD), which is a variation of note-based measurement, to achieve satisfying retrieval speed comparing to the frame-based method but loss of precision in some degrees [3].…”
Section: Introduction
mentioning
confidence: 99%
“…Even professional singers do not necessarily present error-free queries to MIR systems [31]- [33], [37], because they may not always recall the theme perfectly. To handle such errors, various approximate matching methods, such as dynamic time warping (DTW) [5], [6], [13], [23], [24], the hidden Markov model [15], [31], and the N-gram model [11], [14], have been developed, with DTW being the most popular. However, due to the considerable computational time required for DTW, several speed-up methods have been proposed [5], [23], [24], [31], so that a large-scale music database can be searched more efficiently.…”
mentioning
confidence: 99%