ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9414337

Neural Audio Fingerprint for High-Specific Audio Retrieval Based on Contrastive Learning

Abstract: Most existing audio fingerprinting systems have limitations that prevent their use for high-specific audio retrieval at scale. In this work, we generate a low-dimensional representation from a short unit segment of audio, and couple this fingerprint with a fast maximum inner-product search. To this end, we present a contrastive learning framework that derives from the segment-level search objective. Each update in training uses a batch consisting of a set of pseudo labels, randomly selected original samples, and their …
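The retrieval step the abstract describes, matching a segment fingerprint against a database by maximum inner product, can be illustrated with a minimal sketch. It assumes L2-normalized fingerprints and uses an exhaustive NumPy search in place of the fast search the paper refers to, whose index is not described in this excerpt. The database size, the dimension of 64, the noise level, and the function names are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length, so inner product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mips_top_k(db: np.ndarray, query: np.ndarray, k: int = 1) -> np.ndarray:
    """Return indices of the k database rows with the largest inner product.

    Exhaustive search: a stand-in for a fast approximate index (assumption).
    """
    scores = db @ query              # one inner product per database row
    return np.argsort(-scores)[:k]   # highest scores first


rng = np.random.default_rng(0)

# Hypothetical database: 1000 segment fingerprints of dimension 64.
db = l2_normalize(rng.standard_normal((1000, 64)))

# Simulate a distorted query of segment 123 by adding small noise (assumption).
noisy = db[123] + 0.05 * rng.standard_normal(64)
query = l2_normalize(noisy[None, :])[0]

top = mips_top_k(db, query, k=3)
print("best match index:", int(top[0]))
```

For a database too large to scan exhaustively, the same inner-product scoring would typically be delegated to an approximate nearest-neighbor index.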

Cited by 18 publications (5 citation statements)
References 15 publications (27 reference statements)

“…Furthermore, our system performs a comprehensive search to precisely estimate the timestamp of the query in the identified reference audio using a simple sequence search strategy, which makes our system applicable to audio synchronization tasks. Our system performs well compared to the baseline [10] and Audfprint 4 methods. Also, our system is computationally and memory-efficient due to the compact embeddings that make our system deployable on an extensive database.…”
Section: Discussion (mentioning)
confidence: 91%
“…We compared our approach with a baseline system [10] that generates fingerprints using an encoder similar to Now-Playing's [9] architecture and does a comprehensive search. To the best of our knowledge, it is the only method that employs a neural network model to generate robust … We trained the models with the Adam [21] optimizer for 150 epochs using the cyclic learning rate.…”
Section: Implementation Details (mentioning)
confidence: 99%
“…Alternative systems with available implementations are by Chang et al. (2021) and audfprint by Ellis (2014). Both systems, however, lack robustness to significant speed changes of more than 5%.…”
Section: Statement Of Need (mentioning)
confidence: 99%
“…Especially in applications made with ANN (Artificial Neural Network), very high success rates are achieved. Studies numbered [20][21][22][23] are some important studies carried out in this field.…”
Section: Introduction (mentioning)
confidence: 99%