2024
DOI: 10.1109/taslp.2024.3376984
Dynamic Convolutional Neural Networks as Efficient Pre-Trained Audio Models

Florian Schmid, Khaled Koutini, Gerhard Widmer

Abstract: The introduction of large-scale audio datasets, such as AudioSet, paved the way for Transformers to conquer the audio domain and replace CNNs as the state-of-the-art neural network architecture for many tasks. Audio Spectrogram Transformers are excellent at exploiting large datasets, creating powerful pre-trained models that surpass CNNs when fine-tuned on downstream tasks. However, current popular Audio Spectrogram Transformers are demanding in terms of computational complexity compared to CNNs. Recently, we …