Khaled Koutini scite author profile

Khaled Koutini

5 Publications

213 Citation Statements Received

50 Citation Statements Given

Affiliations

Johannes Kepler University of Linz

Publications

The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification

Koutini, Eghbal-zadeh, Dorfer, et al. 2019

Convolutional Neural Networks (CNNs) have had great success in many machine vision as well as machine audition tasks. Many image recognition network architectures have consequently been adapted for audio processing tasks. However, despite some successes, the performance of many of these did not translate from the image to the audio domain. For example, very deep architectures such as ResNet [1] and DenseNet [2], which significantly outperform VGG [3] in image recognition, do not perform better in audio processing tasks such as Acoustic Scene Classification (ASC). In this paper, we investigate the reasons why such powerful architectures perform worse in ASC compared to simpler models (e.g., VGG). To this end, we analyse the receptive field (RF) of these CNNs and demonstrate the importance of the RF to the generalization capability of the models. Using our receptive field analysis, we adapt both ResNet and DenseNet, achieving state-of-the-art performance and eventually outperforming the VGG-based models. We introduce systematic ways of adapting the RF in CNNs, and present results on three data sets that show how changing the RF over the time and frequency dimensions affects a model's performance. Our experimental results show that very small or very large RFs can cause performance degradation, but deep models can be made to generalize well by carefully choosing an appropriate RF size within a certain range.
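As a rough illustration of the kind of receptive field analysis the abstract refers to, the sketch below computes the RF size along one axis (time or frequency) with the standard recurrence for stacked convolution and pooling layers. The function name `receptive_field` and the example layer stacks are hypothetical and are not taken from the paper; dilation is assumed to be 1 throughout.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> int:
    """Return the receptive field size along one axis.

    `layers` is a list of (kernel_size, stride) pairs, ordered from the
    input to the output. Dilation is assumed to be 1 (an assumption of
    this sketch, not a statement about the paper's models).
    """
    rf = 1    # a single input sample sees only itself
    jump = 1  # distance in input samples between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the RF by (k - 1) * jump
        jump *= stride             # striding spreads later kernels further apart
    return rf


# Hypothetical stacks, for illustration only.
three_convs = [(3, 1), (3, 1), (3, 1)]      # RF = 7
with_pooling = [(3, 1), (2, 2), (3, 1)]     # RF = 8
with_1x1_layer = [(3, 1), (1, 1), (3, 1)]   # RF = 5

for name, stack in [("three_convs", three_convs),
                    ("with_pooling", with_pooling),
                    ("with_1x1_layer", with_1x1_layer)]:
    print(name, receptive_field(stack))
```

Under these assumptions, swapping a 3-wide kernel for a 1-wide kernel or removing a strided layer shrinks the RF without changing the depth, which is one plausible way to vary the RF of a deep network independently of its number of layers.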

Efficient Training of Audio Transformers with Patchout

Koutini, Schlüter, Eghbal-zadeh, et al. 2022

Receptive-Field-Regularized CNN Variants for Acoustic Scene Classification

Koutini, Eghbal-zadeh, Widmer 2019

Acoustic scene classification and related tasks have been dominated by Convolutional Neural Networks (CNNs) [2][3][4][5][6][7][8][9][10]. Top-performing CNNs use mainly audio spectrograms as input and borrow their architectural design primarily from computer vision. A recent study [1] has shown that restricting the receptive field (RF) of CNNs in appropriate ways is crucial for their performance, robustness and generalization in audio tasks. One side effect of restricting the RF of CNNs is that more frequency information is lost. In this paper, we first perform a systematic investigation of different RF configurations for various CNN architectures on the DCASE 2019 Task 1.A dataset. Second, we introduce Frequency Aware CNNs to compensate for the lack of frequency information caused by the restricted RF, and experimentally determine if and in what RF ranges they yield additional improvement. The results of these investigations are several well-performing submissions to different tasks in the DCASE 2019 Challenge.

Receptive Field Regularization Techniques for Audio Classification and Tagging With Deep Convolutional Neural Networks

Koutini, Eghbal-zadeh, Widmer 2021

IEEE/ACM Trans. Audio Speech Lang. Process.

Efficient Training of Audio Transformers with Patchout

Koutini, Schlüter, Eghbal-zadeh, et al. 2021

Preprint
