2023
DOI: 10.1609/aaai.v37i6.25923

EffConv: Efficient Learning of Kernel Sizes for Convolution Layers of CNNs

Abstract: Determining kernel sizes of a CNN model is a crucial and non-trivial design choice and significantly impacts its performance. The majority of kernel size design methods rely on complex heuristic tricks or leverage neural architecture search that requires extreme computational resources. Thus, learning kernel sizes, using methods such as modeling kernels as a combination of basis functions, jointly with the model weights has been proposed as a workaround. However, previous methods cannot achieve satisfactory re…
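The abstract mentions modeling kernels as a combination of basis functions so that kernel size can be learned jointly with the model weights. A minimal illustrative sketch of that general idea (not EffConv's actual method — the basis choice, sizes, and names here are assumptions for illustration) is to express the effective kernel as a learnable linear combination of fixed Gaussian basis kernels of different widths:

```python
import numpy as np

def gaussian_basis(size, sigma):
    # Normalized 2D Gaussian kernel on a fixed spatial grid.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def composed_kernel(coeffs, size=7, sigmas=(0.5, 1.0, 2.0)):
    # The effective kernel is a linear combination of fixed basis
    # functions; optimizing `coeffs` jointly with the network weights
    # implicitly selects the kernel's effective size/shape.
    basis = np.stack([gaussian_basis(size, s) for s in sigmas])
    return np.tensordot(coeffs, basis, axes=1)

coeffs = np.array([0.2, 0.5, 0.3])  # in practice learned by backprop
k = composed_kernel(coeffs)
print(k.shape)  # (7, 7)
```

Because each basis kernel is normalized and the example coefficients sum to one, the composed kernel also sums to one; concentrating mass on the narrow-sigma basis yields an effectively small kernel, while weighting the wide basis yields an effectively large one.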

Cited by 2 publications (1 citation statement)
References 28 publications (43 reference statements)
“…Convolutional neural networks (CNN) have been the state-of-the-art solutions for computer vision tasks for almost a decade. In the last few years, numerous approaches on the advancement of CNNs were proposed: introduction of skip connections He et al (2016); Huang et al (2017), experimentation with model hyperparameters such as kernel size Ganjdanesh et al (2023), normalisation strategies Ioffe and Szegedy (2015) and activation functions Dubey et al (2022); Apicella et al (2021), depthwise convolutions Howard et al (2017), and model’s block architecture Sandler et al (2018).…”
Section: Introduction
confidence: 99%