2020
DOI: 10.48550/arxiv.2009.14385
Preprint

AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via Visual Attention Condensers

Abstract: While significant advances in deep learning have resulted in state-of-the-art performance across a large number of complex visual perception tasks, the widespread deployment of deep neural networks for TinyML applications involving on-device, low-power image recognition remains a major challenge given the complexity of deep neural networks. In this study, we introduce AttendNets, low-precision, highly compact deep neural networks tailored for on-device image recognition. More specifically, AttendNets possess deep…

Cited by 10 publications (14 citation statements). References 27 publications.
“…Third, it can be observed that a majority of the layers of the TB-Net network architecture design are composed of visual attention condensers (Wong et al., 2020b), a variant of the highly efficient attention condenser self-attention mechanisms recently introduced in Wong et al. (2020a). More specifically, visual attention condensers produce condensed embeddings characterizing joint spatial and cross-channel activation relationships and achieve selective attention accordingly, improving representational capability while maintaining very low architectural and computational complexity.…”
Section: Methods (mentioning)
confidence: 99%
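The condensed-embedding mechanism described in this excerpt can be illustrated with a minimal sketch, assuming PyTorch; the class name, layer choices, and parameters below are illustrative assumptions, not the authors' actual implementation (see Wong et al., 2020a, 2020b for that):

```python
# Minimal sketch of a visual attention condenser (VAC), assuming PyTorch.
# Layer choices are illustrative, not the design from Wong et al.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualAttentionCondenser(nn.Module):
    def __init__(self, channels: int, down_ch: int, embed_ch: int):
        super().__init__()
        # Down-mixing: reduce channel dimensionality before embedding.
        self.down_mix = nn.Conv2d(channels, down_ch, kernel_size=1)
        # Condensed embedding: pool spatially, then convolve to capture
        # joint spatial and cross-channel activation relationships.
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.embed = nn.Conv2d(down_ch, embed_ch, kernel_size=3, padding=1)
        # Up-mixing: restore the original channel dimensionality.
        self.up_mix = nn.Conv2d(embed_ch, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.down_mix(x)              # down-mixing: channel reduction
        a = self.embed(self.pool(a))      # condensed embedding
        # Expand the condensed embedding back to the input resolution.
        a = F.interpolate(a, size=x.shape[-2:], mode="nearest")
        attn = torch.sigmoid(self.up_mix(a))
        # Selective attention: scale the input by the learned attention map.
        return x * attn
```

The down-mix/pool/embed path is what condenses the embedding; the sigmoid-gated up-mixing restores the channel width and applies selective attention as an element-wise scaling of the input features.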
“…We take into account several computational and best-practice constraints, formulated via the indicator function 1_g(·): i) the macroarchitecture design uses several parallel columns to significantly reduce the architectural and computational complexity with much greater disentanglement of learned features; ii) to reduce the considerable information loss caused by the pointwise strided convolutions used in residual networks [6] and the RegNet architecture [10], we exclude them from the optimization; iii) antialiasing downsampling (AADS) [24] modules are to be used in the early network stage to improve network stability and robustness; iv) FLOPs must be within 20% of 100M FLOPs for edge compute scenarios. In the machine-driven design exploration process, visual attention condensers (VACs) [20], [21] are used as a highly efficient self-attention module to learn and produce condensed embeddings characterizing the joint local and cross-channel activation relationships. The machine-driven design exploration process then automatically determines the best way to satisfy the defined constraints in designing the CellDefectNet architecture.…”
Section: Methods (mentioning)
confidence: 99%
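As a rough illustration of how the indicator function 1_g(·) in this excerpt could gate candidate designs, here is a minimal Python sketch; the argument names and exact constraint checks are assumptions inferred from the excerpt, not the cited paper's code:

```python
# Hedged sketch of the constraint indicator 1_g(.) from the excerpt above:
# returns 1 only if a candidate design satisfies all stated constraints.
# Argument names and checks are assumptions, not the paper's implementation.
def indicator_g(flops: float,
                has_pointwise_strided_conv: bool,
                aads_in_early_stage: bool) -> int:
    target, tol = 100e6, 0.20  # iv) FLOPs within 20% of 100M
    flops_ok = abs(flops - target) <= tol * target
    return int(flops_ok
               and not has_pointwise_strided_conv   # ii) excluded
               and aads_in_early_stage)             # iii) required

# Example: a 108M-FLOP design with early AADS and no strided pointwise convs.
print(indicator_g(1.08e8, False, True))  # -> 1
```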
“…Early-stage self-attention: VACs are leveraged heavily within the initial modules of the network architecture. The VAC was first introduced by Wong et al. [21] for image classification. VACs help to better model activation relationships and improve selective attention.…”
(mentioning)
confidence: 99%
“…The number in each convolution module represents the number of channels. The numbers in each visual attention condenser represent the number of channels for the down-mixing layer, the embedding structure, and the up-mixing layer, respectively (details can be found in [15]). It can be observed that all Cancer-Net SCa architectures exhibit both great macroarchitecture and microarchitecture design diversity, with certain models exhibiting specific lightweight macroarchitecture design characteristics such as attention condensers and projection-expansion-projection-expansion (PEPE) design patterns comprised of channel dimensionality reduction, depthwise convolutions, and pointwise convolutions. The microarchitecture designs are highly diverse and heterogeneous, with a mix of spatial, pointwise, and depthwise convolutions.…”
Section: Diverse Heterogeneous Designs (mentioning)
confidence: 99%
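To make the three-number convention in this excerpt concrete, here is a hypothetical instantiation, assuming the VisualAttentionCondenser sketch given earlier; the specific channel counts are made up for illustration and do not come from [15]:

```python
# Hypothetical instantiation: the three per-condenser numbers map to the
# down-mixing channels, the embedding channels, and the channel width
# restored by the up-mixing layer. The counts here are illustrative only.
vac = VisualAttentionCondenser(channels=32, down_ch=8, embed_ch=8)
x = torch.randn(1, 32, 56, 56)   # dummy feature map (N, C, H, W)
y = vac(x)                       # attention-scaled features, same shape as x
```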
“…The use of computer vision and machine learning for the diagnosis of pigmented skin lesions has been shown to be accurate and practical [5][6][7][9][10][11][12][13]; it can improve biopsy decision making [10] and act as a pre-screening tool to reduce the amount of time a professional spends on each case. Motivated by the challenge of skin cancer detection, and inspired by the open source and open access efforts of the research community, in this study we introduce Cancer-Net SCa, a suite of deep neural network designs tailored for the detection of skin cancer from dermoscopy images, one of which possesses a self-attention architecture design with attention condensers [14,15]. To construct Cancer-Net SCa, we leveraged a machine-driven design strategy that combines human experience and ingenuity with the meticulousness and raw speed of machines.…”
Section: Introduction (mentioning)
confidence: 99%