2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.195

Xception: Deep Learning with Depthwise Separable Convolutions

Abstract: We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observation leads us to propose a novel deep convolutional neural network architecture inspired by Inception, where Incept…
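
The depthwise separable operation described in the abstract is easy to illustrate. The following is a minimal Keras sketch (illustrative shapes and filter counts, not the authors' code) comparing the parameter count of a regular 3×3 convolution with that of its depthwise separable counterpart.

# Minimal Keras sketch (not the authors' code): contrast a regular 3x3
# convolution with a depthwise separable one at the same input/output width.
import tensorflow as tf

inputs = tf.keras.Input(shape=(32, 32, 64))  # 64 input channels (assumed)

regular = tf.keras.Model(
    inputs, tf.keras.layers.Conv2D(128, 3, padding="same")(inputs))
separable = tf.keras.Model(
    inputs, tf.keras.layers.SeparableConv2D(128, 3, padding="same")(inputs))

# Regular:   3*3*64*128 weights + 128 biases                   = 73,856 params
# Separable: 3*3*64 (depthwise) + 1*1*64*128 (pointwise) + 128 =  8,896 params
print(regular.count_params(), separable.count_params())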

Cited by 12,620 publications (7,525 citation statements, 2017–2023) · References 17 publications
“…This significantly decreases the number of parameters since the fully connected layers include a large number of parameters. Thus, this network is able to learn deeper representations of features with fewer parameters relative to AlexNet while it is much faster than VGG [31]. Figure 2 illustrates a compressed view of InceptionV3 employed in this study.…”
Section: ResNet (mentioning)
confidence: 99%
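
The parameter argument in the statement above can be made concrete with a back-of-envelope count; the shapes below are illustrative assumptions, not taken from the cited study.

# Back-of-envelope count (illustrative shapes, not from the cited study):
# a single dense layer mapping a flattened 7x7x512 feature map to 4096 units
# versus global average pooling, which adds no parameters at all.
fc_params = 7 * 7 * 512 * 4096 + 4096   # weights + biases, roughly 102.8M
gap_params = 0                          # global average pooling is parameter-free
print(f"dense: {fc_params:,}  gap: {gap_params}")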
“…The Xception network is similar to Inception (GoogLeNet), wherein the Inception modules have been substituted with depth-wise separable convolutional layers [31]. Specifically, Xception's architecture is constructed as a linear stack of depth-wise separable convolution layers (36 convolutional layers in total) with linear residual connections (see Figure 4).…”
Section: Xception (mentioning)
confidence: 99%
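
A rough Keras sketch of one such block, consistent with this description but with assumed filter counts and layer ordering (the original paper specifies the exact 36-layer stack), might look as follows.

# Sketch of one Xception-style block (assumed filter counts and ordering;
# see Chollet, 2017 for the exact 36-layer stack): two separable convolutions
# with batch normalization, downsampling, and a linear 1x1 residual shortcut.
import tensorflow as tf
from tensorflow.keras import layers

def entry_flow_block(x, filters):
    residual = layers.Conv2D(filters, 1, strides=2, padding="same")(x)  # linear shortcut

    y = layers.SeparableConv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.SeparableConv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.MaxPooling2D(3, strides=2, padding="same")(y)

    return layers.Add()([y, residual])

inputs = tf.keras.Input(shape=(299, 299, 3))
x = layers.Conv2D(32, 3, strides=2, activation="relu", padding="same")(inputs)
x = entry_flow_block(x, 128)
model = tf.keras.Model(inputs, x)
model.summary()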
“…Alternatively, one N×N convolution can be decomposed into two 1-D convolutions, one 1×N and one N×1 convolution [53]; this basically imposes a restriction that the 2-D filter must be separable, which is a common constraint in image processing [151]. Similarly, a 3-D convolution can be replaced by a set of 2-D convolutions (i.e., applied only on one of the input channels) followed by 1×1 3-D convolutions as demonstrated in Xception [152] and MobileNets [153]. The order of the 2-D convolutions and 1×1 3-D convolutions can be switched.…”
Section: X (mentioning)
confidence: 99%
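
Both factorizations mentioned in this statement can be sketched in a few lines of Keras; the kernel sizes and channel counts below are assumptions for illustration.

# Sketch of the two factorizations described above (shapes are assumptions):
# (a) spatial: a 5x5 kernel restricted to be separable, i.e. a 1x5 conv
#     followed by a 5x1 conv;
# (b) channel: a per-channel (depthwise) 3x3 conv followed by a 1x1 conv
#     across channels, as in Xception / MobileNets.
import tensorflow as tf
from tensorflow.keras import layers

x = tf.keras.Input(shape=(64, 64, 32))

# (a) rank-1 spatial factorization of an NxN filter (here N = 5)
a = layers.Conv2D(32, (1, 5), padding="same")(x)
a = layers.Conv2D(32, (5, 1), padding="same")(a)

# (b) depthwise 2-D convolutions (one filter per input channel), then a
#     1x1 convolution mixing channels; the order of the two can be swapped
b = layers.DepthwiseConv2D(3, padding="same")(x)
b = layers.Conv2D(64, 1)(b)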
“…Based on this premise, Chollet (2016) proposed a convolution performed independently over each channel of an input, followed by a pointwise convolution (i.e. a 1 × 1 convolution) projecting the channels output by the depthwise convolution onto a new channel space.…”
Section: Xception (mentioning)
confidence: 99%
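
A small shape sketch of this depthwise-then-pointwise operation (sizes are assumptions): the depthwise step filters each input channel independently and preserves the channel count, while the pointwise 1×1 step projects the result onto a new channel space.

# Shape sketch (sizes are assumptions): the depthwise step filters each of
# the 16 input channels independently, then the pointwise 1x1 step projects
# those 16 channels onto 48 new ones.
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 56, 56, 16))
dw = layers.DepthwiseConv2D(3, padding="same")(x)  # -> (1, 56, 56, 16)
pw = layers.Conv2D(48, 1)(dw)                      # -> (1, 56, 56, 48)
print(dw.shape, pw.shape)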
“…Xception stands for Extreme Inception and is the name of the architecture proposed by Chollet (2016).…”
Section: Xception (mentioning)
confidence: 99%