2016
DOI: 10.48550/arxiv.1606.02147
Preprint

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

Abstract: The ability to perform pixel-wise semantic segmentation in real-time is of paramount importance in mobile applications. Recent deep neural networks aimed at this task have the disadvantage of requiring a large number of floating point operations and have long run-times that hinder their usability. In this paper, we propose a novel deep neural network architecture named ENet (efficient neural network), created specifically for tasks requiring low latency operation. ENet is up to 18× faster, requires 75× less FL…

Cited by 467 publications (771 citation statements)
References 30 publications
“…For the details of our bottleneck, see Section 3.2. Inspired by ENet [37], standard SE-ResNeXt blocks and SE-ResNeXt blocks with dilated convolution are connected in series to form our backbone. See Table 1 for the detailed architecture of the network.…”
Section: SE-ResNeXt Block
Citation type: mentioning
confidence: 99%
“…Image semantic segmentation is an active research field that has seen significant progress since the pioneering work applying fully convolutional networks for the task [6]. Subsequent methods have focused on high quality [7] [8] [9] [10] [11] [12] and/or efficient [13] [14] [15] [16] design choices. More recently, the design of models for video semantic segmentation has received increasing attention.…”
Section: A. Image and Video Semantic Segmentation
Citation type: mentioning
confidence: 99%
“…The last layer of the decoder is the softmax layer, which is used to classify pixels. The decoder of RailNet has trained to output binary segmentation maps, indicating which pixels belong to a rail line or not [23].…”
Section: Decoder
Citation type: mentioning
confidence: 99%
“…As mentioned in the previous section, the RailNet outputs a set of pixels for the rail lines. It is not ideal to fit polynomials by these pixels in the original image space, so people have to resort to higher-order polynomials to deal with curved rail lines [23]. A generally accepted solution to this problem is to project the image into a "bird's eye" representation, where the rail lines are parallel to each other, so curved rail lines can be fitted with second to third-order polynomials.…”
Section: The Rail Line Fitting Algorithm Based On Sliding Window Dete...
Citation type: mentioning
confidence: 99%