2020
DOI: 10.1007/978-3-030-58539-6_11

Object-Contextual Representations for Semantic Segmentation

Cited by 794 publications (567 citation statements)
References 63 publications
“…DenseASPP [34] combines a dense skip connection with ASPP, which effectively enlarges the receptive field size of the network. Recently, inspired by the success of the attention mechanism in natural language processing, the self-attention mechanism has also been applied to aggregate the dense pixel-wise context [18, 35, 36, 37]. The major drawback of self-attention is that it has excessive computation and memory consumption.…”
Section: Aggregation of the Multi-scale Context (mentioning)
confidence: 99%
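
The cost mentioned in this statement can be made concrete with a little arithmetic. The sketch below is only an illustration: it counts the entries of a single affinity map for dense pixel-to-pixel self-attention and, for contrast, for a pixel-to-region relation with K object regions. The function name `attention_entries`, the 128 x 256 feature map (a 1024 x 2048 input at an assumed output stride of 8), and the choice of K = 19 are assumptions made for the example, not values taken from the cited text.

```python
def attention_entries(height, width, num_regions=None):
    """Count affinity entries for one attention map over an H x W feature map.

    With num_regions=None every pixel attends to every pixel (N * N entries).
    Otherwise every pixel attends to K regions (N * K entries).
    """
    num_pixels = height * width
    if num_regions is None:
        return num_pixels * num_pixels
    return num_pixels * num_regions


# Assumed sizes, for illustration only.
h, w = 128, 256
dense = attention_entries(h, w)                     # pixel-to-pixel
regional = attention_entries(h, w, num_regions=19)  # pixel-to-region

print("pixel-to-pixel entries :", dense)
print("pixel-to-region entries:", regional)
print("ratio                  :", dense / regional)
```

The dense map grows quadratically with the number of pixels, while the pixel-to-region map grows linearly, which is the gap the statement alludes to.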
“…Evidently, this way increases the complexity and the parameter amount of the model. The others focus on exploiting the online hard example mining strategy [29,37] and perceptual loss [41], which both require careful re-training or fine-tuning of the hyperparameters.…”
Section: Boundary Refinement (mentioning)
confidence: 99%
“…The space attention module can aggregate different features at different positions, and the channel attention module can integrate the correlation features between different channels to yield more precise results. OCNet [34] employs the strategy of aggregating contextual information using ground truth values to supervise the learning of target areas, uses the corresponding object context representation to characterize the pixels, and then calculates the relationship between each pixel and each target area via object-contextual representation to extend the representation of each pixel. CCNet [35] proposed a new crisscross attention model to obtain the context information of nearby pixels.…”
Section: Related Work (mentioning)
confidence: 99%
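
The pipeline this statement describes (soft target areas, one representation per area, a pixel-to-area relation, and an extended pixel representation) can be sketched roughly as follows. This is a simplified NumPy illustration, not the cited authors' implementation: it omits all learned transformations and the ground-truth supervision of the coarse regions, and the names `softmax` and `object_contextual_representation` as well as the flattened `(N, C)` and `(N, K)` input layouts are assumptions made for the example.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def object_contextual_representation(pixel_feats, region_logits):
    """Simplified sketch without learned layers.

    pixel_feats:   (N, C) features for N pixels.
    region_logits: (N, K) coarse per-pixel scores for K object regions.
    Returns an (N, 2C) array: each pixel feature concatenated with its context.
    """
    # 1. Soft object regions: normalize each region's scores over all pixels.
    region_weights = softmax(region_logits, axis=0)            # (N, K)
    # 2. Region representations: weighted sum of pixel features per region.
    region_feats = region_weights.T @ pixel_feats              # (K, C)
    # 3. Pixel-region relation: similarity normalized over the K regions.
    relation = softmax(pixel_feats @ region_feats.T, axis=1)   # (N, K)
    # 4. Context per pixel: relation-weighted sum of region representations.
    context = relation @ region_feats                          # (N, C)
    # 5. Extended representation: original feature plus its context.
    return np.concatenate([pixel_feats, context], axis=1)      # (N, 2C)


# Tiny usage example with random inputs (12 pixels, 4 channels, 3 regions).
rng = np.random.default_rng(0)
feats = rng.normal(size=(12, 4))
logits = rng.normal(size=(12, 3))
augmented = object_contextual_representation(feats, logits)
print(augmented.shape)
```

In a trained model, steps 2 to 5 would normally pass through learned projections, and the coarse region scores would come from a supervised segmentation head; the sketch keeps only the data flow.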
“…This holds true for semantic segmentation (i.e., the pixel-wise labeling of input images). CNN-based algorithms are the top performing solutions for the PASCAL VOC 2012 [12, 13, 14] dataset, cityscapes [15, 16, 17], and ADE20K [10, 18, 19]. There have also been multiple proposals for using CNNs to analyze endoscopic camera images, predominantly in the medical field [20, 21, 22].…”
Section: Introduction (mentioning)
confidence: 99%