2020
DOI: 10.1007/978-3-030-58520-4_4

Tensor Low-Rank Reconstruction for Semantic Segmentation

Abstract: Context information plays an indispensable role in the success of semantic segmentation. Recently, non-local self-attention based methods have proved effective for collecting context information. Since the desired context consists of spatial-wise and channel-wise attention, a 3D representation is an appropriate formulation. However, these non-local methods describe 3D context information with a 2D similarity matrix, where the spatial compression may cause channel-wise attention to be lost. An alternative is t…
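To make the abstract's idea concrete, here is a minimal sketch of reconstructing a 3D (C × H × W) attention tensor as a sum of learned rank-1 terms, in the spirit of CP decomposition; the module name, the pooling-based factor generation, and the default rank are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class LowRankTensorAttention(nn.Module):
    """Sketch: reconstruct a 3D attention tensor as a sum of `rank`
    rank-1 terms, each an outer product of learned channel-, height-,
    and width-wise attention vectors, avoiding the 2D (HW x HW)
    similarity matrix of non-local methods."""

    def __init__(self, rank: int = 8):
        super().__init__()
        # one tiny 1D conv per rank-1 term and per tensor mode;
        # Conv1d is size-agnostic, so C, H, W need not be fixed here
        self.c_convs = nn.ModuleList(nn.Conv1d(1, 1, 3, padding=1) for _ in range(rank))
        self.h_convs = nn.ModuleList(nn.Conv1d(1, 1, 3, padding=1) for _ in range(rank))
        self.w_convs = nn.ModuleList(nn.Conv1d(1, 1, 3, padding=1) for _ in range(rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        pc = x.mean(dim=(2, 3)).unsqueeze(1)  # (B, 1, C) channel profile
        ph = x.mean(dim=(1, 3)).unsqueeze(1)  # (B, 1, H) height profile
        pw = x.mean(dim=(1, 2)).unsqueeze(1)  # (B, 1, W) width profile
        attn = x.new_zeros(b, c, h, w)
        for fc, fh, fw in zip(self.c_convs, self.h_convs, self.w_convs):
            ac = torch.sigmoid(fc(pc)).squeeze(1)  # (B, C)
            ah = torch.sigmoid(fh(ph)).squeeze(1)  # (B, H)
            aw = torch.sigmoid(fw(pw)).squeeze(1)  # (B, W)
            attn = attn + torch.einsum('bc,bh,bw->bchw', ac, ah, aw)
        return attn * x  # re-weight features with the reconstructed context
```

Summing several rank-1 outer products raises the rank of the attention tensor without ever materializing pairwise spatial affinities, which is the trade-off the abstract describes.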

Cited by 60 publications (33 citation statements). References 51 publications.

“…The Feature Pyramid Transformer (FPT) [11] enables fully active feature interaction, extending the receptive field through task-specific Transformers. Chen et al. [24] developed a tensor generation module to capture contextual data, offering a new way of modeling 3D context representations. Mou et al. [25] introduced a spatial relation module and a channel relation module to learn and infer the global relationship between any two spatial positions or feature maps, and then build a feature representation with enhanced relational information.…”
Section: B. Contextual Information (mentioning)
Confidence: 99%
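The two relation modules the quote attributes to Mou et al. [25] correspond to standard non-local affinities; a minimal sketch, where the function names and the softmax normalization are my assumptions:

```python
import torch

def spatial_relation(x: torch.Tensor) -> torch.Tensor:
    """Affinity between any two spatial positions: (B, HW, HW)."""
    b, c, h, w = x.shape
    f = x.reshape(b, c, h * w)                        # (B, C, HW)
    return torch.softmax(f.transpose(1, 2) @ f, -1)   # (B, HW, HW)

def channel_relation(x: torch.Tensor) -> torch.Tensor:
    """Affinity between any two feature maps (channels): (B, C, C)."""
    b, c, h, w = x.shape
    f = x.reshape(b, c, h * w)                        # (B, C, HW)
    return torch.softmax(f @ f.transpose(1, 2), -1)   # (B, C, C)
```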
“…For testing, we used the ground truth in which object boundaries have not been eroded by a 3-pixel radius. Following the official split, 17 patches were used as the test set (image ids: 1, 3, 5, 7, 11, 13, 15, 17, 20, 21, 23, 26, 28, 30, 32, 34, 37) and the other 16 as the training set (image ids: 2, 4, 6, 8, 10, 12, 14, 16, 20, 22, 24, 27, 29, 31, 33, 35, 38). Each large image was cut into 256×256 slices.…”
Section: Dataset Description and Design of Experiments (mentioning)
Confidence: 99%
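A minimal sketch of the slicing step the quote mentions; the function name and the handling of border remainders are assumptions, since the quote specifies neither padding nor overlap:

```python
import numpy as np

def slice_image(img: np.ndarray, size: int = 256) -> list:
    """Cut a large (H, W, C) tile into non-overlapping size x size patches,
    discarding any border remainder smaller than `size`."""
    h, w = img.shape[:2]
    return [img[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]
```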
“…However, a high-rank tensor representation incurs a huge computational cost. Inspired by [19], we design a TLRR block to deal with this problem. The block consists of two 1 × 1 Conv layers, a low-rank tensor generation module (TGM), and a high-rank tensor reconstruction module (TRM), as shown in Fig.…”
Section: Tensor Low-Rank Reconstruction Block (mentioning)
Confidence: 99%
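A rough, hypothetical wiring of the described block; TGM and TRM are left as placeholder modules, since only the quoted structure (two 1 × 1 Conv layers around the two modules) is known:

```python
import torch
import torch.nn as nn

class TLRRBlock(nn.Module):
    """Sketch of the quoted structure: 1x1 conv -> low-rank tensor
    generation (TGM) -> high-rank tensor reconstruction (TRM) -> 1x1 conv."""

    def __init__(self, in_ch: int, mid_ch: int, tgm: nn.Module, trm: nn.Module):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, mid_ch, kernel_size=1)
        self.tgm = tgm      # generates low-rank (rank-1) tensor factors
        self.trm = trm      # reconstructs them into a high-rank tensor
        self.expand = nn.Conv2d(mid_ch, in_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.reduce(x)
        y = self.trm(self.tgm(y))
        return self.expand(y)
```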
“…It can be observed that these methods seldom consider spatial and spectral information simultaneously during their attention-feature extraction phase. Inspired by [19], we propose a novel cooperative spatial-spectral attention network using tensor low-rank reconstruction, which models a 3D attention map tensor that captures the long-range dependencies of both spatial land covers and spectral bands. In addition, spatial-spectral attention features are reconstructed to help the network enhance discriminative spatial-spectral features.…”
Section: Introduction (mentioning)
Confidence: 99%
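To make the uncompressed 3D attention idea concrete, here is a toy rank-1 construction of a joint spatial-spectral attention tensor; this is my own illustration, not the cited network's module:

```python
import torch

def spatial_spectral_attention(x: torch.Tensor) -> torch.Tensor:
    """Combine a spectral attention vector (over bands/channels) and a
    spatial attention map by outer product into a full (B, C, H, W)
    attention tensor, so neither dimension is compressed away."""
    spectral = torch.sigmoid(x.mean(dim=(2, 3)))             # (B, C)
    spatial = torch.sigmoid(x.mean(dim=1))                   # (B, H, W)
    attn = torch.einsum('bc,bhw->bchw', spectral, spatial)   # (B, C, H, W)
    return attn * x
```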
“…These visualization results further demonstrate that our proposed module can capture and encode the spatial similarities into the channel attention map to achieve full attention.…”

Method                              Backbone        mIoU (%)
(Zhang et al. 2018)                 ResNet-101      85.9
DFN (Yu et al. 2018)                ResNet-101      86.2
CFNet (Zhang, Wang, and Xie 2019)   ResNet-101      87.2
EMANet                              ResNet-101      87.7
DeeplabV3+ (Chen et al. 2018)       Xception+JFT    89.0
RecoNet (Chen et al. 2020)          ResNet-101      88.5
FLANet (Ours) ‡                     —               —

Section: Visualization of Attention Module (mentioning)
Confidence: 99%