2019
DOI: 10.1109/access.2019.2919802

X-Net: A Binocular Summation Network for Foreground Segmentation

Abstract: In foreground segmentation, it is challenging to construct an effective background model to learn the spatial-temporal representation of the background. Recently, deep learning-based background models (DBMs) with the capability of extracting high-level features have shown remarkable performance. However, the existing state-of-the-art DBMs deal with video segmentation as single-image segmentation and ignore temporal cues in video sequences. To exploit temporal data sufficiently, this paper proposes a multi-inpu…
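The title and the truncated abstract point to a two-input ("binocular") network that fuses per-frame features by summation to exploit temporal cues. The following is a minimal PyTorch sketch of that general idea only; it is not the authors' X-Net, and the shared encoder, the layer sizes, and the element-wise summation fusion are illustrative assumptions.

```python
# Minimal sketch (not the authors' X-Net): a shared encoder applied to two inputs
# (e.g., the current frame and a reference/background frame), whose feature maps
# are summed before a small decoder predicts a per-pixel foreground mask.
import torch
import torch.nn as nn

class TwoInputSummationNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder: both inputs pass through the same weights.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Decoder: upsample the fused features back to full resolution.
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, frame, reference):
        fused = self.encoder(frame) + self.encoder(reference)  # element-wise summation fusion
        return torch.sigmoid(self.decoder(fused))

mask = TwoInputSummationNet()(torch.rand(1, 3, 240, 320), torch.rand(1, 3, 240, 320))
print(mask.shape)  # torch.Size([1, 1, 240, 320])
```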

Cited by 13 publications (23 citation statements) · References 41 publications
“…1) Pretraining and Finetuning: To take advantage of the foundational CNN architectures trained over large-scale image datasets, several studies have proposed the use of pretrained blocks or layers to enhance the representation capability of the CD models. The feature learning capability of off-the-shelf CNN models such as VGG16, ResNet50, GoogleNet, DeepLab and ResNet18 has been successfully adapted for change detection in [69], [71], [74], [81], [83], [84], [89]-[91], [136]-[142]. Chen et al. [69] designed an attention ConvLSTM to model pixel-wise changes over time.…”
Section: B. Deep Learning Based Methods
confidence: 99%
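The statement above mentions an attention ConvLSTM for modeling pixel-wise changes over time. Below is a minimal PyTorch sketch of a plain ConvLSTM cell (no attention), included only to illustrate how convolutional gates preserve the spatial layout of the recurrent state; it is a generic formulation, not the specific model of [69].

```python
# Generic ConvLSTM cell: the gates are computed with convolutions, so the hidden
# state h and cell state c remain spatial feature maps rather than flat vectors.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        # One convolution produces all four gates (input, forget, output, candidate) at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)   # update cell state
        h = o * torch.tanh(c)           # emit new hidden state
        return h, c

cell = ConvLSTMCell(64, 32)
h = c = torch.zeros(1, 32, 56, 56)
for t in range(5):                      # iterate over per-frame feature maps
    h, c = cell(torch.rand(1, 64, 56, 56), (h, c))
print(h.shape)  # torch.Size([1, 32, 56, 56])
```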
“…Thus, it contains both the category-level semantics and fine-grained details. Similarly, Zhang et al. [136] augment a U-Net-shaped architecture to detect pixel-level change. The pretrained weights of VGG16 have been most widely used due to its simplicity and the flexibility to alter the intermediate layers to enhance the encoder-decoder for change detection [71], [83], [84], [89]-[91], [136], [137], [139], [140].…”
Section: B. Deep Learning Based Methods
confidence: 99%
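To make the reuse of pretrained VGG16 weights concrete, here is a minimal PyTorch/torchvision sketch of a U-Net-shaped change-detection head built on the first two VGG16 blocks. The feature-difference fusion, the decoder, and the block split are illustrative assumptions rather than any specific architecture from the cited works; `VGG16_Weights` assumes torchvision ≥ 0.13.

```python
# Reusing ImageNet-pretrained VGG16 layers as a Siamese encoder for change detection,
# with a U-Net-style skip connection from the shallow features to the decoder.
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class VGGUNetCD(nn.Module):
    def __init__(self):
        super().__init__()
        feats = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features
        self.enc1 = feats[:4]    # two 3x3 convs -> 64 channels, full resolution
        self.enc2 = feats[4:9]   # pool + two convs -> 128 channels, 1/2 resolution
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec = nn.Sequential(nn.Conv2d(128 + 64, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.out = nn.Conv2d(64, 1, 1)

    def forward(self, t0, t1):
        a1, b1 = self.enc1(t0), self.enc1(t1)
        d1 = torch.abs(a1 - b1)                             # shallow feature difference
        d2 = torch.abs(self.enc2(a1) - self.enc2(b1))       # deeper feature difference
        x = self.dec(torch.cat([self.up(d2), d1], dim=1))   # U-Net-style skip connection
        return torch.sigmoid(self.out(x))                   # per-pixel change probability

model = VGGUNetCD()
print(model(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224)).shape)  # [1, 1, 224, 224]
```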