ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053690

Counting Dense Objects in Remote Sensing Images

Abstract: Accurately estimating the number of objects of interest in a given image is a challenging yet important task. Significant efforts have been made to address this problem and great progress has been achieved, yet counting ground objects in remote sensing images has barely been studied. In this paper, we are interested in counting dense objects in remote sensing images. Compared with object counting in natural scenes, this task is challenging because of the following factors: large scale variation, complex cluttered background and o…

Citations: cited by 19 publications (19 citation statements)
References: 28 publications
“…The truncated VGG-16 is composed of convolutional layers with a fixed kernel size of 3 × 3 that extract discriminative features from the input image for further analysis by the network. We use the VGG-16 backbone in our network mainly because of its good generalization ability to other vision tasks such as counting and object detection (Liu et al, 2016; Gao et al, 2020b; Liu et al, 2019; Li et al, 2018). The truncated VGG-16 network includes all layers of the VGG-16 network except the last max-pooling layer and all fully connected layers.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…Euclidean loss measures estimation error at pixel level and has been used in other crowd counting studies (Boominathan et al, 2016; Gao et al, 2020b; Lian et al, 2019).…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…Our proposed network uses VGG-16 (Simonyan and Zisserman, 2014) as a backbone for feature extraction. Originally proposed for image classification, the VGG-16 network stacks convolutional layers with a fixed kernel size of 3 × 3, which usually generalizes well to other vision tasks including object counting and detection (Shi et al, 2018; Boominathan et al, 2016; Gao et al, 2020b; Sang et al, 2019; Valloli and Mehta, 2019; Liu et al, 2016; Kumar et al, 2019). We exclude the last max-pooling layer and all fully connected layers from the VGG network.…”
Section: Methods (citation type: mentioning)
confidence: 99%
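
The truncation described in the statement above can be illustrated with a small dependency-free sketch. It is not the citing authors' code: the layer list is the standard VGG-16 configuration, while the names `VGG16_FEATURES`, `truncate_vgg16`, and `summarize`, and the 512 × 512 input size, are assumptions chosen for illustration. It only tracks layer counts and feature-map sizes, not weights.

```python
# Standard VGG-16 feature configuration: integers are the output channels of
# 3x3 convolutions, "M" marks a 2x2 max-pooling layer with stride 2.
# The three fully connected layers are not part of this list, so they are
# excluded simply by never being added.
VGG16_FEATURES = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                  512, 512, 512, "M", 512, 512, 512, "M"]


def truncate_vgg16(features=VGG16_FEATURES):
    """Return the feature configuration without its last max-pooling layer."""
    # Index of the last "M" entry, found by searching the reversed list.
    last_pool = len(features) - 1 - features[::-1].index("M")
    return features[:last_pool] + features[last_pool + 1:]


def summarize(layers, height, width):
    """Count layers and estimate the output feature-map size.

    Assumes the input height and width are divisible by the output stride.
    """
    convs = [layer for layer in layers if layer != "M"]
    pools = layers.count("M")
    stride = 2 ** pools  # each pooling layer halves the resolution
    return {
        "conv_layers": len(convs),
        "pool_layers": pools,
        "output_stride": stride,
        "channels": convs[-1],
        "feature_map": (height // stride, width // stride),
    }


if __name__ == "__main__":
    print("full:     ", summarize(VGG16_FEATURES, 512, 512))
    print("truncated:", summarize(truncate_vgg16(), 512, 512))
```

Dropping the last pooling layer keeps all 13 convolutional layers but halves the output stride, so the backbone returns a feature map at 1/16 of the input resolution instead of 1/32. That higher resolution is one plausible reason density-map counting networks prefer this truncation.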
“…We employ the Euclidean loss as shown in Equation 1 to train the network. The Euclidean loss is a popular loss in the crowd counting literature because it enhances the quality of the estimated density map (Gao et al, 2020b; Boominathan et al, 2016; Shi et al, 2018; Wang et al, 2019); where F(X_i, Θ), Θ, X_i, D_i, and N denote the predicted density map of the i-th input image, the parameters of the network, the i-th input image, the i-th ground truth density map, and the number of images, respectively. Euclidean loss measures the distance between the estimated density map and the ground truth.…”
Section: Methods (citation type: mentioning)
confidence: 99%
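
The statement above refers to an Equation 1 that is not reproduced in this excerpt. A minimal sketch of a pixel-level Euclidean loss is given here under the assumption that it takes the form commonly used in the crowd counting literature, L(Θ) = 1/(2N) · Σ_i ‖F(X_i, Θ) − D_i‖², where the 1/(2N) normalization is an assumption rather than something the excerpt confirms. The function names `euclidean_loss` and `object_count` are hypothetical, and density maps are represented as plain nested lists to keep the example dependency-free.

```python
def euclidean_loss(predicted, ground_truth):
    """Pixel-level Euclidean loss over a batch of density maps.

    predicted, ground_truth: lists of N density maps, each a 2D list of floats
    with matching shapes. Uses the assumed 1/(2N) normalization.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need the same non-zero number of maps in both lists")
    total = 0.0
    for pred_map, gt_map in zip(predicted, ground_truth):
        for pred_row, gt_row in zip(pred_map, gt_map):
            # Sum of squared per-pixel differences for this row.
            total += sum((p - g) ** 2 for p, g in zip(pred_row, gt_row))
    return total / (2 * len(predicted))


def object_count(density_map):
    """Estimated object count: the sum of all density-map values."""
    return sum(sum(row) for row in density_map)


if __name__ == "__main__":
    predicted = [[[1.0, 0.0], [0.0, 1.0]]]
    ground_truth = [[[0.0, 0.0], [0.0, 1.0]]]
    print(euclidean_loss(predicted, ground_truth))
    print(object_count(ground_truth[0]))
```

Because the loss compares maps pixel by pixel, two predictions with the same total count can still incur different losses if the density is placed in different locations.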