2020
DOI: 10.1109/tip.2020.2975718
A Multiple-Instance Densely-Connected ConvNet for Aerial Scene Classification

Cited by 116 publications (96 citation statements)
References: 59 publications
“…Training-to-test set ratio:        50%           80%

PLSA(SIFT) [10]                      67.55±1.11    71.38±1.77
BoVW(SIFT) [10]                      73.48±1.39    75.52±2.13
AlexNet [10]                         93.98±0.67    95.02±0.81
VGGNet-16 [10]                       94.14±0.69    95.21±1.20
GoogLeNet [10]                       92.70±0.60    94.31±0.89
CaffeNet [10]                        93.98±0.67    95.02±0.81
TEX-Net with VGG [41]                94.22±0.50    95.31±0.69
D-CNN with AlexNet [13]              --            96.67±0.10
Fine-tuned GoogLeNet [37]            --            97.1
Two-Stream Fusion [26]               96.97±0.75    98.02±1.03
SPP with AlexNet [19]                94.77±0.46    96.67±0.94
Gated attention [64]                 94.64±0.43    96.12±0.42
CCP-net [65]                         --            97.52±0.97
Fusion by addition [20]              --            97.42±1.79
DSFATN [61]                          --            98.25
Deep CNN Transfer [22]               --            98.49
MIDC-Net [66]                        95.41±0.40    97.40±0.48
DFAGCN [44]                          --            98.48±0.42
Inception-v3-CapsNet [34]            97.59±0.16    99.05±0.24
Backbone (Xception) [47]             92.76±0.

2) The improvements of the CSDS model are more prominent for the large (80%) than for the small (50%) training-to-test set ratio.…”
Section: Methods
confidence: 99%
“…A popular trend among deep learning algorithms for single-scene classification is to take a CNN as the backbone and introduce well-designed modules that further enhance feature efficiency. For instance, Bi et al. [31] proposed learning multiple instances from feature maps extracted by a densely connected CNN and integrating them into bag-level features for single-scene classification. Li et al. [49] proposed a key-region capturing method that learns class-specific features while retaining global information for inferring scene labels.…”
Section: A. Aerial Single-Scene Classification
confidence: 99%
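The instance-to-bag aggregation that the excerpt above attributes to Bi et al. [31] can be illustrated with a minimal numpy sketch: each spatial position of a backbone feature map is treated as one "instance", a per-instance classifier scores it, and the instance scores are pooled into a single bag-level (image-level) prediction. This is an assumption-laden sketch, not the authors' implementation; the shapes, the random weights, and the use of a plain matrix product in place of a 1x1 convolution are all illustrative.

```python
import numpy as np

# Hypothetical MIL sketch (not the paper's code): spatial positions of a
# CNN feature map act as instances; their class scores are pooled into a
# bag-level prediction for the whole image.
rng = np.random.default_rng(0)

n_classes = 5
feat = rng.normal(size=(7, 7, 64))    # illustrative H x W x D feature map
w = rng.normal(size=(64, n_classes))  # stand-in for a 1x1-conv classifier

instances = feat.reshape(-1, 64)      # 49 instances, one per location
scores = instances @ w                # instance-level class scores, (49, 5)

# Bag-level aggregation: mean pooling over instances (max pooling is the
# other classic MIL pooling choice).
bag_logits = scores.mean(axis=0)
probs = np.exp(bag_logits - bag_logits.max())
probs /= probs.sum()                  # softmax over bag-level logits
pred = int(np.argmax(probs))
```

Mean pooling makes every local region contribute to the scene label, whereas max pooling would let a single highly discriminative region dominate; which is preferable depends on how localized the scene semantics are.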
“…Other approaches based on recurrent neural networks (RNNs) [23], generative adversarial networks (GANs) [24,25], graph convolutional networks (GCNs) [26], and long short-term memory (LSTM) networks [27] have also been introduced. In a recent contribution, the authors treated remote-sensing scene classification as a multiple-instance learning (MIL) problem [28]. They proposed a multiple-instance densely connected network to highlight the local semantics relevant to the scene label.…”
Section: Introduction
confidence: 99%