2021
DOI: 10.1109/tgrs.2020.3016820

More Diverse Means Better: Multimodal Deep Learning Meets Remote-Sensing Imagery Classification

Cited by 942 publications (380 citation statements)
References 54 publications
“…In recent years, with the rapid development of remote sensing technology, remote sensing images' spatial resolution has gradually increased, which makes the demand for automatic interpretation technology of remote sensing images increasing. Land cover classification [37][38][39][40][41][42], scene-level geographic image classification [36], and geospatial object detection and recognition [34,35] are basic and challenging research contents in the automatic interpretation of remote sensing images and have caused a wide range of attention and research.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…8(A), we produced the gradient field and K-means clustering images as two new modalities for extracting shallow features. There are many methods to fuse multimodal inputs, and concatenate-based fusion is an intuitive fusion method [32], but this method is more suitable for situations where each modality is equally important for classification. Extracting features first and then concatenating is also a very popular fusion method [33], while the number of parameters and GPU memory limit its application.…”
Section: Multimodal Generation and Fusion
Citation type: mentioning
confidence: 99%
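
The two fusion strategies contrasted in the statement above can be illustrated with a minimal sketch. It is not taken from the cited or the citing paper: the function names, the per-channel-mean "extractor", and the random toy data are all hypothetical stand-ins, chosen only to show where the concatenation happens in each case.

```python
import random


def early_fusion(modalities):
    """Input-level (concatenate-based) fusion: join the per-pixel channel
    vectors of all modalities, so every modality enters the model on equal terms."""
    height, width = len(modalities[0]), len(modalities[0][0])
    return [
        [sum((m[r][c] for m in modalities), []) for c in range(width)]
        for r in range(height)
    ]


def toy_extractor(modality):
    """Hypothetical stand-in for a learned feature branch: the mean of each
    channel over all pixels. A real branch would be a trained network."""
    pixels = [px for row in modality for px in row]
    channels = len(pixels[0])
    return [sum(px[k] for px in pixels) / len(pixels) for k in range(channels)]


def late_fusion(modalities):
    """Feature-level fusion: run one extractor per modality, then concatenate
    the resulting feature vectors. One branch per modality is what drives up
    the parameter count and memory use in practice."""
    fused = []
    for modality in modalities:
        fused.extend(toy_extractor(modality))
    return fused


def random_modality(height, width, channels, rng):
    """Toy H x W x C modality filled with random values (assumed data)."""
    return [
        [[rng.random() for _ in range(channels)] for _ in range(width)]
        for _ in range(height)
    ]


rng = random.Random(0)
image = random_modality(4, 4, 3, rng)     # e.g. an RGB patch
gradient = random_modality(4, 4, 1, rng)  # e.g. a gradient-field map
clusters = random_modality(4, 4, 1, rng)  # e.g. a K-means label map

stacked = early_fusion([image, gradient, clusters])
fused = late_fusion([image, gradient, clusters])
print(len(stacked), len(stacked[0]), len(stacked[0][0]))  # 4 4 5
print(len(fused))  # 5
```

In the early-fusion case a single model sees a 5-channel input; in the late-fusion case each modality needs its own branch before the features are joined, which is the cost the statement attributes to that approach.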
“…Different from shallow machine learning methods, the deep network structure of deep learning has better feature mining performance. Therefore, many researchers have applied CNN to HSIC and demonstrated that CNN can show exceedingly promising performance [24][25][26][27][28][29][30][31][32][33][34][35][36]. For example, in [24], the spatial and spectral features of the original HSI are simultaneously extracted by 3D-CNN.…”
Section: Introduction
Citation type: mentioning
confidence: 99%