2022
DOI: 10.1016/j.jag.2022.102926

Deep learning in multimodal remote sensing data fusion: A comprehensive review

Cited by 144 publications (69 citation statements)
References 197 publications
“…Furthermore, the integration of machine and deep learning algorithms specifically designed for remote sensing applications, such as convolutional neural networks and vision transformers [34,35], can help enhance the performance and capabilities of visual models. These methods can improve the VLM's ability to recognize and analyze complex patterns, structures, and features in remote sensing images, leading to more accurate and reliable results.…”
Section: Discussion: Improving Visual Language Models For Remote Sens...
confidence: 99%
“…The pixel-wise classification of HSI, an important problem in HSI processing technology, has attracted considerable interest and has been studied by many scholars in recent years [9,10]. The purpose of pixel-wise classification is to assign a unique category label to each pixel of the HSI dataset.…”
Section: Introduction
confidence: 99%
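The quoted statement above describes pixel-wise HSI classification as assigning one category label to every pixel of a hyperspectral cube. A minimal sketch of that idea (all names and the nearest-class-mean rule are illustrative assumptions, standing in for a trained classifier) unrolls the H x W x B cube into per-pixel spectra and labels each spectrum independently:

```python
import numpy as np

def pixelwise_classify(cube, class_means):
    """Assign each pixel of an H x W x B hyperspectral cube the label
    of the nearest class-mean spectrum (illustrative classifier)."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b)  # (H*W, B): one spectrum per pixel
    # Euclidean distance from every pixel spectrum to every class mean
    d = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    return d.argmin(axis=1).reshape(h, w)  # one label per pixel

# Toy cube: left half resembles class 0, right half resembles class 1
means = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
cube = np.zeros((4, 6, 3))
cube[:, :3] = means[0] + 0.05
cube[:, 3:] = means[1] - 0.05
labels = pixelwise_classify(cube, means)  # label map of shape (4, 6)
```

The output is a 2-D label map with the same spatial extent as the input cube, which is exactly the "unique category label per pixel" the excerpt refers to.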
“…There has been a trend towards incorporating multiple modalities in aerial object classification to improve performance and robustness [11-14]. This includes different types of imagery, such as RGB, infrared, hyperspectral, multispectral, synthetic aperture radar (SAR), and light detection and ranging, as well as other data sources such as terrain maps and building footprints. Multimodal approaches have shown promising results, as they leverage complementary information from different sources to provide a more complete and accurate representation of the scene.…”
Section: Introduction
confidence: 99%
“…Multimodal approaches have shown promising results, as they leverage complementary information from different sources to provide a more complete and accurate representation of the scene. 11,15 However, collecting data from multiple modalities can be challenging and expensive due to the requirement of specialized equipment, atmospheric conditions, limitation of individual modalities to probe a scene, data integration from modalities with different spatial and spectral resolutions, and annotation challenges for obtaining ground truth. 16,17 Researchers often encounter the issue of limited paired multimodal data for training an end-to-end multimodal fusion network, 16 where paired multimodal data samples simultaneously agree with the following conditions: 1) spatial and temporal correspondence and 2) synchronous occurrence/availability.…”
Section: Introduction
confidence: 99%
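The excerpt above notes that end-to-end multimodal fusion requires paired samples satisfying two conditions: spatial/temporal correspondence and synchronous availability. A minimal late-fusion sketch (function and variable names are illustrative assumptions, not from the review) makes the pairing requirement explicit before concatenating per-sample features from two modalities:

```python
import numpy as np

def fuse_paired(optical_feats, sar_feats):
    """Late fusion of paired per-sample feature vectors from two
    modalities; each row i of both arrays is assumed to describe the
    same scene at the same time (the pairing conditions quoted above)."""
    if optical_feats.shape[0] != sar_feats.shape[0]:
        raise ValueError("unpaired batch: sample counts differ")
    # simple late fusion: concatenate the per-sample feature vectors
    return np.concatenate([optical_feats, sar_feats], axis=1)

opt = np.random.rand(8, 16)    # 8 paired samples, 16-D optical features
sar = np.random.rand(8, 4)     # matching 4-D SAR features
fused = fuse_paired(opt, sar)  # shape (8, 20)
```

Concatenation is only one fusion strategy; the point of the sketch is that every row must be a genuinely paired observation, which is why the limited availability of such pairs constrains end-to-end fusion networks.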