2022
DOI: 10.1109/lgrs.2022.3192062
TypeFormer: Multiscale Transformer With Type Controller for Remote Sensing Image Caption

Cited by 16 publications (8 citation statements)
References 19 publications
“…Firstly, we conduct the performance experiments of the accounting feature extraction method based on multimodal information embedding on the Synthetic Financial Dataset. We selected CNN (Kattenborn et al., 2021), LSTM (Yu et al., 2019), Transformer (Vaswani et al., 2017), BERT (Deepa, 2021) and TypeFormer (Chen et al., 2022) to compare the performance in Fig. 6 and Table 3.…”
Section: Experiments and Analysis (mentioning)
confidence: 99%
“…Then, we test the performance of multi-objective parameter selection based on a parallel genetic algorithm on the dataset. MPOS will still be compared with CNN (Kattenborn et al., 2021), LSTM (Yu et al., 2019), Transformer (Vaswani et al., 2017), BERT (Deepa, 2021), and TypeFormer (Chen et al., 2022). In this experiment, the model will be evaluated in terms of accuracy, number of parameters, and elapsed time.…”
Section: Experiments and Analysis (mentioning)
confidence: 99%
“…Transformers have been successfully applied to remote sensing image applications, providing a new way to address the insufficient robustness of subject-sensitive hashing. Chen et al. [33] employed a pure multiscale transformer for remote sensing image captioning, which can effectively generate specific types of captions. Zhang et al. [34] built a dual-stream network (DTHNet) based on the transformer for shadow extraction from remote sensing images.…”
Section: Transformers (mentioning)
confidence: 99%
“…Language integration in RS has showcased impressive capabilities across various tasks, including image captioning [2,17-28], VQA [3,29-32], and text-image retrieval [4]. A comprehensive review of NLP applications in RS can be found in [1].…”
Section: NLP in Remote Sensing (mentioning)
confidence: 99%
“…In [26], multi-scale visual features are extracted by a CNN and then decoded by a language transformer. Another approach incorporates the caption type into the caption features within a transformer-based encoder-decoder, enabling the generation of more controlled captions [27]. In [28], visual features extracted by a CNN are fed into a transformer encoder-decoder trained with a self-critical sequence strategy.…”
Section: NLP in Remote Sensing (mentioning)
confidence: 99%
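
The citation statements above repeatedly describe the same captioning pattern: a CNN extracts visual features, and a transformer decoder generates the caption from them. The snippet below is a minimal sketch of that generic pipeline, assuming a toy CNN, placeholder vocabulary and dimension sizes, and PyTorch; it is not the TypeFormer implementation (which additionally uses multiscale features and a type controller) nor any specific cited model.

```python
# Minimal sketch of a CNN-encoder / transformer-decoder captioning pipeline.
# All sizes (vocab_size, d_model, the toy CNN) are illustrative assumptions.
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=1000, d_model=256, nhead=8, num_layers=2):
        super().__init__()
        # Toy CNN encoder: turns an image into a grid of visual feature vectors.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.embed = nn.Embedding(vocab_size, d_model)   # caption token embeddings
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)        # next-token logits

    def forward(self, images, captions):
        # Flatten the CNN feature map into a sequence of "visual tokens".
        feats = self.cnn(images)                           # (B, d_model, H, W)
        memory = feats.flatten(2).transpose(1, 2)          # (B, H*W, d_model)
        tgt = self.embed(captions)                         # (B, T, d_model)
        # Causal mask so each position attends only to earlier caption tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(captions.size(1))
        out = self.decoder(tgt, memory, tgt_mask=mask)
        return self.head(out)                              # (B, T, vocab_size)

# Usage: a random image batch and caption prefix produce next-token logits.
model = CaptionModel()
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 1000])
```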