2022
DOI: 10.1155/2022/7314599

A Cross-Modal Image and Text Retrieval Method Based on Efficient Feature Extraction and Interactive Learning CAE

Abstract: In view of the complexity of the multimodal environment and the existing shallow network structure that cannot achieve high-precision image and text retrieval, a cross-modal image and text retrieval method combining efficient feature extraction and interactive learning convolutional autoencoder (CAE) is proposed. First, the residual network convolution kernel is improved by incorporating two-dimensional principal component analysis (2DPCA) to extract image features and extracting text features through long sho…

Citations: cited by 4 publications (4 citation statements)
References: 37 publications (57 reference statements)
“…erefore, teachers need to actively guide students through classroom instruction, use motivating and effective assessments (individual classroom extra credit, group assessment, and music grade percentages), and provide opportunities for students to demonstrate their musical skills in order to create a strong desire to learn creative writing. By constructing an "interactive" learning model, it is clear that students' thinking skills are enhanced and their individual musical perception, participation, confidence, and expression are strengthened [27][28][29][30][31][32][33][34][35]. In terms of knowledge, experience, personality, spirit, culture, etc., the vision is enhanced, life is empathized and experienced, life is enlightened, and spirituality is enriched, and the direction of development of teachers and students tends to the realm of truth and beauty.…”
Section: Encourage Students To Dare To Create Through Learning
Citation type: mentioning
confidence: 99%
“…Fang et al [26] introduced an innovative autoencoder network that explores semantic disparities between visual representations and text through reconstruction constraints across modalities. Additionally, Yin et al's [27] Convolutional Auto-Encoder (CAE) model establishes meaningful correlations among high-level semantic relationships to enhance accuracy in image-text retrieval within multimodal environments. Meanwhile, Shumpei Miyawaki et al's [28] dual-encoder model integrates image visual and text semantics into a shared semantic space for efficient offline inference.…”
Section: Auto-encoder
Citation type: mentioning
confidence: 99%
“…The development of cultural tourism in the city can be further promoted by it, and cross-cultural communication of city tourism can be innovated. 1) Tourism Text Feature Extraction TF-IDF, i.e., Word Frequency-Inverse Document Frequency, is a statistically based method for calculating word weights, a common method for feature vectorization, which is widely used in the fields of information retrieval, data mining, etc [31]. This method is used to assess the importance of a word in this document for distinguishing other documents in the corpus, i.e., if the more times the word appears in this document and the fewer times it appears in other documents, it means that the word has a stronger distinguishing ability for this document, and its weight value is larger.…”
Section: Extraction Of Multimodal Discourse Features
Citation type: mentioning
confidence: 99%
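
The TF-IDF weighting described in the last citation statement can be illustrated with a minimal sketch. It assumes the plain textbook form (term frequency normalized by document length, times the natural log of total documents over documents containing the term); the citing paper's exact variant and smoothing are not given here, and the `tf_idf` helper and the toy tourism documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(N / number of documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    "old town night market food".split(),
    "old town museum tour".split(),
    "river cruise night tour".split(),
]
weights = tf_idf(docs)
```

In this toy corpus, "market" occurs only in the first document while "old" occurs in two, so "market" receives the larger weight there, which matches the statement's point that a word frequent in one document and rare elsewhere distinguishes that document more strongly. A term that appears in every document gets weight zero under this unsmoothed form.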