Lirong Yin scite author profile

With the development of artificial intelligence, more and more people hope that computers can understand human language through natural language technology, learn to think like human beings, and finally replace human beings to complete the highly difficult tasks with cognitive ability. As the key technology of natural language understanding, sentence representation reasoning technology mainly focuses on the sentence representation method and the reasoning model. Although the performance has been improved, there are still some problems such as incomplete sentence semantic expression, lack of depth of reasoning model, and lack of interpretability of the reasoning process. In this paper, a multi-layer semantic representation network is designed for sentence representation. The multi-attention mechanism obtains the semantic information of different levels of a sentence. The word order information of the sentence is also integrated by adding the relative position mask between words to reduce the uncertainty caused by word order. Finally, the method is verified on the task of text implication recognition and emotion classification. The experimental results show that the multi-layer semantic representation network can promote sentence representation’s accuracy and comprehensiveness.

show abstract

Improving Visual Reasoning Through Semantic Representation

Zheng

Liu

et al. 2021

IEEE Access

129

View full text Add to dashboard Cite

In visual reasoning, the achievement of deep learning significantly improved the accuracy of results. Image features are primarily used as input to get answers. However, the image features are too redundant to learn accurate characterizations within a limited complexity and time. While in the process of human reasoning, abstract description of an image is usually to avoid irrelevant details. Inspired by this, a higher-level representation named semantic representation is introduced. In this paper, a detailed visual reasoning model is proposed. This new model contains an image understanding model based on semantic representation, feature extraction and process model refined with watershed and u-distance method, a feature vector learning model using pyramidal pooling and residual network, and a question understanding model combining problem embedding coding method and machine translation decoding method. The feature vector could better represent the whole image instead of overly focused on specific characteristics. The model using semantic representation as input verifies that more accurate results can be obtained by introducing a highlevel semantic representation. The result also shows that it is feasible and effective to introduce high-level and abstract forms of knowledge representation into deep learning tasks. This study lays a theoretical and experimental foundation for introducing different levels of knowledge representation into deep learning in the future.

show abstract

Joint embedding VQA model based on dynamic word vector

Zheng

Chen

et al. 2021

127

View full text Add to dashboard Cite

The existing joint embedding Visual Question Answering models use different combinations of image characterization, text characterization and feature fusion method, but all the existing models use static word vectors for text characterization. However, in the real language environment, the same word may represent different meanings in different contexts, and may also be used as different grammatical components. These differences cannot be effectively expressed by static word vectors, so there may be semantic and grammatical deviations. In order to solve this problem, our article constructs a joint embedding model based on dynamic word vector—none KB-Specific network (N-KBSN) model which is different from commonly used Visual Question Answering models based on static word vectors. The N-KBSN model consists of three main parts: question text and image feature extraction module, self attention and guided attention module, feature fusion and classifier module. Among them, the key parts of N-KBSN model are: image characterization based on Faster R-CNN, text characterization based on ELMo and feature enhancement based on multi-head attention mechanism. The experimental results show that the N-KBSN constructed in our experiment is better than the other 2017—winner (glove) model and 2019—winner (glove) model. The introduction of dynamic word vector improves the accuracy of the overall results.

show abstract

Research on image classification method based on improved multi-scale relational network

2021

View full text Add to dashboard Cite

Small sample learning aims to learn information about object categories from a single or a few training samples. This learning style is crucial for deep learning methods based on large amounts of data. The deep learning method can solve small sample learning through the idea of meta-learning “how to learn by using previous experience.” Therefore, this paper takes image classification as the research object to study how meta-learning quickly learns from a small number of sample images. The main contents are as follows: After considering the distribution difference of data sets on the generalization performance of measurement learning and the advantages of optimizing the initial characterization method, this paper adds the model-independent meta-learning algorithm and designs a multi-scale meta-relational network. First, the idea of META-SGD is adopted, and the inner learning rate is taken as the learning vector and model parameter to learn together. Secondly, in the meta-training process, the model-independent meta-learning algorithm is used to find the optimal parameters of the model. The inner gradient iteration is canceled in the process of meta-validation and meta-test. The experimental results show that the multi-scale meta-relational network makes the learned measurement have stronger generalization ability, which further improves the classification accuracy on the benchmark set and avoids the need for fine-tuning of the model-independent meta-learning algorithm.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Lirong Yin

Knowledge base graph embedding module design for Visual question answering model

Sentence Representation Method Based on Multi-Layer Semantic Network

Improving Visual Reasoning Through Semantic Representation

Joint embedding VQA model based on dynamic word vector

Research on image classification method based on improved multi-scale relational network

Contact Info

Product

Resources

About