Natural language processing (NLP) is a research direction spanning linguistics, computer science, and related fields. Word vector representation maps words into a real-valued vector space and is a core technology underlying many current NLP tasks. This paper surveys several typical word vector representation methods and studies their linguistic and mathematical principles. We first describe the process of mapping words to vectors, that is, encoding natural language information into word vectors according to semantics. Second, we analyze the information-carrying capacity of several typical methods, including the co-occurrence matrix, Word2Vec, GloVe, and ELMo. Third, building on an analysis of the principles of these methods, we reproduce the specific process of generating word vectors using SVD decomposition, neural networks, and other techniques. Finally, using word similarity calculation and text sentiment classification tasks, we compare the performance of word vectors trained by the various methods. Experiments verify that different word vector generation methods emphasize different aspects of linguistic information and therefore perform differently across tasks.
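The count-based pipeline mentioned above (co-occurrence matrix, SVD decomposition, word similarity) can be sketched minimally as follows. This is an illustrative reconstruction, not the paper's actual experimental setup: the toy corpus, the window size of 1, and the dimension k = 2 are all assumptions chosen only to keep the example small.

```python
import numpy as np

# Hypothetical toy corpus for illustration only.
corpus = [
    "i like deep learning",
    "i like nlp",
    "i enjoy flying",
]

# Build the vocabulary and a symmetric word-word co-occurrence matrix
# with a context window of 1 on each side (an assumed setting).
vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(vocab)))
for s in corpus:
    words = s.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 1), min(len(words), i + 2)):
            if j != i:
                X[idx[w], idx[words[j]]] += 1

# Truncated SVD: keep the top-k singular directions as word vectors.
U, S, Vt = np.linalg.svd(X)
k = 2
vectors = U[:, :k] * S[:k]

# Word similarity as cosine similarity between the reduced vectors.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim = cosine(vectors[idx["like"]], vectors[idx["enjoy"]])
```

With a symmetric window, X is symmetric, and the rank-k factors give each word a dense low-dimensional vector whose pairwise cosine similarities approximate distributional similarity on the toy data.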