Augmenting word2vec with latent Dirichlet allocation within a clinical application

Budhkar, Akshay; Rudzicz, Frank

doi:10.48550/arxiv.1808.03967

Cited by 1 publication

(1 citation statement)

References 3 publications

“…In addition, Back translations are also a common data augmentation technique and can generate diverse paraphrases while preserving the semantics of the original sentences. And Word2vec is another robust augmentation method that uses a word embedding model trained on the public data set to find the most similar words for a given input word, which is called Word2vec-based (learned semantic similarity) augmentation (Budhkar and Rudzicz, 2018). Table 1 shows some examples of text augmentation.…”

Section: Data Augmentation (mentioning)

confidence: 99%
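
The citation statement above describes Word2vec-based augmentation as replacing an input word with its most similar word under a trained embedding model. A minimal sketch of that idea follows. It is an illustration of the quoted description only, not the method of Budhkar and Rudzicz or of the citing paper. The embedding table, its values, and the names `EMBEDDINGS`, `most_similar`, `augment`, and `replace_prob` are all hypothetical; a real setup would load pretrained vectors instead of a hand-written dictionary.

```python
import math
import random

# Toy embedding table. The vectors are made up for illustration; in practice
# they would come from a word2vec model trained on a public corpus.
EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.85, 0.15, 0.05],
    "bad": [-0.8, 0.2, 0.1],
    "awful": [-0.85, 0.1, 0.15],
    "movie": [0.0, 0.9, 0.3],
    "film": [0.05, 0.88, 0.32],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, topn=1):
    """Return the topn vocabulary words closest to `word`, excluding itself."""
    if word not in EMBEDDINGS:
        return []
    target = EMBEDDINGS[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:topn]]


def augment(sentence, replace_prob=1.0, rng=None):
    """Replace each in-vocabulary token with its nearest neighbour.

    Each replacement happens with probability `replace_prob`; tokens that
    are not in the embedding table are kept unchanged.
    """
    rng = rng or random.Random(0)
    out = []
    for token in sentence.split():
        neighbours = most_similar(token)
        if neighbours and rng.random() < replace_prob:
            out.append(neighbours[0])
        else:
            out.append(token)
    return " ".join(out)


print(augment("a good movie"))
```

Lowering `replace_prob` keeps more of the original tokens, which is one simple way to trade diversity of the augmented text against the risk of drifting from the original meaning.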

Text Sentiment Analysis Based on Transformer and Augmentation

et al. 2022

With the development of Internet technology, social media platforms have become an indispensable part of people’s lives, and social media have been integrated into people’s life, study, and work. On various forums, such as Taobao and Weibo, a large number of people’s footprints are left all the time. It is these chats, comments, and other remarks with people’s emotional evaluations that make up part of public opinion. Analysis of this network public opinion is conducive to maintaining the peaceful development of society. Therefore, sentiment analysis has become a hot research field and has made great strides as one of the hot topics in the field of natural language processing. Currently, the BERT model and its variants have achieved excellent results in the field of NLP. However, these models cannot be widely used due to huge demands on computing resources. Therefore, this paper proposes a model based on the transformer mechanism, which mainly includes two parts: knowledge distillation and text augmentation. The former is mainly used to reduce the number of parameters of the model, reducing the computational cost and training time of the model, and the latter is mainly used to expand the task text so that the model can achieve excellent results in the few-sample sentiment analysis task. Experiments show that our model achieves competitive results.
