2021
DOI: 10.3390/e23111422

Language Representation Models: An Overview

Abstract: In the last few decades, text mining has been used to extract knowledge from free texts. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments for real-world language problems over the years. The developments of the last five years have resulted in techniques that have allowed for the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming human baseline performance has…

Cited by 14 publications (8 citation statements) | References 31 publications

Citation statements (ordered by relevance):
“…These acquired representations can subsequently be adapted for specific downstream tasks, including sentiment analysis, named entity recognition, and machine translation. [46,47,48,49] The success of generative pretraining in natural language processing can be attributed to its ability to learn rich and meaningful representations of language that can be used for a variety of downstream tasks.…”
Section: Generative Pretraining for Embedding Generation
Citation type: mentioning (confidence: 99%)
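The adaptation the citing authors describe is, in practice, a fine-tuning step on a small labelled set. Below is a minimal sketch, assuming the Hugging Face transformers API; the checkpoint name, example texts, and hyperparameters are illustrative assumptions, not settings from the cited works.

```python
# Hedged sketch: adapting a pretrained model's representations to a downstream
# sentiment-analysis task (checkpoint, texts, and hyperparameters are illustrative).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased"           # any pretrained encoder could stand in here
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

texts = ["The movie was wonderful.", "A dull and lifeless plot."]
labels = torch.tensor([1, 0])                    # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
loss = model(**batch, labels=labels).loss        # pretrained encoder + fresh classification head
loss.backward()                                  # the small labelled set drives the adaptation
optimizer.step()
```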
“…Pre-trained language models, such as BERT [23], BERT-WWM [24], RoBERTa [25], and NEZHA [26], have gradually become a fundamental technique for NLP, with great success on both English and Chinese tasks [27]. In our approach, we use the BERT and NEZHA feature extraction layers.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
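A minimal sketch of using a pretrained BERT encoder as a feature-extraction layer, assuming the Hugging Face transformers API; the bert-base-chinese checkpoint is an illustrative assumption, and a NEZHA checkpoint would be loaded analogously. This is not the cited paper's exact code.

```python
# Hedged sketch: a pretrained BERT encoder used as a frozen feature-extraction layer
# (the checkpoint is illustrative; a NEZHA checkpoint would be loaded analogously).
import torch
from transformers import AutoTokenizer, AutoModel

checkpoint = "bert-base-chinese"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
encoder = AutoModel.from_pretrained(checkpoint)
encoder.eval()                                    # features only; no fine-tuning in this sketch

inputs = tokenizer("Pre-trained language models are a fundamental NLP technique.",
                   return_tensors="pt")
with torch.no_grad():
    token_features = encoder(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
sentence_feature = token_features[:, 0]           # [CLS] vector as a sentence-level feature
print(sentence_feature.shape)                     # torch.Size([1, 768]) for a base-size model
```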
“…We adopt a mini-batch mechanism to train our model. As seen in the two plots above (Figure 4), we conducted experiments with batch sizes of [4, 8, 16, 32, 64] and finally chose 8 as the better batch size. As for the learning rate, our experiments showed that choosing different learning rates for different parameters works better.…”
Section: Implementation Details
Citation type: mentioning (confidence: 99%)
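The two tuning choices mentioned in the quote, a batch-size sweep over [4, 8, 16, 32, 64] and different learning rates for different parameters, can be expressed with PyTorch optimizer parameter groups. The sketch below is illustrative only; the model, data, and learning rates are assumptions, not the cited paper's settings.

```python
# Hedged sketch: per-parameter-group learning rates plus a batch-size sweep,
# mirroring the tuning procedure described above (model, data, and rates are illustrative).
import torch
from torch.utils.data import DataLoader, TensorDataset

encoder = torch.nn.Linear(768, 768)              # stand-in for pretrained feature-extraction layers
head = torch.nn.Linear(768, 2)                   # stand-in for the task-specific classifier
model = torch.nn.Sequential(encoder, head)

optimizer = torch.optim.AdamW([
    {"params": encoder.parameters(), "lr": 2e-5},   # small rate for pretrained parameters
    {"params": head.parameters(),    "lr": 1e-3},   # larger rate for the randomly initialized head
])

dataset = TensorDataset(torch.randn(256, 768), torch.randint(0, 2, (256,)))
for batch_size in [4, 8, 16, 32, 64]:            # candidate sizes from the quoted experiments
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    x, y = next(iter(loader))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    print(f"batch_size={batch_size:>2}  initial loss={loss.item():.3f}")
```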
“…Using a representation language model pre-trained on large-scale unlabeled text is a universal and effective method for most natural language understanding tasks [16]. Most of these language models use self-supervised training methods.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
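Much of the self-supervised training the quote refers to is masked language modelling, where the model reconstructs tokens hidden from its input. Below is a minimal sketch, assuming the Hugging Face transformers API; the checkpoint and the masked sentence are illustrative assumptions.

```python
# Hedged sketch: the self-supervised masked-language-modelling objective used to
# pre-train many representation models (checkpoint and sentence are illustrative).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

text = f"Language models learn rich {tokenizer.mask_token} of text."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))            # the model's guess for the masked token
```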