2024
DOI: 10.1186/s40537-023-00842-0
Survey of transformers and towards ensemble learning using transformers for natural language processing

Hongzhi Zhang,
M. Omair Shafiq

Abstract: The transformer model is a famous natural language processing model proposed by Google in 2017. Now, with the extensive development of deep learning, many natural language processing tasks can be solved by deep learning methods. After the BERT model was proposed, many pre-trained models such as the XLNet model, the RoBERTa model, and the ALBERT model were also proposed in the research community. These models perform very well in various natural language processing tasks. In this paper, we describe and compare …


Cited by 4 publications (2 citation statements)
References 29 publications
“…• Model architecture: The GPT models are based on a transformer architecture, which consists of multiple layers of self-attention mechanisms and feedforward neural networks. 79 The number of layers, hidden units, attention heads, and other architectural parameters can vary depending on the size and complexity of the model. • Pre-training data: Models pre-trained on large amounts of text data to learn language representations can be useful.…”
Section: Parameters Used in the Development of GPT for Medicine (mentioning)
confidence: 99%
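The quoted passage describes the transformer building blocks (stacked self-attention plus feed-forward layers) and the hyperparameters that vary with model size. Below is a minimal sketch of how those named parameters map onto code, assuming PyTorch and illustrative placeholder values (d_model, num_heads, num_layers are not the published GPT settings). Note that GPT itself is a decoder-only model with causal masking; PyTorch's stock encoder blocks are used here purely to show the layer/head/width parameters in context.

import torch
import torch.nn as nn

# Illustrative, assumed values -- not the actual GPT hyperparameters.
d_model = 768       # hidden size per token
num_heads = 12      # attention heads per layer
num_layers = 12     # number of stacked transformer blocks

# One block = multi-head self-attention followed by a feed-forward network.
block = nn.TransformerEncoderLayer(
    d_model=d_model,
    nhead=num_heads,
    dim_feedforward=4 * d_model,  # feed-forward width, conventionally 4x d_model
    batch_first=True,
)
stack = nn.TransformerEncoder(block, num_layers=num_layers)

# Dummy batch: 2 sequences of 16 token embeddings each.
x = torch.randn(2, 16, d_model)
print(stack(x).shape)  # torch.Size([2, 16, 768])

Scaling any of the three assumed parameters (depth, width, or heads) is what distinguishes small and large variants of such models, which is the point the cited statement makes.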
“…There are several alternatives to GPT for NLP tasks, each with its own strengths and weaknesses. 79 Here are some notable alternatives:…”
Section: Alternatives to GPT (mentioning)
confidence: 99%