“…
• Model architecture: The GPT models are based on the transformer architecture, which consists of multiple stacked layers, each combining a self-attention mechanism with a feedforward neural network. The number of layers, hidden units, attention heads, and other architectural parameters vary with the size and complexity of the model.
• Pre-training data: The models are pre-trained on large amounts of text data, which lets them learn useful language representations.…”
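
To make the first bullet concrete, here is a minimal sketch of one decoder-style transformer block and how such blocks stack into a GPT-like model. This is an illustrative example, not the actual GPT implementation: the use of PyTorch, the class name `TransformerBlock`, and the dimensions (12 layers, 768 hidden units, 12 attention heads) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """One decoder-style block: masked self-attention + feedforward network.
    Illustrative sketch; real GPT implementations differ in detail."""
    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(          # position-wise feedforward sub-layer
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: each position may attend only to itself and earlier positions.
        seq_len = x.size(1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        attn_out, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.ln1(x + attn_out)     # residual connection + layer norm
        x = self.ln2(x + self.ff(x))   # feedforward sub-layer with residual
        return x

# Depth, width, and head count are the architectural parameters the text
# says vary with model size; these particular values are just an example.
model = nn.Sequential(*[
    TransformerBlock(d_model=768, n_heads=12, d_ff=3072) for _ in range(12)
])
tokens = torch.randn(1, 16, 768)   # (batch, sequence length, embedding dim)
out = model(tokens)                # shape: (1, 16, 768)
```

Scaling a model of this shape mostly means turning those three knobs, more blocks in the stack, a wider `d_model`, and more attention heads, which is exactly the variation the bullet above describes.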