2023
DOI: 10.2139/ssrn.4593895

A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

Katikapalli Subramanyam Kalyan

Cited by 12 publications (6 citation statements)
References 223 publications

“…Additionally, the sheer number of parameters means that the model can potentially generate biased or inappropriate language, depending on the training data and prompts used. Moreover, GPT-3 is a proprietary model developed by OpenAI and is not currently available for download (Kalyan, 2023). Therefore, it should be used with caution and with careful prompt optimization to ensure accurate and unbiased results.…”
Section: Methods
confidence: 99%
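
The caution above about proprietary access and prompt optimization is concrete in practice: GPT-3-family weights cannot be downloaded, so every use goes through OpenAI's hosted API, and the prompt plus decoding settings are the main levers for controlling output. Below is a minimal sketch assuming the official openai Python client; the model name, system prompt, and temperature are illustrative choices, not a prescribed recipe.

```python
# Minimal sketch of hosted-API access to a GPT-3-class model; the weights
# are proprietary, so there is nothing to download. Model name, prompt
# wording, and temperature are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",   # illustrative choice of hosted chat model
    temperature=0,           # near-deterministic decoding reduces variance
    messages=[
        # Constrain the task in the prompt; bias still depends on training
        # data, so outputs should be audited rather than trusted blindly.
        {"role": "system",
         "content": "Answer neutrally and concisely. If a question invites "
                    "a biased generalization, say so instead of answering."},
        {"role": "user",
         "content": "Summarize the following review in two neutral "
                    "sentences:\n<review text>"},
    ],
)
print(response.choices[0].message.content)
```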
“…The GPT-3 architecture is based on the Transformer model, which was first introduced by Vaswani et al. (2017). The Transformer model uses a self-attention mechanism that allows it to process input sequences in parallel, rather than sequentially (Kalyan, 2023). This makes it well suited for processing long sequences of text, which is important for many NLP tasks, including language generation and sentiment analysis, as shown in Figure 4.…”
Section: C) GPT-3
confidence: 99%
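
The parallelism this statement refers to is easy to see in code: scaled dot-product self-attention contextualizes every position with a few batched matrix products and a softmax, with no token-by-token recurrence. The following NumPy sketch of a single attention head is a toy restatement of Vaswani et al. (2017), with illustrative dimensions and random weights, not code from the survey.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention.
    X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project all positions at once
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # (seq_len, seq_len) similarities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)         # row-wise softmax
    return w @ V                               # weighted mix of value vectors

rng = np.random.default_rng(0)
d_model, d_k, seq_len = 16, 8, 5               # illustrative sizes
X = rng.normal(size=(seq_len, d_model))
out = self_attention(X,
                     rng.normal(size=(d_model, d_k)),
                     rng.normal(size=(d_model, d_k)),
                     rng.normal(size=(d_model, d_k)))
print(out.shape)  # (5, 8): one contextualized vector per input position
```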
“…In the past, traditional RNN-based Seq2Seq frameworks for generative models did not exhibit significant advantages in accuracy or efficiency over extractive models. It was not until the recent widespread adoption of generative pre-trained models such as UniLM [29], BART [30], T5 [31], and GPT [32] that effective generative information extraction gradually emerged as a forefront research direction. Extractive models are more susceptible to schema limitations, while generative models exhibit greater strength in transferability and scalability.…”
Section: Current Methods of Information Extraction
confidence: 99%
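
The generative formulation mentioned here casts extraction as text-to-text: the schema lives in the prompt and the output is free text, so adding a new relation type needs no new classification head. A minimal sketch with Hugging Face transformers follows; the t5-small checkpoint and the prompt format are illustrative assumptions, and an off-the-shelf checkpoint would need fine-tuning on extraction data before its outputs were useful. This demonstrates only the interface, not the cited papers' exact setups.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Cast information extraction as generation: ask the model to emit
# (entity, relation, entity) triples as plain text.
prompt = ("extract triples: Marie Curie won the Nobel Prize in Physics "
          "in 1903.")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```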
“…Fine-tuning GPT-3 can produce powerful LLMs, and fine-tuned GPT-3 models have achieved very good performance in NLP [15], [16]. However, compared with the BERT model, GPT-3 lacks bidirectional context modeling.…”
Section: Introduction
confidence: 99%
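
The contrast this statement draws comes down to the attention mask: BERT-style encoders let every token attend in both directions, while GPT-style decoders apply a causal mask so each token sees only earlier positions. A small NumPy sketch of the two masks (sizes illustrative):

```python
import numpy as np

seq_len = 5
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)    # BERT: full view
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))  # GPT: left-only

print(causal_mask)
# [[1 0 0 0 0]
#  [1 1 0 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]
```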