On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective

Wang, Jindong; Hu, Xixu; Hou, Wenhua; Chen, Hao; Zheng, Runkai; Wang, Yidong; Yang, Linyi; Huang, Haojun; Ye, Wei; Geng, Xiubo; Jiao, Binxin; Zhang, Yue; Xie, Xing

doi:10.48550/arxiv.2302.12095

Cited by 25 publications

(31 citation statements)

References 34 publications


“…These instances fall outside the range of the model's training data and present challenges to the model's performance and generalization ability. Some of the recent research works focused on evaluating the robustness of GLLMs to out-of-distribution instances [456], [461], adversarial prompts [458]-[460] and adversarial inputs [425], [455], [457], [462] in one or more natural language processing tasks. Table 21 presents a summary of research works assessing GLLMs robustness to out-of-distribution instances, adversarial prompts and adversarial inputs.…”

Section: Robustness of GLLMs (mentioning)

confidence: 99%

“…Moreover, ChatGPT demonstrates better robustness to adversarial inputs than SOTA models in text-to-SQL generation. Some of the research works evaluated the GLLM robustness in multiple natural language understanding and generation tasks [455], [456], [458], [460]. Chen et al [455] assessed the robustness of GPT-3 and GPT-3.5 models on 21 datasets covering nine natural language understanding tasks.…”

Section: Robustness of GLLMs (mentioning)

confidence: 99%

“…The authors observed that the models are robust in tasks like machine reading comprehension and exhibit performance degradation of more than 35% in tasks like sentiment analysis and natural language inference. Wang et al [456] evaluated the robustness of GPT-3.5 and ChatGPT models on adversarial and out-of-distribution (OOD) samples on nine datasets covering four NLU tasks and machine translation. The authors observed that ChatGPT exhibits good performances on adversarial and OOD samples, but still, there is much room for improvement.…”

Section: Robustness of GLLMs (mentioning)

confidence: 99%

“…In some of the tasks like data labelling [373]-[375], [383], text classification [144], relation extraction [156], question answering [132], [179], keyphrase generation [217], etc., these models achieved even SOTA results. However, some of the recent research works exposed the brittleness of these models towards out-of-distribution inputs [456], [461], adversarial prompts [458]-[460] and inputs [425], [455], [457], [462]. For example, Liu et al [461] reported that ChatGPT and GPT-4 perform well in multiple choice question answering but struggle to answer out-of-distribution questions.…”

Section: Future Research Directions 111 Enhance Robustness of GLLMs (mentioning)

confidence: 99%


A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

Kalyan

2023

SSRN Journal

“…It has developed the ability to detect typical grammatical constructions and idioms after being trained on a vast corpus of text data, which includes books, papers, and websites [33]. This implies that even when the data it gets is not properly constructed or includes faults, it may nevertheless provide replies that are grammatically accurate and semantically relevant [34].…”

Section: E Natural Language Understanding (mentioning)

confidence: 99%

ChatGPT: A Brief Narrative Review

Gupta, Mufti, Sohail et al.

2023

Preprint


Modern language models are created to produce writing that can be mistaken for sentences authored by humans. Moreover, these models can converse with humans in a way that seems fair and logical. The most technologically advanced chatbot to date is ChatGPT, a version of OpenAI's Generative Pretrained Transformer (GPT) language standard. It can generate high-quality content in mere seconds, surpassing the capabilities of other chatbots. As a result, it has generated a lot of attention, enthusiasm and interest in various sectors and topics. This study provides an overview of the current research on ChatGPT, including its technological framework, support mechanisms, and implementation studies. Through this review, we explore the advantages and limitations of ChatGPT and propose future research directions.


T109 Evaluation of the GEM® Premier™ 5000 with intelligent quality management 2 (IQM®2) at Beijing Tsinghua Changgung Hospital, Tsinghua University (Beijing, China)

Tang

2022

Clinica Chimica Acta
