How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation

Hendy, Amr; Abdelrehim, Mohamed; Sharaf, Amr; Raunak, Vikas; Gabr, M. A.; Matsushita, Hitokazu; Kim, Young Jin; Afify, Mohamed; Awadalla, Hany Hassan

doi:10.48550/arxiv.2302.09210

Cited by 31 publications

(40 citation statements)

References 18 publications

“…• In this analysis, I took a high-level approach to examining the failures of ChatGPT. However, for future investigations, it may be useful to focus on more specific categories of problems, such as sentiment analysis, named entity recognition, translation [27,28], summarization [66], and language ambiguity [47], in order to gain a more detailed understanding of ChatGPT's shortcomings in these areas [49].…”

Section: Discussion (mentioning)

confidence: 99%

A Categorical Archive of ChatGPT Failures

Borji

2023

Preprint

134

Large language models have been demonstrated to be valuable in different fields. ChatGPT, developed by OpenAI, has been trained using massive amounts of data and simulates human conversation by comprehending context and generating appropriate responses. It has garnered significant attention due to its ability to effectively answer a broad range of human inquiries, with fluent and comprehensive answers surpassing prior public chatbots in both security and usefulness. However, a comprehensive analysis of ChatGPT’s failures is lacking, which is the focus of this study. Eleven categories of failures, including reasoning, factual errors, math, coding, and bias, are presented and discussed. The risks, limitations, and societal implications of ChatGPT are also highlighted. The goal of this study is to assist researchers and developers in enhancing future language models and chatbots.

“…Meta's No Language Left Behind translates 200 different languages with high-quality results (Meta, 2022), and Google Translate, as of 2022, supports 133 languages, including 24 low-resource languages (Bapna, 2022). OpenAI's GPT models also emerge as excellent translators by generating context-relevant translation (Hendy et al, 2023). The approach proposed by Jung et al (2023) currently utilizes Google Translate API as well as GPT 3.5, which was shown to be capable of translating student responses with […] These examples demonstrate the impacts of LLMs and generative AI on automated scoring.…”

Section: Automated Scoring (mentioning)

confidence: 99%

Transforming Assessment: The Impacts and Implications of Large Language Models and Generative AI

Hao, von Davier, Yaneva et al.

2024

Educational Measurement

The remarkable strides in artificial intelligence (AI), exemplified by ChatGPT, have unveiled a wealth of opportunities and challenges in assessment. Applying cutting‐edge large language models (LLMs) and generative AI to assessment holds great promise in boosting efficiency, mitigating bias, and facilitating customized evaluations. Conversely, these innovations raise significant concerns regarding validity, reliability, transparency, fairness, equity, and test security, necessitating careful thinking when applying them in assessments. In this article, we discuss the impacts and implications of LLMs and generative AI on critical dimensions of assessment with example use cases and call for a community effort to equip assessment professionals with the needed AI literacy to harness the potential effectively.

“…The study also highlighted that GPT-3 achieved a relatively high level of accuracy in translating specialized religious text, with scores comparable to human translations in certain instances. Furthermore, Hendy et al (2023) evaluated GPT in the context of machine translation, exploring various dimensions such as the quality of different GPT models compared to state-of-the-art research and commercial systems, the impact of prompting strategies, robustness in the face of domain shifts, and document-level translation. The results suggested that GPT models demonstrated competitive translation quality for languages with ample resources but had limited capabilities when dealing with languages with scarce resources.…”

Section: AI and Translation (mentioning)

confidence: 99%

Exploring the relationship between critical thinking, attitude, and anxiety in shaping the adoption of artificial intelligence in translation among Saudi translators

Mahdi, Sahari

2024

JPR

Critical thinking and anxiety influence the translation competence of translators. This study sought to examine how the interactions between critical thinking, attitude, and anxiety influence that competence. It adopted an empirical approach, collecting data from 145 student translators at several colleges in Saudi Arabia, with a questionnaire as the data collection tool. Data were analyzed using structural equation modelling to identify the relationships among the study factors. The results indicated a negative relationship between AI anxiety and both critical thinking and attitude. However, there was a strong positive relationship between attitude and both critical thinking and Machine Translation anxiety. There was also a positive relationship between Machine Translation anxiety and both AI anxiety and critical thinking.
