2023
DOI: 10.1002/ev.20556

Large language model applications for evaluation: Opportunities and ethical implications

Cari Beth Head,
Paul Jasper,
Matthew McConnachie
et al.

Abstract: Large language models (LLMs) are a type of generative artificial intelligence (AI) designed to produce text‐based content. LLMs use deep learning techniques and massive data sets to understand, summarize, generate, and predict new text. LLMs caught the public eye in early 2023 when ChatGPT (the first consumer‐facing LLM) was released. LLM technologies are driven by recent advances in deep‐learning AI techniques, where language models are trained on extremely large text data from the internet and then r…

Cited by 15 publications (8 citation statements) · References 17 publications
“…The capacity of genAI models to provide complex responses is not necessarily proportional to the size and diversity of the resources included in the training dataset [8]. The selection of the sources included in a reference dataset results in biased responses, irrespective of whether the reference dataset has been curated by people (as in closed genAI models) (e.g., political bias [9][10][11][12], ethnocentricity [13][14][15], and sexism [16][17][18]), or whether genAI models are also enabled to extract sources from the internet in general (as the latter overrepresents hegemonic viewpoints at the expense of those of marginalized communities [8,19]). This is not the place to discuss the numerous ethical issues that underlie the selection and use of original sources, such as alleged copyright violations [20,21] or whether primary or secondary sources underpin its 'knowledge' base [5,22].…”
Section: Background To the Generative Artificial Intelligence Languag… (mentioning)
confidence: 99%
“…Its updated version, ChatGPT (GPT-3.5), was made available to the public in November 2022 in a free research preview to stimulate experimentation [25]; it was able to communicate in 95 languages [26]. The design reputedly included a safe answer mode to ensure that ChatGPT responded within ethical human values and did not provide harmful content (e.g., planning attacks, hate speech, and advice on suicide), although this is not assured [19,27,28]. Moreover, such filters can be intentionally subverted [29].…”
Section: Background To the Generative Artificial Intelligence Languag… (mentioning)
confidence: 99%
“…However, those traditional metrics may not fully capture the cultural and societal dimensions of text quality, which are important for the Chinese news context [59]. Advanced evaluation approaches have been proposed to address these limitations, incorporating cultural and societal value assessments into the evaluation process [60], [61]. Hybrid metrics that combine linguistic accuracy with cultural relevance have been developed to provide a more comprehensive evaluation of summarization models [62], [63].…”
Section: Evaluation Metrics For Summarization Quality (mentioning)
confidence: 99%
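The excerpt above describes hybrid metrics that blend linguistic accuracy with a cultural-relevance judgment. Purely as an illustration (not the metrics proposed in [62], [63]), the sketch below assumes a ROUGE-L-style lexical-overlap score combined with a toy keyword-based relevance score via an assumed weight alpha; all function names and the weighting are hypothetical.

```python
# Illustrative hybrid summarization metric: ROUGE-L F1 (computed from scratch via
# longest common subsequence) blended with a placeholder cultural-relevance score.
# The keyword-based relevance scorer and the 0.7/0.3 weighting are assumptions.

def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, tok_a in enumerate(a, 1):
        for j, tok_b in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if tok_a == tok_b else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def cultural_relevance(candidate: str, salient_terms: set[str]) -> float:
    """Toy stand-in for a cultural/societal relevance judgment: the fraction of
    domain-salient terms that the summary preserves."""
    tokens = set(candidate.lower().split())
    return len(tokens & salient_terms) / len(salient_terms) if salient_terms else 0.0

def hybrid_score(reference: str, candidate: str, salient_terms: set[str],
                 alpha: float = 0.7) -> float:
    """Weighted blend of linguistic accuracy and cultural relevance (alpha is assumed)."""
    return alpha * rouge_l_f1(reference, candidate) + (1 - alpha) * cultural_relevance(candidate, salient_terms)

if __name__ == "__main__":
    ref = "the city launched a lunar new year festival to support local artisans"
    cand = "a lunar new year festival was launched to support local artisans"
    print(hybrid_score(ref, cand, {"lunar", "festival", "artisans"}))
```

In practice the relevance component would come from human or model-based judgments rather than keyword overlap; the point of the sketch is only the weighted combination of a lexical metric with a non-lexical one.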
“…The existing literature suggests that using large language models to write research papers raises ethical issues, including technological readiness, privacy, equality, the potential for discrimination and misinformation, intellectual property rights violations, and labor injustices, and that updated ethical frameworks are needed to address these concerns [6,27,28].…”
Section: Ethical Issues Regarding Using Large Language Models To Writ… (mentioning)
confidence: 99%
“…Another primary concern is the potential for LLMs to create misinformation and discrimination. LLMs trained on biased datasets have the potential to propagate biased language and harmful beliefs [27]. This raises doubts about the possible manipulation of LLMs to generate fraudulent research papers or to sway public opinion on scientific matters.…”
Section: Ethical Issues Regarding Using Large Language Models To Writ… (mentioning)
confidence: 99%