2023
DOI: 10.1002/ev.20556

Large language model applications for evaluation: Opportunities and ethical implications

Cari Beth Head,
Paul Jasper,
Matthew McConnachie
et al.

Abstract: Large language models (LLMs) are a type of generative artificial intelligence (AI) designed to produce text‐based content. LLMs use deep learning techniques and massive data sets to understand, summarize, generate, and predict new text. LLMs caught the public eye in early 2023 when ChatGPT (the first consumer‐facing LLM) was released. LLM technologies are driven by recent advances in deep‐learning AI techniques, where language models are trained on extremely large text data from the internet and then r…

Cited by 15 publications (8 citation statements) · References 17 publications
“…The capacity of genAI models to provide complex responses is not necessarily proportional to the size and diversity of the resources included in the training dataset [8]. The selection of the sources included in a reference dataset results in biased responses, irrespective of whether the reference dataset has been curated by people (as in closed genAI models) (e.g., political bias [9][10][11][12], ethnocentricity [13][14][15], and sexism [16][17][18]), or whether genAI models are also enabled to extract sources from the internet in general (as the latter overrepresents hegemonic viewpoints at the expense of those of marginalized communities [8,19]). This is not the place to discuss the numerous ethical issues that underlie the selection and use of original sources, such as alleged copyright violations [20,21] or whether primary or secondary sources underpin its 'knowledge' base [5,22].…”
Section: Background To the Generative Artificial Intelligence Languag… (mentioning)
confidence: 99%
“…Its updated version, ChatGPT (GPT-3.5), was made available to the public in November 2022 in a free research preview to stimulate experimentation [25]; it was able to communicate in 95 languages [26]. The design reputedly included a safe answer mode to ensure that ChatGPT responded within ethical human values and did not provide harmful content (e.g., planning attacks, hate speech, and advice on suicide), although this is not assured [19,27,28]. Moreover, such filters can be intentionally subverted [29].…”
Section: Background To the Generative Artificial Intelligence Languag… (mentioning)
confidence: 99%
“…However, those traditional metrics may not fully capture the cultural and societal dimensions of text quality, which are important for the Chinese news context [59]. Advanced evaluation approaches have been proposed to address these limitations, incorporating cultural and societal value assessments into the evaluation process [60], [61]. Hybrid metrics that combine linguistic accuracy with cultural relevance have been developed to provide a more comprehensive evaluation of summarization models [62], [63].…”
Section: Evaluation Metrics For Summarization Quality (mentioning)
confidence: 99%
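The excerpt above describes hybrid metrics that blend linguistic accuracy with a cultural-relevance judgment. Purely as an illustration (not the metrics proposed in [62], [63]), the sketch below assumes a ROUGE-L-style lexical-overlap score combined with a toy keyword-based relevance score via an assumed weight alpha; all function names and the weighting are hypothetical.

```python
# Illustrative hybrid summarization metric: ROUGE-L F1 (computed from scratch via
# longest common subsequence) blended with a placeholder cultural-relevance score.
# The keyword-based relevance scorer and the 0.7/0.3 weighting are assumptions.

def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, tok_a in enumerate(a, 1):
        for j, tok_b in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if tok_a == tok_b else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def cultural_relevance(candidate: str, salient_terms: set[str]) -> float:
    """Toy stand-in for a cultural/societal relevance judgment: the fraction of
    domain-salient terms that the summary preserves."""
    tokens = set(candidate.lower().split())
    return len(tokens & salient_terms) / len(salient_terms) if salient_terms else 0.0

def hybrid_score(reference: str, candidate: str, salient_terms: set[str],
                 alpha: float = 0.7) -> float:
    """Weighted blend of linguistic accuracy and cultural relevance (alpha is assumed)."""
    return alpha * rouge_l_f1(reference, candidate) + (1 - alpha) * cultural_relevance(candidate, salient_terms)

if __name__ == "__main__":
    ref = "the city launched a lunar new year festival to support local artisans"
    cand = "a lunar new year festival was launched to support local artisans"
    print(hybrid_score(ref, cand, {"lunar", "festival", "artisans"}))
```

In practice the relevance component would come from human or model-based judgments rather than keyword overlap; the point of the sketch is only the weighted combination of a lexical metric with a non-lexical one.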
“…The existing literature suggests that using large language models to write research papers raises ethical issues, including technological readiness, privacy, equality, the potential for discrimination and misinformation, intellectual property rights violations, and labor injustices, and that updated ethical frameworks are needed to address these concerns [6,27,28].…”
Section: Ethical Issues Regarding Using Large Language Models To Writ… (mentioning)
confidence: 99%
“…Another primary concern is the potential for LLMs to create misinformation and discrimination. LLMs trained on biased datasets have the potential to propagate biased language and harmful beliefs [27]. This raises doubts about the possible manipulation of LLMs to generate fraudulent research papers or to sway public opinion on scientific matters.…”
Section: Ethical Issues Regarding Using Large Language Models To Writ… (mentioning)
confidence: 99%