2023
DOI: 10.1101/2023.02.13.23285745
Preprint

ChatGPT- versus human-generated answers to frequently asked questions about diabetes: a Turing test-inspired survey among employees of a Danish diabetes center

Abstract: Background: Large language models have received enormous attention recently, with some studies demonstrating their potential clinical value despite not being trained specifically for this domain. We aimed to investigate whether ChatGPT, a language model optimized for dialogue, can answer frequently asked questions about diabetes. Methods: We conducted a closed e-survey among employees of a large Danish diabetes center. The study design was inspired by the Turing test and non-inferiority trials. Our survey includ…


Citations: cited by 7 publications (6 citation statements)
References: 27 publications (36 reference statements)
“…The outcome of the game rests on two interrelated factors: first, the capacity of the machine to produce communicative artefacts that imitate the attributes of those produced by humans. As has been alluded to already, and will be reviewed in more detail below, there is emerging evidence that ChatGPT currently possesses ample capacity to interpret text prompts and produce sophisticated human-like texts (Gao et al., 2023; Hulman et al., 2023; Nov et al., 2023); second, and the focus of the current study, is the degree to which the human interrogator is sensitive to the attributes of communicative artefacts that indicate whether they are produced by a human or a machine.…”
Section: Introduction (mentioning)
Confidence: 93%
“…Another recent study inspired by Turing's Imitation Game was undertaken by Hulman et al (2023) among 183 employees of a large health provider service in Denmark. The objective of the study was to determine how adequately ChatGPT could answer 10 frequently asked questions that were of relevance to the healthcare service (i.e., questions about diabetes).…”
Section: The Imitation Game Paradigm to Investigate ChatGPT (mentioning)
Confidence: 99%
“…FActScore [29] evaluates text generated by LLMs via an evaluation method that breaks a generation into atomic facts, which are in turn evaluated by human evaluators. Majority voting for evaluating healthcare-related answers generated by LLMs has been employed for myopia care [21], maternity [21], diabetes [26], cancer [27], infant care [21], etc.…”
Section: B. Measuring Hallucinations (mentioning)
Confidence: 99%
“…A study conducted by Hulman et al. (2023) to evaluate the answers given by ChatGPT to diabetes-related questions against answers given by humans reports that ChatGPT's answers to two out of 10 diabetes-related questions contained misinformation. In light of this, the quality of the answers provided by the more traditional CHeQA approaches described in this survey, whose answers come from validated scientific content, may be higher than that of answers provided by systems based on generic LLMs or PLMs fine-tuned on smaller sets of domain-specific data.…”
Section: Future Directions (mentioning)
Confidence: 99%