2023
DOI: 10.7759/cureus.49373

Pilot Testing of a Tool to Standardize the Assessment of the Quality of Health Information Generated by Artificial Intelligence-Based Models

Malik Sallam,
Muna Barakat,
Mohammed Sallam

Abstract: Background: Artificial intelligence (AI)-based conversational models, such as Chat Generative Pre-trained Transformer (ChatGPT), Microsoft Bing, and Google Bard, have emerged as valuable sources of health information for lay individuals. However, the accuracy of the information provided by these AI models remains a significant concern. This pilot study aimed to test a new tool with key themes for inclusion as follows: Completeness of content, Lack of false information in the content, Evidence supporting the cont…

Cited by 19 publications (22 citation statements)
References 38 publications

“…The subjective nature of evaluating clarity and overall correctness introduces an element of bias, warranting caution in the interpretation of results. Employing standardized tools for evaluating AI-generated output presents a superior alternative strategy (Sallam et al, 2023a). Furthermore, the study exclusively focused on medical microbiology, particularly medical virology, which warrants consideration, as generalizability to other academic disciplines may be restricted.…”
Section: Discussion (mentioning)
confidence: 99%
“…In this study, the use of the validated CLEAR tool for assessment of the quality of AI generated content presented a robust approach [38]. The rating of ChatGPT-4 as "Excellent" across all categories of completeness, accuracy/evidence, and appropriateness/relevance serves as a clear demonstration of its superiority.…”
Section: Discussion (mentioning)
confidence: 99%
“…To minimize subjectivity in the evaluation process, a consensus key response was formulated prior to assessment based on the query sources. The evaluation was based on the CLEAR tool across 5 components as follows: Completeness, Lack of false information (accuracy), Evidence-based content, Appropriateness, and Relevance [40]. Each component was assessed using a 5-point Likert scale ranging from 5 (excellent) to 1 (poor).…”
Section: Evaluation of the AI Generated Content (mentioning)
confidence: 99%
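
The last statement above describes rating each of the five CLEAR components on a 5-point Likert scale from 1 (poor) to 5 (excellent). A minimal sketch of how one such rating could be recorded and range-checked follows. The class name `ClearRating`, its field names, and the `total` and `mean` helpers are hypothetical and chosen only for illustration; in particular, summing or averaging the five items is an assumption, since the quoted statements do not say how the tool aggregates scores.

```python
from dataclasses import dataclass, fields

# Scale bounds as described in the quoted statement: 1 = poor, 5 = excellent.
LIKERT_MIN, LIKERT_MAX = 1, 5


@dataclass(frozen=True)
class ClearRating:
    """One rater's scores for a single AI-generated response (hypothetical layout)."""

    completeness: int
    lack_of_false_information: int
    evidence: int
    appropriateness: int
    relevance: int

    def __post_init__(self) -> None:
        # Reject anything outside the 5-point scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or not LIKERT_MIN <= value <= LIKERT_MAX:
                raise ValueError(
                    f"{f.name} must be an integer from {LIKERT_MIN} to {LIKERT_MAX}, got {value!r}"
                )

    def total(self) -> int:
        # Assumption: a plain sum of the five items (range 5 to 25).
        return sum(getattr(self, f.name) for f in fields(self))

    def mean(self) -> float:
        # Assumption: an unweighted average of the five items.
        return self.total() / len(fields(self))


if __name__ == "__main__":
    rating = ClearRating(
        completeness=5,
        lack_of_false_information=4,
        evidence=3,
        appropriateness=4,
        relevance=5,
    )
    print(rating.total(), rating.mean())
```

Keeping the five components as separate fields, rather than storing only an aggregate, preserves the per-component scores that the citing studies report.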