BACKGROUND
Disadvantaged cancer survivors and their caregivers (e.g., individuals with limited health literacy or racial and ethnic minorities facing language barriers) are at disproportionately high risk of symptom burden from cancer and its treatments. Large language models (LLMs) offer researchers an opportunity to develop educational materials tailored to these populations.
OBJECTIVE
This study aimed to (1) evaluate the overall performance of LLMs in generating tailored educational content for disadvantaged cancer survivors and their caregivers; (2) compare the performance of three Generative Pre-trained Transformer (GPT) models (i.e., GPT-3.5 Turbo, GPT-4, and GPT-4 Turbo); and (3) explore prompts that help LLMs generate better content.
METHODS
We selected 30 topics from national guidelines on cancer care and education. GPT-3.5 Turbo, GPT-4, and GPT-4 Turbo were used to generate tailored content of up to 250 words at a 6th-grade reading level for each topic, with translations into Spanish and Chinese. Nine oncology experts evaluated the content against predetermined criteria: word limit, reading level, and quality assessment (i.e., clarity, accuracy, relevance, completeness, and comprehensibility). ANOVA or chi-square tests were used to compare the GPT models and prompts.
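As an illustration of how the word-limit and reading-level criteria could be screened automatically, the following is a minimal sketch in Python. It assumes the Flesch-Kincaid grade level as the readability measure and uses a rough vowel-group syllable heuristic; the abstract does not state which readability formula or tool was applied, and the function names (`count_syllables`, `flesch_kincaid_grade`, `check_content`) are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, discount a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def check_content(text: str, max_words: int = 250, max_grade: float = 6.0) -> dict:
    """Screen one generated passage against the word limit and reading-level target."""
    n_words = len(text.split())
    grade = flesch_kincaid_grade(text)
    return {
        "word_count": n_words,
        "within_word_limit": n_words <= max_words,
        "grade_level": round(grade, 1),
        "meets_reading_level": grade <= max_grade,
    }


if __name__ == "__main__":
    sample = (
        "Drink water every day. Rest when you feel tired. "
        "Call your nurse if you feel sick."
    )
    print(check_content(sample))
```

A screen of this kind applies to English text only; the Spanish and Chinese translations would need language-appropriate measures, and the quality criteria would still require expert review.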
RESULTS
Overall, LLMs showed excellent performance in tailoring educational content, with 74.2% (n=360) adhering to the specified word limit and achieving an average quality assessment score of 8.933 out of 10. However, LLMs showed only moderate performance on reading level, with 41.1% of content failing to meet the 6th-grade reading level. LLMs demonstrated strong translation capabilities, achieving an accuracy of 88.9% for Spanish and 81.1% for Chinese translations. The more advanced GPT-4 family models performed better overall than GPT-3.5 Turbo. Prompting the GPT models to produce bulleted-format content tended to yield better educational materials than prompting for textual-format content.
CONCLUSIONS
This study highlights the application of LLMs in cancer care and education while acknowledging their potential limitations. The findings can inform the development and implementation of interventions in cancer symptom management and supportive care, thereby advancing health equity.
INTERNATIONAL REGISTERED REPORT
RR2-10.2196/48499