Background: Natural language processing models such as ChatGPT can generate text-based content and are poised to become a major information source in medicine and beyond. The accuracy and completeness of ChatGPT's responses to medical queries are not known.
Methods: Thirty-three physicians across 17 specialties generated 284 medical questions, each with either a binary (yes/no) or a descriptive answer, and subjectively classified them as easy, medium, or hard. The physicians then graded ChatGPT-generated answers to these questions for accuracy (6-point Likert scale; 1 = completely incorrect to 6 = completely correct) and completeness (3-point Likert scale; 1 = incomplete to 3 = complete plus additional context). Scores were summarized with descriptive statistics and compared using Mann-Whitney U or Kruskal-Wallis testing.
Results: Across all questions (n=284), median accuracy score was 5.5 (between almost completely and completely correct) with mean score of 4.8 (between mostly and almost completely correct). Median completeness score was 3 (complete and comprehensive) with mean score of 2.5. For questions rated easy, medium, and hard, median accuracy scores were 6, 5.5, and 5 (mean 5.0, 4.7, and 4.6; p=0.05). Accuracy scores for binary and descriptive questions were similar (median 6 vs. 5; mean 4.9 vs. 4.7; p=0.07). Of 36 questions with scores of 1-2, 34 were re-queried/re-graded 8-17 days later with substantial improvement (median 2 vs. 4; p<0.01).
Conclusions: ChatGPT generated largely accurate information in response to diverse medical queries, as judged by academic physician specialists, although with important limitations. Further research and model development are needed to correct inaccuracies and for validation.
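The score summary described in the Methods (median, mean, and a Mann-Whitney U comparison between two question types) can be illustrated with a minimal sketch. The scores below are hypothetical and are not data from the study; the helper `mann_whitney_u` is an illustrative name, and it computes only the U statistic, not a p-value. In practice a statistics library (for example `scipy.stats.mannwhitneyu`) would likely be used for the full test.

```python
from statistics import mean, median


def mann_whitney_u(a, b):
    """Return the U statistic for sample a versus sample b.

    Each (x, y) pair contributes 1 if x > y and 0.5 if x == y,
    which is the usual convention for tied ordinal scores.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical accuracy scores on the 6-point scale (NOT study data).
binary_scores = [6, 6, 5, 6, 4, 6, 5, 3]
descriptive_scores = [5, 6, 4, 5, 2, 5, 6, 4]

# Descriptive statistics per group: (median, mean).
summary = {
    "binary": (median(binary_scores), mean(binary_scores)),
    "descriptive": (median(descriptive_scores), mean(descriptive_scores)),
}

# The two U statistics always sum to len(a) * len(b).
u_binary = mann_whitney_u(binary_scores, descriptive_scores)
u_descriptive = mann_whitney_u(descriptive_scores, binary_scores)

for group, (med, avg) in summary.items():
    print(f"{group}: median={med}, mean={avg:.2f}")
print(f"U (binary vs. descriptive) = {u_binary}")
```

Because Likert scores are ordinal and often skewed toward the top of the scale, the median and a rank-based test are a natural fit; the mean is reported alongside for context, as in the Results.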
Using a novel subcutaneous dosing schedule, ustekinumab was successful in improving clinical, laboratory, and endoscopic markers of disease activity in patients with severe, refractory CD.
C(H)2 region deletion of HuCC49DeltaC(H)2 MAb did not alter the pharmacokinetics compared to murine CC49. The favorable partition coefficient K of HuCC49DeltaC(H)2 MAb into tumors supports its use in RIGS.
Immune checkpoint inhibitors (ICIs) predispose patients to immune-related adverse events (irAEs). Although hepatitis is a potentially lethal toxicity, its timing and outcomes have not been well described. In this retrospective study, patients from six international institutions were included if they were treated with ICIs and developed immune-related hepatitis. Patient and tumor characteristics, hepatitis management, and outcomes were evaluated. Of the 164 patients included, most were male (53.7%), and the median age was 63.0 years. Most patients had melanoma (83.5%) and stage IV disease (86.0%). Median follow-up was 585 days; median overall survival (OS) and progression-free survival (PFS) were not reached. The initial grade of hepatitis was most often grade 2 (30.5%) or 3 (45.7%), with a median time to onset of 61 days. Patients were most commonly asymptomatic (46.2%), but flu-like symptoms, including fatigue/anorexia (17.1%), nausea/emesis (14.0%), abdominal/back pain (11.6%), and arthralgias/myalgias (8.5%), occurred. Most patients received glucocorticoids (92.1%); the median time to improvement by one grade was 13.0 days, and the median time to complete resolution was 52.0 days. Second-line immunosuppression was required in 37 patients (22.6%), and steroid-dose re-escalation in 45 patients (27.4%). Five patients (3%) died of ICI-hepatitis or complications of hepatitis treatment. Ninety-one patients (58.6%) did not resume ICI; of 66 patients (40 grade 1/2, 26 grade 3/4) who were rechallenged, only 25.8% (n = 17) had recurrence. In this multi-institutional cohort, immune-related hepatitis was associated with excellent outcomes but frequently required therapy discontinuation, high-dose steroids, and second-line immunosuppression. Rechallenge was associated with a modest rate of hepatitis recurrence.