Purpose
This study aimed to assess the performance of ChatGPT, specifically the GPT-3.5 and GPT-4 models, in understanding complex surgical clinical information and its potential implications for surgical education and training.
Methods
The dataset comprised 280 questions from the Korean general surgery board exams conducted between 2020 and 2022. Both the GPT-3.5 and GPT-4 models were evaluated, and their performance was compared using the McNemar test.
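As an illustration of the paired comparison described above, the following is a minimal sketch of an exact (binomial) McNemar test on per-question correctness. The function name `mcnemar_exact` and the toy answer lists are hypothetical and are not data from the study; the abstract does not state which variant of the test was applied, so the exact form is an assumption.

```python
from math import comb


def mcnemar_exact(first_correct, second_correct):
    """Exact two-sided McNemar test for two models scored on the same questions.

    Each argument is a sequence of booleans (True = answered correctly).
    Only discordant questions (one model right, the other wrong) carry
    information about a difference between the models.
    """
    if len(first_correct) != len(second_correct):
        raise ValueError("both models must be scored on the same questions")

    # b: first model right, second wrong; c: first wrong, second right.
    b = sum(1 for x, y in zip(first_correct, second_correct) if x and not y)
    c = sum(1 for x, y in zip(first_correct, second_correct) if not x and y)

    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference

    # Under the null hypothesis, b follows Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy data (12 questions), NOT taken from the study.
old_model = [True, True, True] + [False] * 9
new_model = [True, True, False] + [True] * 9
b, c, p = mcnemar_exact(old_model, new_model)
print(f"discordant pairs: b={b}, c={c}, p={p:.4f}")
```

The test is paired because both models answer the same questions; questions that both models get right or both get wrong do not affect the P value.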
Results
GPT-3.5 achieved an overall accuracy of 46.8%, whereas GPT-4 achieved 76.4%, a statistically significant improvement (P < 0.001). GPT-4 also performed consistently across all subspecialties, with accuracy rates ranging from 63.6% to 83.3%.
Conclusion
ChatGPT, particularly GPT-4, demonstrates a remarkable ability to understand complex surgical clinical information, achieving an accuracy rate of 76.4% on the Korean general surgery board exam. However, it is important to recognize the limitations of large language models and ensure that they are used in conjunction with human expertise and judgment.
Background. Microsatellite status is a prognostic biomarker in advanced gastric cancer. This retrospective study aimed to investigate the usefulness of microsatellite status in predicting prognosis and response to adjuvant treatment in pT1N1 gastric cancer. Patients and Methods. Among 875 patients who underwent radical gastrectomy for pT1N1 gastric cancer at two tertiary hospitals, 838 with available microsatellite instability (MSI) data were included and classified into two groups according to microsatellite status: microsatellite stable (MSS) and MSI-high (MSI-H). The recurrence-free survival rate and risk factors for tumor recurrence were analyzed. Results. Of 838 gastric cancer patients, 100 (11.9%) were MSI-H and 307 (36.6%) received adjuvant treatment. During a median follow-up of 70 months, 42 (5.0%) patients
Background: Pure laparoscopic donor hepatectomy (PLDH) has become a standard procurement practice for living donor liver transplantation in expert centers. During PLDH, a good anatomical approach to donor bile duct division is crucial to avoid multiple bile duct openings, which increase the risk of biliary complications for the recipient. This study was designed to develop a deep learning-based artificial intelligence model to identify biliary structures intraoperatively, helping to determine the optimal transection site. Methods: Semantic segmentation of the bile duct was performed using a convolutional neural network-based approach. DeepLabV3+ was used as the model, with ResNet as the backbone. Ground truth annotations were generated by a single surgeon with the help of images of the bile duct under infrared fluoroscopy with indocyanine green. The Dice coefficient was used as the evaluation metric for the proposed model. Results: Three hundred images of the biliary structure were extracted from 30 PLDH videos; 80% of the images were used as the training dataset and 20% as the validation dataset. The model predicted the area of the bile duct with a precision of 0.66. Conclusions: Intraoperative artificial intelligence-guided bile duct division can be used for PLDH. This technology may provide real-time guidance and improve surgical outcomes.
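For readers unfamiliar with the segmentation metric named above, the following is a minimal sketch of how a Dice coefficient can be computed from a predicted and a ground truth binary mask. Pixel-wise precision is shown alongside it because the two are distinct quantities. The function name `dice_and_precision` and the tiny masks are hypothetical; they are not the authors' code or data, and the conventions for empty masks are assumptions.

```python
def dice_and_precision(pred_mask, truth_mask):
    """Compute Dice coefficient and pixel-wise precision for binary masks.

    Masks are equally sized 2D lists of 0/1 values (1 = bile duct pixel).
    Dice = 2*TP / (2*TP + FP + FN); precision = TP / (TP + FP).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred_mask, truth_mask):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # predicted duct, truly duct
            elif p and not t:
                fp += 1  # predicted duct, actually background
            elif t and not p:
                fn += 1  # missed duct pixel

    dice_den = 2 * tp + fp + fn
    # Assumed conventions: two empty masks agree perfectly (Dice = 1.0);
    # precision is reported as 0.0 when nothing is predicted.
    dice = 2 * tp / dice_den if dice_den else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return dice, precision


# Hypothetical 2x3 masks for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [1, 1, 1]]
dice, precision = dice_and_precision(pred, truth)
print(f"Dice={dice:.3f}, precision={precision:.3f}")
```

Dice penalizes both false positives and missed pixels, whereas precision ignores missed pixels, so the two values generally differ on the same prediction.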