Abstract: Objectives: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. Design: Vignettes study. Setting: 200 primary care vignettes. Intervention/comparator: For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes' gold standard. Primary outcome measures: (1) Proportion of conditions 'covered' by an app, that is, not excluded bec…
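Vignette audits of this kind typically score an app by whether the gold-standard condition appears among its first k suggestions; the "top 3" figures quoted below are one instance of that metric. A minimal Python sketch, using hypothetical vignette data rather than anything from the study:

```python
# Minimal sketch of "top-k suggestion accuracy": a vignette counts as a hit
# if its gold-standard condition appears among the app's first k suggestions.
# All vignette data below are hypothetical, invented purely for illustration.

def top_k_accuracy(suggestions: list[list[str]], gold: list[str], k: int) -> float:
    """Fraction of vignettes whose gold-standard condition is in the top k."""
    hits = sum(g in s[:k] for s, g in zip(suggestions, gold))
    return hits / len(gold)

# One ranked suggestion list per vignette (hypothetical app output).
app_suggestions = [
    ["migraine", "tension headache", "sinusitis"],
    ["angina", "GERD", "panic attack"],
]
gold_standard = ["tension headache", "GERD"]

print(top_k_accuracy(app_suggestions, gold_standard, k=1))  # 0.0
print(top_k_accuracy(app_suggestions, gold_standard, k=3))  # 1.0
```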
“…The results of this study are in line with previous SC analyses [12, 17, 18]. Research supported by Ada Health GmbH shows that Ada had the highest top 3 suggestion diagnostic accuracy (70.5%) compared with other SCs [19], and the correct condition was among the first three results in 83% of cases in an Australian assessment study [20]. Consistent with our results, the majority of patients (85.3%) would recommend Ada to friends or relatives [21].…”
Section: Discussion
Classification: mentioning (confidence: 99%)
“…In contrast to Rheport, Ada is supported by artificial intelligence and does not use a fixed questionnaire. Ada covers a wide variety of conditions [19] and is not limited to IRDs, whereas Rheport is meant exclusively for the triage of patients with newly suspected IRDs. The study setting was deliberately chosen to be risk-averse, so the use of the SCs did not have any clinical implications.…”
Background
Timely diagnosis and treatment are essential for the effective management of inflammatory rheumatic diseases (IRDs). Symptom checkers (SCs) promise to accelerate diagnosis, reduce misdiagnoses, and guide patients more effectively through the health care system. Although SCs are increasingly used, little supporting evidence exists.
Objective
To assess the diagnostic accuracy, patient-perceived usability, and acceptance of two SCs: (1) Ada and (2) Rheport.
Methods
Patients newly presenting to a German secondary rheumatology outpatient clinic were randomly assigned in a 1:1 ratio to complete either Ada or Rheport first and then the other SC, in a prospective, non-blinded, randomized controlled crossover trial. The primary outcome was the accuracy of the SCs in diagnosing an IRD, with the physicians' diagnosis as the gold standard. The secondary outcomes were patient-perceived usability, acceptance, and time to complete the SC.
Results
In this interim analysis, the first 164 patients who completed the study were analyzed. Of these, 32.9% (54/164) were diagnosed with an IRD. Rheport showed a sensitivity of 53.7% and a specificity of 51.8% for IRDs. Ada's top 1 (D1) and top 5 (D5) disease suggestions showed sensitivities of 42.6% and 53.7% and specificities of 63.6% and 54.5% for IRDs, respectively. The correct diagnosis was among Ada's D1 and D5 suggestions for 16.7% (9/54) and 25.9% (14/54) of IRD patients, respectively. The median System Usability Scale (SUS) scores of Ada and Rheport were 75.0/100 and 77.5/100, respectively. The median completion times were 7.0 min for Ada and 8.5 min for Rheport. Sixty-four percent and 67.1% of patients would recommend Ada and Rheport, respectively, to friends and other patients.
Conclusions
While SCs are well accepted among patients, their diagnostic accuracy is limited to date.
Trial registration
DRKS.de, DRKS00017642. Registered on 23 July 2019
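For readers who want to reconstruct the headline numbers: sensitivity is the proportion of IRD patients a checker flags as IRD, and specificity the proportion of non-IRD patients it clears. The sketch below recomputes Rheport's figures from counts back-calculated from the abstract (29 of 54 IRD patients flagged, 57 of 110 non-IRD patients cleared; these counts are our assumptions, not taken from the paper) and adds the standard System Usability Scale scoring formula, which the study presumably used:

```python
# Minimal sketch, not study code. The Rheport counts are back-calculated
# from the abstract (sensitivity 53.7%, specificity 51.8%, with 54 IRD and
# 110 non-IRD patients) and are assumptions for illustration only.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of IRD patients correctly flagged as IRD."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of non-IRD patients correctly cleared."""
    return true_neg / (true_neg + false_pos)

print(f"Rheport sensitivity: {sensitivity(29, 54 - 29):.1%}")   # 53.7%
print(f"Rheport specificity: {specificity(57, 110 - 57):.1%}")  # 51.8%

def sus_score(responses: list[int]) -> float:
    """Standard SUS scoring: 10 items rated 1-5; odd items score (r - 1),
    even items score (5 - r); the sum is scaled by 2.5 to a 0-100 range."""
    odd = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return (odd + even) * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0, Ada's median above
```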
“…Despite the mounting number of symptom checkers available and the adoption of this technology by credible health institutions and entities such as the UK National Health Service (NHS) and the government of Australia [9,10], knowledge surrounding this technology is limited [11]. The scarce literature on symptom checker accuracy suggests that the quality of diagnostic and triage advice differs across digital platforms [12], with those enabled by artificial intelligence listing the correct diagnosis first a higher percentage of the time [13].…”
Background
Young adults often browse the internet for self-triage and diagnosis. More sophisticated digital platforms such as symptom checkers have recently become pervasive; however, little is known about their use.
Objective
The aim of this study was to understand young adults’ (18-34 years old) perspectives on the use of the Google search engine versus a symptom checker, as well as to identify the barriers and enablers for using a symptom checker for self-triage and self-diagnosis.
Methods
A qualitative descriptive case study research design was used. Semistructured interviews were conducted with 24 young adults enrolled in a university in Ontario, Canada. All participants were given a clinical vignette and were asked to use a symptom checker (WebMD Symptom Checker or Babylon Health) while thinking out loud, and were asked questions regarding their experience. Interviews were audio-recorded, transcribed, and imported into the NVivo software program. Inductive thematic analysis was conducted independently by two researchers.
Results
Using the Google search engine was perceived to be faster and more customizable (ie, the ability to enter symptoms freely in the search engine) than a symptom checker; however, a symptom checker was perceived to be useful for a more personalized assessment. After using a symptom checker, most participants believed that the platform needed improvement in accuracy, security and privacy, and the use of medical jargon. Given these limitations, most participants believed that symptom checkers could be more useful for self-triage than for self-diagnosis. Notably, more than half of the participants were unaware of symptom checkers prior to this study, and most believed that this lack of awareness hindered their use.
Conclusions
Greater awareness of symptom checkers and their integration into the health care system are required to maximize the benefits of these platforms. Addressing the barriers identified in this study is likely to increase the acceptance and use of symptom checkers by young adults.
“…The use of artificial intelligence (AI) is expected to reduce diagnostic errors in outpatients [6, 7]. However, online symptom checkers that generate AI-driven differential-diagnosis lists on their own have failed to show high diagnostic accuracy [8, 9, 10]. On the other hand, a previous study demonstrated that providing AI-driven differential-diagnosis lists together with basic patient information such as age, sex, risk factors, past medical history, and current reason for the medical appointment could improve physicians' diagnostic accuracy [11].…”
Background: Previous work has shown that artificial intelligence (AI)-driven automated medical-history-taking systems combined with AI-driven differential-diagnosis lists improve physicians' diagnostic accuracy. However, given the negative effects of AI-driven differential-diagnosis lists, such as omission errors (physicians reject a correct diagnosis suggested by the AI) and commission errors (physicians accept an incorrect diagnosis suggested by the AI), the efficacy of AI-driven automated medical-history-taking systems without such lists should also be evaluated. Objective: The present study evaluated the efficacy of AI-driven automated medical-history-taking systems with or without AI-driven differential-diagnosis lists on physicians' diagnostic accuracy. Methods: This randomized controlled study was conducted in January 2021 and included 22 physicians working at a university hospital. Participants read 16 clinical vignettes based on AI-collected medical histories of real patients, for which the AI generated up to three differential diagnoses per case. Participants were divided into two groups: with and without an AI-driven differential-diagnosis list. Results: There was no significant difference in diagnostic accuracy between the two groups (57.4% vs. 56.3%; p = 0.91). Vignettes in which the AI-generated list included the correct diagnosis showed the greatest positive effect on physicians' diagnostic accuracy (adjusted odds ratio 7.68; 95% CI 4.68–12.58; p < 0.001). In the group with AI-driven differential-diagnosis lists, 15.9% of diagnoses were omission errors and 14.8% were commission errors. Conclusions: Physicians' diagnostic accuracy using AI-driven automated medical histories did not differ between the groups with and without AI-driven differential-diagnosis lists.
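The omission/commission taxonomy above is mechanical enough to sketch in code. Everything below is hypothetical (function name, example data); it illustrates only the definitions given in the abstract, not the study's actual coding procedure:

```python
# Hypothetical sketch of the error definitions above:
# omission   = physician rejects a correct diagnosis suggested by the AI;
# commission = physician accepts an incorrect diagnosis suggested by the AI.

def classify_outcome(ai_list: list[str], physician_dx: str, correct_dx: str) -> str:
    if physician_dx == correct_dx:
        return "correct"
    if physician_dx in ai_list:
        return "commission"  # accepted an incorrect AI suggestion
    if correct_dx in ai_list:
        return "omission"    # rejected the correct AI suggestion
    return "other"           # wrong diagnosis, not attributable to the AI list

# Hypothetical case: the AI listed the correct diagnosis, but the physician
# settled on a condition that was not on the list at all.
print(classify_outcome(
    ai_list=["pneumonia", "pulmonary embolism", "heart failure"],
    physician_dx="asthma",
    correct_dx="pneumonia",
))  # -> "omission"
```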