Saturday, October 05, 2024

AI will not take your job but someone using AI will – it may well replace Doctors?

This paper on ‘diagnostic reasoning’ (Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study by Goh et al.) hasn’t had enough attention. The authors fully expected Doctors plus GenAI to win. But GPT-4 on its own beat Doctors hands down.

One of the authors said the surprise was that the results broke the oft-quoted trope that “AI will not take your job but someone using AI will”.

GenAI, for some time, has been beating medical students hands down on clinical exams. But can it outperform real Doctors?

They used a randomised design, with 50 physicians from various medical institutions. Three approaches were compared. The Doctors were randomised into two groups, and both groups were then compared with GPT-4 used on its own:

1. Docs + conventional resources
2. Docs + GPT-4 & conventional resources
3. GPT-4 alone

Each had 60 mins to complete up to 6 clinical problems. Their diagnostic reasoning was measured on differential diagnosis accuracy, supporting/opposing factors and their next diagnostic steps.

SHOCK RESULTS

When used WITHOUT human input, GPT-4 scored 15.5 percentage points higher than the conventional resources group, outperforming both the physicians on their own and the hybrid approach.



1. Docs + conventional resources only (73.7%)
2. Docs + GPT-4 & conventional resources (76.3%)
3. GPT-4 alone (89.2%)
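
The 15.5 figure is the gap between two percentage scores, so it is a difference in percentage points rather than a relative increase. A minimal sketch of the arithmetic, using only the three scores listed above (the function name `gap_vs_baseline` is made up for illustration and is not from the paper):

```python
# Scores as quoted in this post, in percent.
scores = {
    "Docs + conventional resources": 73.7,
    "Docs + GPT-4 & conventional resources": 76.3,
    "GPT-4 alone": 89.2,
}

baseline = scores["Docs + conventional resources"]


def gap_vs_baseline(score, baseline):
    """Return (gap in percentage points, relative gain in percent)."""
    points = score - baseline
    relative = 100 * points / baseline
    return round(points, 1), round(relative, 1)


for name, score in scores.items():
    points, relative = gap_vs_baseline(score, baseline)
    print(f"{name}: {points:+.1f} points, {relative:+.1f}% relative")
```

On these numbers GPT-4 alone is 15.5 points ahead of the conventional resources group, roughly a 21% relative gain, while Doctors with GPT-4 are only 2.6 points ahead.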

Doctors using GPT-4 alongside conventional resources showed only a marginal improvement in diagnostic accuracy over the conventional resources group. The group with access to GPT-4 also took less time per case.

The study not only showed that GPT-4 excels at real-world diagnostic reasoning, it also measured that reasoning through structured reflection, giving richer insights than simple accuracy. Remember those complaints about transparency and AI? Well, here we have it.

Sure, it's a limited sample at 50. It's also puzzling that GPT-4 is better on its own than when used as an aid by the Doctors. Does this suggest that the Doctors are the confounding factor here? Turns out that what they were often doing was using GPT-4 as a search engine.

GenAI is here to stay in medicine and its diagnostic performance may surpass that of trained Doctors.

CONCLUSION

Misdiagnosis rates among General Physicians stand at around 5%, which sounds worse when you say 1 in 20. Let’s suppose AI on its own, a UNIVERSAL DOCTOR, has a misdiagnosis rate of less than 1%. I'm sure this will happen, now that reasoning has arrived. At that point you’d be a damn fool to go to your Doctor.
