According to researchers at the Mayo Clinic, more than one-third of Americans search the internet for medical information, and this proportion has been increasing slowly and steadily for several years.
Not surprisingly, patients with medical conditions are highly likely to search the internet for information; however, many clinicians also perform searches. There are several reputable sources, including the aforementioned Mayo Clinic, and there is no shortage of unreliable sources.
Google and OpenAI’s ChatGPT have advanced to the point that they ‘understand’ questions asked in natural language (e.g., ‘Is there a pizza place near me that delivers?’). These applications often return surprisingly accurate answers.
Medicine Is Safety-Critical
Asking a medical question, however, differs substantially from asking about pizza delivery. In the artificial intelligence (AI) world, medical subject matter is considered ‘safety-critical.’ The phrase means that errors or inaccuracies in the answers can result in physical harm or fail to point the patient in the right direction.
For these reasons, researchers are motivated to create medical AI that returns results as good as (or better than) those of expert humans. In fields like pathology, which require (among other things) diagnosing diseases based on interpretations of microscopic images, AI has made substantial advances and is becoming a powerful tool for pathologists.
Doctor, Can I Ask You a Question?
Some experts would argue that the final frontier for AI is medical consultation. The patient has a sign or symptom and wants to ask an expert what they should do about it (if anything). Before the internet, the best option was to call a physician. Often, these calls would result in appointments for office visits.
[When I was practicing pediatrics, our friends often called with questions about their children. My wife had overheard me answer common questions so often that she would say, ‘Okay, here’s what he’s going to say…’ You could say she was a pre-internet version of ChatGPT].
Today, most people ask Google or ChatGPT before calling a doctor. Do Dr. Google and Dr. ChatGPT give the answers that an actual healthcare provider would give? The accuracy of answers to medical questions can be measured, and that is precisely what researchers have been doing with AI.
In July 2023, Nature published a landmark study on a large language model (LLM) that answers medical questions. LLMs are the algorithms underlying ChatGPT and similar applications. LLMs can be trained to improve the quality of their answers over time via a complicated process of fine-tuning, input from developers, user feedback, and (importantly) addressing biases.
The authors of the Nature article used a process called ‘instruction prompt tuning’ to improve a medical AI called Flan-PaLM, which was already ‘smarter’ than earlier AI algorithms but still answered medical questions less well than human clinicians. Instruction prompt tuning is a form of machine learning in which the AI is refined and optimized over time.
The result was an algorithm called Med-PaLM. To test the accuracy of its responses, the researchers posed common medical questions and compared the answers given by Med-PaLM with those given by human clinicians. The authors wanted to know whether the AI understood each question correctly and gave appropriate advice. They also tested whether the answers showed any evidence of bias that might make them inappropriate for certain demographic or ethnic groups.
The results were remarkable. Med-PaLM gave long-form answers that were 92.6% aligned with the scientific consensus, roughly equivalent to clinician-generated answers (92.9%). Similarly, 5.9% of Med-PaLM answers were rated as potentially leading to harmful outcomes, similar to the result for clinician-generated answers (5.7%).
The authors acknowledge that substantial work must be done on Med-PaLM before it can be used in the real world. Not least, the algorithm needs to understand questions asked by individuals with different backgrounds and levels of education to avoid providing misleading or inaccurate answers. Nevertheless, the main takeaway of the Nature study is that AI can answer medical questions posed in natural language in a manner consistent with actual human physicians.
Is This the End of the Medical Office Visit?
There are two essential pieces of a medical office visit: the history and the physical examination. The history (or, simply, the patient’s story) is the part that AI does so well. However, as good as AI has become, it has not yet replaced the physical examination, which a trained healthcare provider must perform. So, the office visit has not yet become a dinosaur.
Nevertheless, the history is the more important of the two when it comes to solving the patient’s problem. Many simple medical issues can be solved with virtual visits, as was demonstrated during the pandemic.
As AI continues to improve, physicians must confront the reality that AI can partially or completely replace their knowledge and experience. Physicians must work with AI-based tools and acknowledge that patients will consult AI before (and even after) visiting a clinic.
There are two consistent features in medicine: change and resistance to change. AI will continue to improve, and medical outcomes will improve simultaneously. Physicians may resist these changes but cannot argue with improved results (i.e., better quality of life, higher cure rates, longer symptom-free survival from diseases). Patients and physicians should welcome advances in medical AI. The best is yet to come.