Before choosing a clinical AI solution, make sure it is built on a framework of rigorous review and continuous advancement.
Artificial intelligence (AI) has the potential to revolutionize the healthcare industry through solutions that transform how providers and patients interact with information. Yet when it comes to clinical information and interpreting medical data that may have a direct impact on patient care, the healthcare industry must weigh AI’s potential against its inherent risks.
To advance large language model (LLM)-powered clinical generative AI (GenAI) and take advantage of its promise to interpret vast amounts of medical data and deliver insights quickly, both solution developers and their healthcare provider partners must recognize the critical importance of accountability in AI-driven healthcare solutions. Trust can only be built on a strong foundation of rigor and responsibility.
The stakes are high for responsible AI in healthcare
Clinical AI has been known to misread queries or make up responses, and often reminds users in disclaimers that it’s still learning. “It’s going to take a while for these models to become more reliable, but I’m a firm believer in the art of innovation and technology scaling to overcome such hurdles,” says Manish Vazirani, Vice President of Clinical Effectiveness Product Software Engineering for Wolters Kluwer Health.
When investigating clinical LLM tools for their organizations, Vazirani advises healthcare leaders to look for AI-driven clinical decision support solutions rooted in the same rigorous standards as traditional expert-curated sources. Through robust clinical review and evidence-based content, traditional clinical decision support resources like UpToDate® develop reliable medical information. Clinical LLM-powered GenAI must strive to meet an equal standard, Vazirani explains: not by replacing traditional expertise, but by complementing it with the speed and scalability AI offers, using a model trained on the same expert content, carefully curated to hone relevant responses.
When clinical LLM-powered GenAI operates without proper oversight, its responses are vulnerable to biases and incomplete data, blurring the line between trustworthy and erroneous information. For this reason, Vazirani emphasizes the importance of developers implementing ongoing internal reviews, aided or run by the right subject matter experts, to confirm that GenAI-generated content remains faithful to its expert-curated sources.
A responsible AI solution presents its own “unique dilemma” to engineers and clinical experts, Vazirani explains. “We’re trying to focus on responsible development over speed to get the balance of what ‘good’ looks like. And we have to consider additional prompts that take into account ethics and fairness.”