Productivity tools and secure access to large language models are among the most useful applications of AI in the health care system, according to Rebecca Mishuris, chief health information officer and vice president of digital at Mass General Brigham in Boston. She shared her views in an interview earlier this month at the HIMSS Conference in Las Vegas.
For example, Mishuris noted that Mass General has seen strong adoption of Microsoft Copilot, which helps clinicians draft emails, summarize information and generate presentations.
She also noted that the health system has established secure internal access to large language models, which allows clinicians and researchers to safely experiment with AI while using protected health information. This access has already allowed researchers to create an AI agent that can summarize a new patient’s decades of medical records for clinicians before a visit, Mishuris said.
Overall, she said Mass General is showing “cautious optimism” when it comes to AI.
“We see the real transformative power of many generative AI applications, but we also work very hard to ensure that we deploy them safely, protecting the care we provide and the privacy and security of the data we use,” noted Mishuris.
Any deployment of AI must demonstrate a clear positive impact on the health care system without compromising these standards, she added.
For an AI deployment to be successful, Mishuris noted that health systems must align people, processes and technology. Technology alone isn’t enough: She said Mass General is investing heavily in staff AI training, helping employees understand what generative AI can and can’t do, how to use it safely and how it fits into workflows.
Once an AI solution is launched, it needs to be monitored on multiple levels, Mishuris said. She described three types of monitoring at Mass General: real-time monitoring during patient care to immediately detect potential hallucinations, short-term retrospective monitoring days or weeks later to review large-scale model results and identify potential problems, and ongoing performance monitoring to ensure tools continue to deliver expected results.
But overall, Mishuris emphasized that the measure of success depends on the problem the AI is intended to solve.
There is no universal measure of AI success: for example, a tool aimed at reducing clinician burnout should be evaluated differently than a tool designed to improve revenue cycle efficiency, she explained.
She also emphasized that AI should be judged against actual performance and not perfection.
When evaluating AI tools, the comparison should be how they perform against current workflows. In some cases, humans already make similar mistakes. The key question, then, is whether AI works as well or better than the status quo.
“There was actually a study in California which showed that humans hallucinate as much as the computer when it summarizes a patient’s discharge from the hospital. And so if you get a result like that, if it’s the same thing, if humans have hallucinations and computers have hallucinations, then what’s the risk of moving to computers?” remarked Mishuris.
Ultimately, she said, the value of any AI tool depends on its ability to significantly improve workflows or patient care compared to the reality clinicians face today.
Photo: Malte Mueller, Getty Images
