Generative AI in a Universe of Medical Protocols

The story behind one of the most famous medical protocols of our era, the digital first responders who brought it to the world, and how Generative AI can be used in an industry that is built on clinically validated protocols.

Feb 03, 2024

This is the unusual story behind a famous medical protocol. One that my team and I were closely involved in.

Protocols are used everywhere in medicine. They are sets of guidelines and rules that healthcare professionals follow to ensure consistent and effective patient care. Protocols can often be represented as decision trees, which clinicians use to make decisions based on the details of the clinical case.

There is clearly an inherent conflict between the non-deterministic nature of the answers coming from tools like ChatGPT and the need for clear, deterministic protocols in medicine. But before we dive into that, I want to tell you the unusual story of one of the most famous medical protocols of the past decade.

The Unusual Story of the CDC Protocol

One of the products built by my team was a platform that supported creating chatbots for healthcare. It was Generally Available and running in production at scale with many customers for several years, powering chatbots based on deterministic decision trees that customers could easily author.

When the COVID-19 virus hit, the impact on frontline healthcare workers was overwhelming. Knowing our chatbots product could quickly scale up and help healthcare systems automate some of the overload, I reached out to my leadership to convince them to use it more broadly.

It was early March 2020 when one of our existing customers, a US healthcare system, reached out to us – their nurse line was collapsing due to overload, and they asked for our help in building a quick assessment chatbot that could do initial risk evaluation of virus exposure, to help triage patients before they speak to a nurse.
And so, we did, and the chatbot went live in less than a week.

Soon thereafter, we started working with the Centers for Disease Control and Prevention (CDC). My software architect Arie and I worked closely with the CDC’s Chief Epidemiologist and her team, helping them create the first CDC COVID-19 symptom checking protocol for screening people and assessing risk based on comorbidities and other factors.
We worked around the clock, through holidays and weekends. Being seven hours apart, we worked together during our nights and their nights, meeting three times a day to build the protocol together, a protocol that we immediately open-sourced. The CDC chatbot went live within 3 days.

At the Frontlines of the Digital First Responders

We did not stop for air after the CDC chatbot went live. I called my co-founder Gil and told him - you know what happens next - and he knew.

What happened next was that we had our medical scientists immediately translate the protocol into Hebrew, Arabic and Russian. Soon thereafter, the protocol was translated into more languages: European, Nordic, and Asian languages. We then created templates in the product to enable quicker development of COVID-19 response chatbots.

Soon, people started referring to us as Digital First Responders. And we were indeed at the frontlines of the pandemic's digital response, helping field teams all around the world with their go-live, around the clock. Not typical for an R&D team that on normal days focuses on health tech innovation and creating AI products.

We started keeping score of how quickly healthcare systems went live. The first COVID-19 assessment bot was created in just four days. Then this timeline became two days, and then a customer went live with a bot in just one day: a hospital system in Belgium was at the top of our charts for going live within 24 hours using our technology. This was done thanks to the assistance of our field expert, Bert Hoorne. I later had the privilege of having Bert officially join our team. Belgium held on to that title for a full year before the Israeli Emergency Services took over first place, going live with a COVID-19 quarantine guidelines protocol within 7 hours, from zero to full production. Why we can’t do soccer the way we do tech beats me.

Right from the start, our company decided to create a special program for frontline healthcare organizations all around the world, giving them the service at no charge. That’s one of the most amazing things about working for a company that has such high moral values - it makes you so proud.

Usage scaled up enormously. During the first few months, we worked crazy hours, days and nights, to ensure the service kept up with demand. When a crisis hit India, we quickly deployed into the data center in India and went live within a few hours.

In a HLTH Matters interview, I shared that the protocol ended up being used by thousands of healthcare organizations all around the world. At some point we had 25 different Ministries of Health live on the service, from large countries to countries I had to look up on the map. Kids – do not skip your Geography classes.

2020 was a rough year for everyone. I spent that year working 16 to 20 hours a day (no kidding) from a small war room at my home office, becoming an insomniac. My team and I constantly watched the devastating death toll go up, but also the telemetry of our system’s adoption, serving people all around the world.

Over the course of the two years this protocol was running live, we served more than 2 billion chat conversations, providing service to more than 250 million people around the world.

I don’t know how many people we really helped save. What I do know is that for the frontline healthcare organizations, we’ve made a difference.

The Universe of Medical Protocols

Medical protocols are used very frequently in healthcare. Many times, protocols are formalized and validated clinically by boards of domain experts. They are typically based on research and clinical expertise and help in making informed decisions for specific clinical circumstances. Medical protocols can include medical guidelines for treatments, emergency procedures, clinical trial methods, risk assessment and more.

Some protocols are written as textual instructions. Some are formalized as decision trees or flow charts. They get refreshed and updated over time, and sometimes healthcare organizations maintain their own protocols for certain procedures.

Example of a protocol: Adrenal nodule 2017 (guidance for incidentally detected adrenal nodules on CT). Reference: Mayo-Smith WW, et al. Management of Incidental Adrenal Masses: A White Paper of the American College of Radiology Incidental Findings Committee. JACR 2017 Aug;14(8):1038-1044.

Medical protocols aim to provide clear and descriptive instructions that ensure consistent, high-quality patient care and safety. They function as comprehensive guidelines, providing details for the diagnosis, treatment, and management of various medical conditions. This level of standardization is crucial in a field where decisions can mean life or death. Medical protocols are developed through research, real-world evidence, and consensus among healthcare experts, aiming to reflect the latest and most effective practices. They also play a key role in training and educating medical personnel. Protocols help reduce variability in patient treatment, enabling a more uniform standard of care.
In emergency situations, where quick decision-making is critical, these protocols can be life-saving by providing clear, step-by-step procedures. Protocols are key in improving overall healthcare systems by providing a benchmark for quality and performance assessment.

The non-deterministic nature of Generative AI

By their nature, medical protocols are deterministic: the same clinical inputs always lead to the same recommendation. This is key in a healthcare setting, where predictability and consistency are crucial to ensuring a consistent standard of care.
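To make the deterministic property concrete, here is a minimal sketch of a protocol expressed as a decision tree. The tree, its question keys, and its outcomes are hypothetical and invented purely for illustration; this is not the CDC protocol or any clinical guidance, and it does not reflect how our product represented protocols.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Question:
    """An internal node: a yes/no question with one branch per answer."""
    key: str
    if_yes: Union["Question", str]
    if_no: Union["Question", str]


# Hypothetical toy tree for illustration only; not a clinical protocol.
TOY_PROTOCOL = Question(
    key="has_symptoms",
    if_yes=Question(
        key="has_risk_factor",
        if_yes="refer to nurse line",
        if_no="self-care guidance",
    ),
    if_no="no action needed",
)


def evaluate(node: Union[Question, str], answers: Dict[str, bool]) -> str:
    """Walk the tree from the root until a leaf (an outcome string) is reached."""
    while isinstance(node, Question):
        node = node.if_yes if answers[node.key] else node.if_no
    return node


case = {"has_symptoms": True, "has_risk_factor": False}
print(evaluate(TOY_PROTOCOL, case))
```

No matter how many times the same case is evaluated, the walk through the tree takes the same branches and ends at the same outcome. That repeatability is exactly what a clinically validated protocol relies on.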

Compared to that, responses coming from Generative AI can be seen as unpredictable and non-deterministic. Here are some of the factors that contribute to the variability in answers:

Randomness in sampling methods: When generating text, models like ChatGPT use sampling methods to choose between several options to predict the next word in a sequence. These methods can introduce randomness, meaning that given the same prompt, the model might produce different outputs.
Model weights and training data: The large amount of data and the complexity of neural networks mean that even slight variations in training data or weights can lead to different outputs.
Temperature setting: The temperature parameter controls the randomness in the generation process. A higher temperature increases diversity in responses, leading to more variability (this is sometimes referred to as creativity).
Session state and context: The model might give different answers based on the preceding conversation, even if the same question is posed, due to changes in context or how the model interprets the user's intent over time.
Model versions: Over time, models are updated to improve performance, add features, or address issues. Changes in the model version can lead to differences in how questions are answered, even with the same prompt.
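
The first and third factors above can be illustrated with a small sketch of temperature-based sampling. The candidate words and their scores are made up, and treating a temperature of zero as "always pick the top-scoring word" is a common convention that I am assuming here, not a description of how any particular model or service is implemented.

```python
import math
import random


def sample_next_token(logits, temperature, rng):
    """Sample one token from softmax(logits / temperature)."""
    if temperature <= 0:
        # Assumed convention: zero temperature means greedy decoding.
        return max(logits, key=logits.get)
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Hypothetical scores a model might assign to candidate next words.
LOGITS = {"rest": 2.0, "hydrate": 1.5, "isolate": 1.0}

rng = random.Random(0)
greedy = {sample_next_token(LOGITS, 0.0, rng) for _ in range(20)}
warm = {sample_next_token(LOGITS, 1.5, rng) for _ in range(200)}
print("temperature 0.0:", greedy)
print("temperature 1.5:", warm)
```

With the temperature at zero, the sketch returns the same word every time. With a higher temperature, lower-scoring words get picked as well, so the same prompt can lead to different continuations.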

While in other industries this variability can be seen as creative, diverse, and even a better user experience, it just does not fly in healthcare.

How can medical protocols and Generative AI co-exist?

Sure, you could provide the generative AI model with a reference to a protocol and ask it to follow that protocol, but how will you ensure that the protocol is actually followed?

What are you going to do about hallucinations? You certainly don’t want the generative AI model to make up steps, factors, or parts of the protocol.

What about cases where you would like to enable generative AI answers because no protocol exists? What about cases where you’d like to enable generative answers based on RAG (retrieval-augmented generation)?

The interim conclusion here is that in a healthcare setting, you want to enable a side-by-side approach, combining protocol-based decision tree flows with Generative AI-based flows, preferably ones that use RAG grounded in credible content.
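
One way to picture the side-by-side approach is as a router that sends a message to a deterministic protocol flow when one applies, and only otherwise falls back to a grounded generative answer. The sketch below is a simplification under several assumptions: the keyword matching, the `PROTOCOLS` and `CREDIBLE_CONTENT` tables, and the `generate_grounded_answer` stand-in are all hypothetical, and a real system would use proper intent detection, a search index over vetted sources, and an actual model call.

```python
from typing import Callable, Dict, List


def quarantine_protocol(message: str) -> str:
    """Stand-in for a deterministic, clinically validated decision tree flow."""
    return "PROTOCOL: starting the quarantine guidelines flow"


# Hypothetical registry: trigger keyword -> protocol flow.
PROTOCOLS: Dict[str, Callable[[str], str]] = {
    "quarantine": quarantine_protocol,
}

# Hypothetical placeholder passages standing in for vetted, credible content.
CREDIBLE_CONTENT: Dict[str, str] = {
    "vaccine": "[placeholder passage about vaccines from a vetted source]",
}


def retrieve(message: str) -> List[str]:
    """Naive retrieval: return passages whose topic appears in the message."""
    text = message.lower()
    return [passage for topic, passage in CREDIBLE_CONTENT.items() if topic in text]


def generate_grounded_answer(message: str, passages: List[str]) -> str:
    """Stand-in for a generative model call restricted to retrieved passages."""
    return "GENERATIVE (grounded): " + " ".join(passages)


def route(message: str) -> str:
    """Prefer a protocol flow; otherwise answer only from credible content."""
    text = message.lower()
    for keyword, handler in PROTOCOLS.items():
        if keyword in text:
            return handler(message)
    passages = retrieve(message)
    if not passages:
        return "FALLBACK: no protocol or credible source found, handing off to a human"
    return generate_grounded_answer(message, passages)


for msg in ["What are the quarantine rules?", "Tell me about the vaccine", "Hello"]:
    print(route(msg))
```

The design choice worth noting is the order of precedence: the protocol always wins when it applies, the generative path is allowed only when there is credible content to ground it, and anything else is handed off rather than answered freely.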

And back to 2020

At the end of 2020, our country got early access to COVID-19 vaccines, in an agreement with a pharmaceutical company that involved sharing de-identified medical data of the entire population. Some would later call that agreement one of the largest post-market surveillance collaborations of the century.

On New Year’s Eve, as 2020 ended, late at night, I was standing in line for vaccine leftovers in a field clinic. I did not meet the criteria for early vaccination: I was not in the eligible age group and had no chronic condition. But the early vaccine came in batches, and once they opened a batch, they had to use it all or throw it away at the end of the night shift. Typically, they would end up with a handful of leftovers that they gave to standbys.

There were not too many leftovers that night. But on December 31st, 2020, just minutes before midnight, I was the last person in the clinic to get the vaccine.