Why Grounding of Generative AI matters
How meeting Sam Altman led to the realization that Grounding can go wrong, and what to do about it.
It was a warm Saturday night in May 2023. I was walking back home from a pro-democracy rally, processing my thoughts. Something troubled me that evening, and I couldn’t quite put my finger on it. And then it hit me.
Some background first.
Earlier that week, Sam Altman of OpenAI had visited our R&D center, and I got a chance to ask him for his view on the use of Generative AI in healthcare. By that time I already had some mileage with Generative AI, having joined the project several months earlier.
Over the last decade I have focused on building AI technologies for healthcare, driven by a deep personal motivation to improve healthcare for people all around the world. But this Generative AI was unique. The team and I had been experimenting with it for several months, looking into what it would take to adapt the technology to some of healthcare’s most challenging problems. The chat experience was super engaging, and the technology showed impressive results unlike anything we had seen before.
A few weeks earlier, Peter Lee, Carey Goldberg, and Isaac Kohane had published their book The AI Revolution in Medicine: GPT-4 and Beyond, inspiring us with a breadth of potential use cases. We were all excited. It felt like we had front-row seats to the best show in town. Except those were not front-row seats: we were, in fact, on stage, playing.
One of the first things we looked at was Grounding. If you have ever wondered where the AI gets its inspiration from, think about it this way: it uses grounding material. Grounding material is any kind of text, image, audio, or video that the AI uses as a reference. Grounding means providing relevant background material to the AI model to ensure that the result is closely linked to real-world context and to maximize the relevance of the generated output. For example, if you want the AI to answer a specific question about a certain medication, you might want to use grounding material about that medication, such as leaflets, relevant FDA data, and information about the therapeutic area.
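To make this concrete, here is a minimal sketch of what “providing grounding material” looks like in practice: the reference text is simply placed into the prompt alongside the user’s question. The leaflet excerpt and the instructions are illustrative placeholders, not real product information.

```python
# Minimal sketch: grounding is reference text placed into the prompt.
# The leaflet excerpt below is an illustrative placeholder, not real FDA data.
grounding_material = """
Medication X leaflet (excerpt):
- Recommended adult dose: 10 mg once daily.
- Common side effects: headache, nausea.
"""

user_question = "What is the recommended adult dose of Medication X?"

grounded_prompt = (
    "Answer the question using ONLY the grounding material below. "
    "If the answer is not in the material, say you don't know.\n\n"
    f"Grounding material:\n{grounding_material}\n\n"
    f"Question: {user_question}"
)

print(grounded_prompt)  # this grounded prompt is what gets sent to the model
```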
We experimented with Retrieval-Augmented Generation (RAG) early on. RAG is a technique for enhancing the accuracy of Generative AI models with grounding information that is fetched from external sources. The main idea behind RAG is to search for relevant data in external content sources that contain up-to-date information, and then use that information as grounding material for the Generative AI model, before generating a response to the question. RAG allows you to bring your own content sources for the AI to rely upon when you build your chat experience.
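As a rough illustration of that flow, here is a toy sketch: retrieve the most relevant passages from your own content sources, then use them as grounding for the model. The keyword-overlap retriever and the sample documents are stand-ins; a real system would use embeddings and a proper search index, and the resulting prompt would be sent to the Generative AI model.

```python
# Toy sketch of the RAG flow: retrieve relevant passages, then ground the model with them.
# A real system would use embeddings and a search index; keyword overlap is only illustrative.
documents = [
    "Medication X: recommended adult dose is 10 mg once daily.",
    "Medication Y: contraindicated in patients with liver disease.",
    "Clinic opening hours: Monday to Friday, 8am to 6pm.",
]

def retrieve(question: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the question into a grounded prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the grounding material below.\n\n"
        f"Grounding material:\n{context}\n\n"
        f"Question: {question}"
    )

question = "What is the adult dose of Medication X?"
prompt = build_grounded_prompt(question, retrieve(question, documents))
print(prompt)  # this grounded prompt would then be sent to the Generative AI model
```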
But here's the catch: not all grounding material is created equal. Some grounding material might be biased, inaccurate, misleading, or even harmful.
And that was what troubled me so deeply that Saturday night: the realization of what could happen if this technology were abused by malicious entities, using grounding material laced with conspiracy theories, propaganda, or fake news to create Generative AI-based chat experiences that spread lies and misinformation. And because I am not naïve when it comes to cyber threats, what concerned me even more was the realization that this might already be happening.
In the context of healthcare, this is even more disturbing, as this type of abuse could have immediate, harmful consequences for people’s lives. The stakes in healthcare are higher, and if such abuse spread virally, it could even have an impact on population health. What if the grounding material contains false information about medications? What if false grounding is used to create a super-engaging chat experience that convinces people to avoid certain treatments, or that encourages self-harm?
Deep breath. Stomachache.
The next day I dived into some deep testing, using Azure OpenAI. I was determined to see what would happen if I fed grounding material with false information to the model. I synthesized ridiculous material about the earth being flat, and some nonsense about the Covid vaccine implanting chips into people and causing them to grow tails as a side effect, then threw it at the model as grounding material to see what would stick.
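For readers who want to picture the setup, here is a rough sketch of that kind of test, using the openai Python package’s Azure client. The endpoint, API key, deployment name, and the flat-earth “grounding” text are placeholders, not the exact material or configuration I used.

```python
# Rough sketch of the experiment: feed deliberately false "grounding" to the model
# and see whether it follows it. Endpoint, key, deployment, and API version are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR-API-KEY",
    api_version="2024-02-01",
)

false_grounding = "Reference document: The earth is flat and shaped like a disc."  # deliberately false

response = client.chat.completions.create(
    model="YOUR-DEPLOYMENT-NAME",  # the name of your Azure OpenAI deployment
    messages=[
        {"role": "system",
         "content": "Answer only based on the grounding material provided by the user."},
        {"role": "user",
         "content": f"{false_grounding}\n\nQuestion: What is the shape of the earth?"},
    ],
)

print(response.choices[0].message.content)
```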
What happened next was pretty cool. Azure OpenAI rejected my trolling attempts with a sophisticated response. Even though I asked it to base its answers only on the grounding I provided, the model refused to use my grounding material to answer the relevant questions, pointed out that the information was wrong, and gave correct answers instead. Classy. This demonstrated that the model has built-in safeguards, at least against commonly known conspiracy theories and fake news.
So, what does this mean, are we all good now?
Not really, no. Because fake news and misinformation can be nuanced, and not always as easy to detect as the chip-in-the-Covid-vaccine nonsense. And because there are many Generative AI models out there, some of them open source, and not all of them necessarily employ safety controls and responsible AI measures as rigorous as the ones built into Azure OpenAI.
Vetting the grounding material used for Generative AI is critical. This can be done by checking the source, quality, and reliability of the material, and by looking for the evidence behind each answer. Grounding material should be factual, truthful, relevant, diverse, and respectful; it should not be outdated, biased, or harmful. Moreover, it should come from credible sources and go through fact-checking to eliminate misinformation.
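There is no single function that does all of this for you, but as a hedged sketch under simple assumptions, a vetting step might apply a checklist to each document before it is allowed into the grounding index. The checks and the allowlist below are hypothetical examples, not a complete or sufficient solution.

```python
# Hypothetical sketch of a vetting step applied before material enters the grounding index.
# The checks and the trusted-source allowlist are illustrative only.
from dataclasses import dataclass
from datetime import date

TRUSTED_SOURCES = {"fda.gov", "who.int", "internal-medical-affairs"}  # example allowlist

@dataclass
class GroundingDocument:
    source: str          # where the material came from
    published: date      # publication date, to catch outdated content
    fact_checked: bool   # whether it passed an editorial fact-check
    text: str

def is_acceptable(doc: GroundingDocument, max_age_years: int = 5) -> bool:
    """Return True only if the document passes these basic vetting checks."""
    recent = (date.today() - doc.published).days <= max_age_years * 365
    return doc.source in TRUSTED_SOURCES and doc.fact_checked and recent

doc = GroundingDocument("fda.gov", date(2023, 1, 15), True, "Medication X leaflet ...")
if is_acceptable(doc):
    print("Document admitted to the grounding index.")
```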
Fake news and disinformation are the pandemic of our era, tearing societies apart and biasing public opinion. This is why the recent Biden Executive Order on AI and the kind-of-parallel EU AI Act, both intended to enforce responsible use of AI, are so timely and critical. Yet both are far from perfect.
More on that soon.
Welcome to the Verge of Singularity
Verge of Singularity is a blog that discusses Artificial Intelligence technology, its application in health & life sciences and the implications of emerging AI technologies on health tech and society. Posts are based on personal experiences. Opinions are my own.