The Future of AI: Exploring Multi-Agent AI Systems

Microsoft

Sep 03, 2024

Welcome to our new series of blog posts about the future of AI! In these posts, my AI Futures team and I will be exploring the technologies that will take us to the next phase of AI.

AI agents are all the rage lately. With their ability to reason through complex tasks, AI agents can answer questions and take actions even without being explicitly programmed to do so. Multi-agent systems in particular are gaining traction for their ability to provide checks and balances. I recently wrote an example of a multi-agent system, in the form of a questionnaire-answering agent. I wrote it to automate the process of answering questions from RFPs or questionnaires, something my job commonly requires - but it also makes a great demo.

We've seen customers already using multi-agent systems for a few different use cases. One of our customers uses them to create accurate SQL queries for complex data analyses. Another one uses them to ensure their marketing content is correct and brand-aware. Yet another uses them to answer RFPs and questionnaires. I caught my questionnaire agents in action in this video:

I built this multi-agent system using Semantic Kernel and its new multi-agent functionality. This was actually the first time I'd tried to use Semantic Kernel, having previously used other agent frameworks like Azure OpenAI Assistants and AutoGen. I found Semantic Kernel to be remarkably elegant, and the amount of code I actually had to write to make this work was minimal. However, because the multi-agent framework in Semantic Kernel was brand new, it was challenging to find documentation or sample code (and GitHub Copilot knew nothing about it, having never seen it before). Nonetheless, I was able to put it together in just a couple of hours, and it works remarkably well, making this part of my job much easier.

The system comprises four distinct agents:

Question Answerer
Answer Checker
Link Checker
Manager

Each plays a crucial role in ensuring the accuracy and reliability of the system's responses. The process begins with the Question Answerer agent, which attempts to provide an initial answer. This answer is then scrutinized by the Answer Checker agent. Both the Question Answerer and the Answer Checker agents are grounded in public data sources, as I gave them both the ability to search the web using Bing. The Link Checker agent verifies that any links in the answer are valid, addressing the common problem of hallucinated links. Finally, the Manager agent oversees the entire operation, making the final decision on the answer's validity.
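To make the Link Checker's job concrete, here is a minimal Python sketch of the underlying idea: pull every URL out of a draft answer and report the ones that fail a reachability check. This is an illustration only, not the code from my repo, where the Link Checker is an LLM agent. The names extract_links, find_broken_links, and is_reachable are hypothetical, and the reachability check is passed in as a function so the sketch runs without network access.

```python
import re
from typing import Callable, List

# Simple pattern for http(s) URLs; stops at whitespace and brackets.
URL_PATTERN = re.compile(r"https?://[^\s()<>\[\]]+")


def extract_links(answer: str) -> List[str]:
    """Return every http(s) URL in the answer, minus trailing punctuation."""
    return [url.rstrip(".,;:") for url in URL_PATTERN.findall(answer)]


def find_broken_links(answer: str, is_reachable: Callable[[str], bool]) -> List[str]:
    """Return the links in the answer that the supplied check rejects."""
    return [url for url in extract_links(answer) if not is_reachable(url)]


# Usage with a stand-in check (assumption: a real check would issue an HTTP request).
draft = "See https://example.com/docs. Also (https://bad.example/x)."
broken = find_broken_links(draft, lambda url: "bad.example" not in url)
print(broken)
```

Any non-empty result would be a reason to send the draft back for a rewrite rather than on to the Manager.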

As you can see in the above video, some questions can provoke an "agent debate," where different agents disagree with each other. When a debate occurs, the Manager agent forces a rewrite cycle. I capped this cycle at a total of 25 agent interactions - although only once have I been able to provoke such a long-running debate. Usually, the agents come to an agreement within one or two turns.
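The rewrite cycle and its cap can be sketched in plain Python. This is a simplified illustration rather than the Semantic Kernel code from my repo: the agents here are stub functions instead of LLM calls, the round-robin turn order is an assumption, and the "APPROVED" and "REJECTED" markers are hypothetical. Only the cap of 25 interactions comes from the actual system.

```python
from typing import Callable, List, Tuple

MAX_INTERACTIONS = 25  # the cap described above

Transcript = List[Tuple[str, str]]   # (agent name, message) pairs
Agent = Callable[[Transcript], str]  # an agent reads the transcript and replies


def run_debate(agents: List[Tuple[str, Agent]],
               max_interactions: int = MAX_INTERACTIONS) -> Tuple[Transcript, bool]:
    """Cycle through the agents until the Manager approves or the cap is hit."""
    transcript: Transcript = []
    for turn in range(max_interactions):
        name, agent = agents[turn % len(agents)]
        message = agent(transcript)
        transcript.append((name, message))
        if name == "Manager" and message.startswith("APPROVED"):
            return transcript, True
    return transcript, False  # cap reached without agreement


# Stub agents standing in for the real LLM-backed ones.
def question_answerer(transcript: Transcript) -> str:
    drafts = sum(1 for name, _ in transcript if name == "QuestionAnswerer")
    return f"draft {drafts + 1}"


def answer_checker(transcript: Transcript) -> str:
    # Stub behavior: reject the first draft, accept any later one.
    return "INCORRECT" if transcript[-1][1] == "draft 1" else "CORRECT"


def link_checker(transcript: Transcript) -> str:
    return "LINKS OK"


def manager(transcript: Transcript) -> str:
    verdicts = [msg for name, msg in transcript if name == "AnswerChecker"]
    return "APPROVED" if verdicts[-1] == "CORRECT" else "REJECTED"


AGENTS = [("QuestionAnswerer", question_answerer), ("AnswerChecker", answer_checker),
          ("LinkChecker", link_checker), ("Manager", manager)]

transcript, approved = run_debate(AGENTS)
print(approved, len(transcript))
```

With these stubs, the first draft is rejected and the second is approved, so the debate ends after two passes through the four agents. A Manager that never approves would stop the loop at exactly 25 interactions.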

It's worth noting that, although I used GPT-4o from Azure OpenAI Service for all four agents, I could have used a different model for each agent. This can provide economic benefits - for example, the Link Checker doesn't have to do much work and could probably use a small model like Phi-3. Using different models for debating agents can also result in different viewpoints, which can be beneficial. I could also have grounded my agents in a private dataset instead of the public web - but in this case, the questionnaires I was filling out require publicly accessible information, so the public web was the right choice.
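One way to picture a mixed-model setup is as a small per-agent configuration table. The sketch below is hypothetical: AgentConfig and its fields are names I made up for illustration, the deployment names are placeholders, and whether the Link Checker and Manager use web search is an assumption. It is not how the repo is configured, since I used GPT-4o everywhere.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AgentConfig:
    name: str
    deployment: str        # placeholder model deployment name
    uses_web_search: bool  # whether the agent is grounded via web search


# Hypothetical assignment: a small model for the lightweight Link Checker.
AGENT_CONFIGS = [
    AgentConfig("QuestionAnswerer", "gpt-4o", uses_web_search=True),
    AgentConfig("AnswerChecker", "gpt-4o", uses_web_search=True),
    AgentConfig("LinkChecker", "phi-3-mini", uses_web_search=False),
    AgentConfig("Manager", "gpt-4o", uses_web_search=False),
]


def agents_by_deployment(configs: List[AgentConfig]) -> Dict[str, List[str]]:
    """Group agent names by the model deployment they are assigned to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for config in configs:
        grouped[config.deployment].append(config.name)
    return dict(grouped)


print(agents_by_deployment(AGENT_CONFIGS))
```

Keeping the assignment in one table makes it easy to swap a model for a single agent and compare cost or answer quality.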

Try it for yourself! Just clone this GitHub repo and set the necessary environment variables to point to your own Azure OpenAI Service and Bing resources, as shown in the README: mcasalaina/QuestionnaireMultiagent (github.com).