Building AI Agent Applications Series - Using AutoGen to build your AI Agents
Published Feb 09 2024
Microsoft

In the previous post, we learned what an AI agent is. If you haven't read it yet, please see my earlier article, Understanding AI Agents. There are many frameworks for implementing AI agents, and AutoGen from Microsoft is a relatively mature one. AutoGen currently targets two programming languages, .NET and Python, with the Python version being the more mature of the two. This article is based on the Python version (https://microsoft.github.io/autogen). If you want to learn about the .NET version, visit https://microsoft.github.io/autogen-for-net


AutoGen Features

From the perspective of AI agents, AutoGen is compatible with different LLMs, provides tool chains for different tasks, and supports human-computer interaction. It is an open-source framework for orchestrating the interaction between agents. Its biggest strengths are automated task orchestration, workflow optimization, and powerful multi-agent conversation capabilities that can adapt to changing workflows or goals. The APIs provided by the framework cover the caching, error handling, LLM configuration, context management, and conversation-flow settings that an AI agent requires. Compared with Semantic Kernel and LangChain, which are frameworks geared toward building Copilot applications, AutoGen has more advantages in automated task-orchestration scenarios. After receiving a target task, AutoGen can plan and orchestrate it, whereas Semantic Kernel and LangChain are more like an ammunition library for the orchestration process, providing the various tools and methods needed to complete the task.

Getting started with AutoGen is very simple: only a little code is needed to configure an agent. By building a simple user proxy agent and an assistant agent, you can complete a basic agent setup. Here's how to quickly build a single agent:

 

1. Configuration file. For Azure OpenAI Service, the configuration is generally placed in an AOAI_CONFIG_LIST file in the root directory, such as:

 


[
    {
        "model": "Your Azure OpenAI Service Deployment Model Name",
        "api_key": "Your Azure OpenAI Service API Key",
        "base_url": "Your Azure OpenAI Service Endpoint",
        "api_type": "azure",
        "api_version": "Your Azure OpenAI Service version, eg 2023-12-01-preview"
    },
    {
        "model": "Your Azure OpenAI Service Deployment Model Name",
        "api_key": "Your Azure OpenAI Service API Key",
        "base_url": "Your Azure OpenAI Service Endpoint",
        "api_type": "azure",
        "api_version": "Your Azure OpenAI Service version, eg 2023-12-01-preview"
    },
    {
        "model": "Your Azure OpenAI Service Deployment Model Name",
        "api_key": "Your Azure OpenAI Service API Key",
        "base_url": "Your Azure OpenAI Service Endpoint",
        "api_type": "azure",
        "api_version": "Your Azure OpenAI Service version, eg 2023-12-01-preview"
    }
]


If you are using the OpenAI service, place an OAI_CONFIG_LIST file in the root directory with content such as:


[
    {
        "model": "Your OpenAI Model Name",
        "api_key": "Your OpenAI API Key"
    },
    {
        "model": "Your OpenAI Model Name",
        "api_key": "Your OpenAI API Key"
    },
    {
        "model": "Your OpenAI Model Name",
        "api_key": "Your OpenAI API Key"
    }
]


After completing the configuration, you can use Python to initialize the configuration list:


import autogen

config_list = autogen.config_list_from_json(
    env_or_file="AOAI_CONFIG_LIST",
    file_location=".",
    filter_dict={
        "model": ["Your Model Name"],
    },
)
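Under the hood, this loading and filtering step can be sketched in plain Python. The following is an illustrative stand-in for what `autogen.config_list_from_json` does with `filter_dict`, not AutoGen's actual implementation:

```python
import json

def load_config_list(path, filter_dict=None):
    # Load the JSON config file (e.g. OAI_CONFIG_LIST or AOAI_CONFIG_LIST)
    with open(path) as f:
        configs = json.load(f)
    if not filter_dict:
        return configs
    # Keep only entries whose fields match one of the allowed values
    return [
        cfg for cfg in configs
        if all(cfg.get(key) in allowed for key, allowed in filter_dict.items())
    ]
```

Filtering by model name this way lets one config file hold credentials for several deployments while each agent picks only the models it should use.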


 

2. Create a user proxy agent and an assistant agent

 


# Create an AssistantAgent instance named "assistant"
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={
        "cache_seed": 42,
        "config_list": config_list,
    }
)
# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="ALWAYS"
)

Notes:

A. The assistant agent is tied to the configuration file and uses a cache. The configuration gives the agent a powerful "brain" and cached memory.

B. The user proxy agent simulates human behavior, and you can set whether human intervention occurs. AI agents are characterized not only by human-like thinking but also by human-like interactive behavior. When solving problems with an AI agent, consider whether human intervention is needed: you can choose NEVER, but sometimes you must choose ALWAYS, for example when calling an API requires keys or supporting files that only a human can supply. Set it according to your own scenario.
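The choice comes down to the single `human_input_mode` parameter. The mode names NEVER, ALWAYS, and TERMINATE are AutoGen's actual values (TERMINATE asks for input only when the conversation would otherwise end); the helper below is purely illustrative, showing one way to reason about the choice:

```python
# Illustrative helper: choose a human_input_mode for a UserProxyAgent.
# "NEVER"     -> fully autonomous, no human input requested
# "ALWAYS"    -> ask the human before every reply (e.g. to supply keys or files)
# "TERMINATE" -> ask only when the conversation would otherwise end
def pick_human_input_mode(needs_human_secrets: bool, review_at_end: bool = False) -> str:
    if needs_human_secrets:
        return "ALWAYS"
    if review_at_end:
        return "TERMINATE"
    return "NEVER"
```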

 

3. The last step is to connect the user proxy agent and the assistant agent and give them a task:

 


messages = "tell me today's top 10 news in the world "

user_proxy.initiate_chat(assistant, message=messages)

We can clearly see how the agents complete the full interaction and generate code to fetch today's latest news. If you want to study this example, please visit 


AutoGen scenarios

There are many scenarios that can be implemented with AutoGen; you can learn from the examples at https://microsoft.github.io/autogen/docs/Examples. Here I will show how to use AutoGen in two scenarios, and then how AutoGen works through a detailed application scenario.

Scenarios

Case 1: Combining multi-modal capabilities to complete object detection

Requirement: during production, we need to detect safety helmets. If an employee is found not wearing a safety helmet, mark it.

With traditional AI approaches, we would need to collect images of people wearing helmets, label them, train a model through deep learning, and then run inference to mark violations. Now that we have multimodal models, much of this work can be simplified. In this scenario, we can combine a multimodal agent, a coding agent, and a code-execution agent to complete the work.

 

autogen_s1.png

 

AutoGen supports group chat, and multiple agents can be combined to complete tasks in a session. The code is as follows:


groupchat = autogen.GroupChat(agents=[user_proxy, checker, coder, commander], messages=[], max_round=10)

manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# The conversation is then started by sending the task to the manager,
# e.g. user_proxy.initiate_chat(manager, message="...")

Group chat can combine different agents to complete tasks, which makes for very interesting work. If you want to know more, please visit 

Case 2: AutoGen powered by Assistant API

The Assistant API is designed for AI agents. With it you can build AI agent applications with less code: it integrates state management, context handling, chat threads, and code execution, and makes it easier to access extensions such as the code interpreter, knowledge retrieval, and function calling. Although AutoGen already had similar functions before the Assistant API appeared, with its support AutoGen can be defined more flexibly in multi-agent scenarios, support more interactive scenarios, execute tasks more flexibly, and provide better end-to-end process management.

Note: at the time of writing, AutoGen did not support the Assistant API provided by Azure OpenAI Service, so this article is based on the OpenAI Assistant API.

Before using the Assistant API, the relevant Assistants must be created on the OpenAI or Azure OpenAI Service portal. For details, please refer to https://learn.microsoft.com/azure/ai-services/openai/assistants-quickstart

Using the Assistant API in AutoGen requires adjusting the configuration. The tools type can be set to code_interpreter, retrieval, or function.


llm_config = {
    "config_list": config_list,
    "assistant_id": "Your OpenAI Assistant ID",
    "tools": [{"type": "code_interpreter"}],
    "file_ids": [
        "Your OpenAI Assistant File 1",
        "Your OpenAI Assistant File 2"
    ],
}

Then set up the AI agent itself:


from autogen.agentchat.contrib.gpt_assistant_agent import GPTAssistantAgent

gpt_assistant = GPTAssistantAgent(
    name="Your Assistant Agent Name", instructions="Your Assistant Agent Instructions", llm_config=llm_config
)

If you want to try running the contents of this repo, please visit 

 

Build a visualization solution for AutoGen - AutoGen Studio

For enterprise solutions, many people prefer to use a combination of visualization and low-code methods to set up their workflows. AutoGen Studio brings enterprises a visual solution for customizing workflow-based agents.

 

autogen_studio.png

 

Installation

It is recommended to run AutoGen Studio in a Python 3.11 environment. You can use conda to create the environment and then install the AutoGen Studio package:


conda create -n agstudioenv python=3.11.7

conda activate agstudioenv

pip install autogenstudio 

Remember to configure OPENAI_API_KEY or AZURE_OPENAI_API_KEY before starting:


export OPENAI_API_KEY='Your OpenAI Key'

export AZURE_OPENAI_API_KEY='Your Azure OpenAI Service Key'


Start AutoGen Studio; the port is the network port and can be set as needed:


autogenstudio ui --port 8088


Use Case/Scenario

Everyone knows that I am a Premier League fan. I hope to build an AI agent that helps me analyze each Premier League team's situation in the new season based on the standings.


Assemble ammunition for your AI agent

AutoGen Studio now supports configuring skills, models, agents, and workflows. These four functions can be seen by selecting the Build menu.

 

1. Skills. Different functions can be added to the agent through Python. Here I add a get_league_standings skill.

 

Note: You need to register at https://www.football-data.org/ to get an API key

 


import requests
import json

def get_league_standings(api_key='Your football-data API Key'):
    """Fetch the current Premier League standings from football-data.org as JSON."""
    url = "http://api.football-data.org/v4/competitions/PL/standings"
    headers = {"X-Auth-Token": api_key}
    response = requests.get(url, headers=headers)
    data = response.json()

    standings = []

    if 'standings' in data:
        for standing in data['standings']:
            # Use only the overall table, skipping the HOME/AWAY splits
            if standing['type'] == 'TOTAL':
                for team in standing['table']:
                    team_data = {
                        "position": team['position'],
                        "teamName": team['team']['name'],
                        "playedGames": team['playedGames'],
                        "won": team['won'],
                        "draw": team['draw'],
                        "lost": team['lost'],
                        "points": team['points'],
                        "goalsFor": team['goalsFor'],
                        "goalsAgainst": team['goalsAgainst'],
                        "goalDifference": team['goalDifference']
                    }
                    standings.append(team_data)
                break

        return json.dumps(standings, ensure_ascii=False, indent=4)
    else:
        return "Error"
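To see what the skill produces without calling the live API, the same TOTAL-table extraction can be exercised against a mocked response. The sample data below is invented for illustration; real field values come from football-data.org:

```python
import json

# A minimal mocked response in the shape football-data.org returns
sample = {
    "standings": [
        {"type": "TOTAL", "table": [
            {"position": 1, "team": {"name": "Arsenal FC"}, "playedGames": 10,
             "won": 8, "draw": 1, "lost": 1, "points": 25,
             "goalsFor": 22, "goalsAgainst": 8, "goalDifference": 14},
        ]},
    ]
}

def extract_total_table(data):
    # Same filtering idea as the skill: keep only the overall (TOTAL) table
    for standing in data.get("standings", []):
        if standing["type"] == "TOTAL":
            return [
                {"position": t["position"], "teamName": t["team"]["name"], "points": t["points"]}
                for t in standing["table"]
            ]
    return []

print(json.dumps(extract_total_table(sample), indent=2))
```

Mocking the response this way also makes the skill's logic easy to unit-test before handing it to the agent.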


After saving, the skill appears as shown in the figure:

 

autogen_skills.png

 

2. Models corresponds to binding LLMs. Note that you need to set the OpenAI or Azure OpenAI Service key before starting. We add a binding for the gpt-4-turbo model. Here we use Azure OpenAI Service, so the deployment name and endpoint must correspond one-to-one with your Azure OpenAI Service settings.

 

autogen_models.png

 

3. Agents. Add your AI agents here; you can set up different agents. In our case we only need a single agent: add a football_expert_assistant agent, set a system role for it, and bind the get_league_standings skill and the model just added, as shown in the figure.

 

autogen_agents.png

 

4. Workflows. We can set the agent's workflow and its interactive dialogue flow. We choose the simplest two-agent interaction mode, "Two Agents."

 

autogen_wf1.png

 

We need to set up the Receiver and bind the football_expert_assistant agent and its LLM.

 

autogen_wf2.png

 

Running Your Agents

You can run your application through the Playground in the AutoGen Studio UI; you only need to create a Session and associate it with the configured workflow.


autogen_sessions.png

The result:

autogen_pg.png

 

Of course, you can also publish the agent from the Session view; published agents can be viewed through the Gallery menu.

Summary

AutoGen is a relatively comprehensive AI agent framework. For enterprises that want to build AI agents, it provides not only an application framework but also a visual, interactive UI, AutoGen Studio, which lowers the entry barrier and allows more people to take advantage of agents. We have taken the first step in building an AI agent with AutoGen, and will cover more advanced content in the following articles in this series.

Resources

  1. Microsoft AutoGen https://microsoft.github.io/autogen/

  2. Microsoft AutoGen Studio UI 2.0 https://microsoft.github.io/autogen/blog/2023/12/01/AutoGenStudio/

  3. AutoGen Studio: Interactively Explore Multi-Agent Workflows https://microsoft.github.io/autogen/blog/2023/12/01/AutoGenStudio/

  4. Azure OpenAI Service Assistant API Docs https://learn.microsoft.com/azure/ai-services/openai/assistants-quickstart
