Building AI Agent Applications Series - Assembling your AI agent with the Semantic Kernel
Published Feb 17 2024 08:53 AM
Microsoft

In the previous articles in this series, we covered the basic concepts of AI agents and how to build AI agent applications with AutoGen or Semantic Kernel combined with the Azure OpenAI Service Assistant API. Different scenarios and workflows require powerful tools to support the agent's operation; relying only on an agent framework's built-in tool chain to solve enterprise workflows is very limiting. AutoGen supports defining tool chains through Function Calling, so developers can define different methods to assemble extended business work chains. As mentioned before, Semantic Kernel offers strong business-oriented plug-in creation, management, and engineering capabilities. By combining AutoGen with Semantic Kernel, you can build powerful AI agent solutions.

Scenario 1 - Constructing a single AI agent for writing technical blogs

 

[Figure: agsk001.png]

 

As a cloud advocate, I often need to write technical blogs. In the past this required gathering a lot of supporting material. Although I could generate some of it with prompts and LLMs, specialized content might not meet the requirements. For example, I may want to write a post based on a recorded YouTube video and its syllabus. As shown in the picture above, the video script and the outline are combined around three questions to form the basic material, and the blog is written from there.

 

[Figure: agsk002.png]

 

Note: We first need to save the data as vectors. There are many ways to do this, and you can choose different frameworks for embedding. Here we use Semantic Kernel combined with Qdrant. Ideally, this step would be folded into the technical blog writing agent itself, which we will introduce in the next scenario.
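Before embedding, the transcript and syllabus are typically split into chunks. The helper below is a minimal, framework-agnostic sketch of that step; the function name and chunk sizes are assumptions for illustration, and Semantic Kernel ships its own text splitters that could be used instead.

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split a transcript into overlapping character windows before embedding."""
    chunks = []
    start = 0
    while start < len(text):
        # Each window is at most max_chars long; consecutive windows share
        # `overlap` characters so no sentence is cut off without context.
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk would then be embedded and stored in Qdrant with an id pointing back to its position in the source transcript.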

Because an AI agent simulates human behavior, the steps we set when designing it mirror my daily workflow:

  1. Find relevant content based on the question
  2. Set a blog title, extended content and related guidance, and write it in markdown
  3. Save

We can complete steps 1 and 2 through Semantic Kernel; for step 3, we can simply read and write files the traditional way. We define three functions here: ask, writeblog, and saveblog. Once they are complete, we configure Function Calling, setting the parameter schemas and function names for each of them.


llm_config={
    "config_list": config_list,
    "functions": [
        {
            "name": "ask",
            "description": "ask question about Machine Learning,  get basic knowledge",
            "parameters": {
                "type": "object",
                "properties": {
                    "question": {
                        "type": "string",
                        "description": "About Machine Learning",
                    }
                },
                "required": ["question"],
            },
        },
        {
            "name": "writeblog",
            "description": "write blogs in markdown format",
            "parameters": {
                "type": "object",
                "properties": {
                    "content": {
                        "type": "string",
                        "description": "basic content",
                    }
                },
                "required": ["content"],
            },
        },
        {
            "name": "saveblog",
            "description": "save blogs",
            "parameters": {
                "type": "object",
                "properties": {
                    "blog": {
                        "type": "string",
                        "description": "basic content",
                    }
                },
                "required": ["blog"],
            },
        }
    ],
}
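The llm_config above only declares the function schemas; the three Python functions themselves must exist before they are registered. The bodies below are a hedged sketch, not the article's actual implementation: in the real agent, ask performs a vector search over the embedded material with Semantic Kernel and Qdrant (stubbed here with a static lookup), while writeblog and saveblog are plain string formatting and file I/O.

```python
def ask(question: str) -> str:
    # Assumption: in the real agent this queries the Qdrant memory built
    # earlier via Semantic Kernel and returns the most relevant passages.
    # A static stub stands in for the retrieval step here.
    knowledge = {
        "What is Machine Learning?":
            "Machine Learning lets systems learn patterns from data.",
    }
    return knowledge.get(question, "No relevant material found.")

def writeblog(content: str) -> str:
    # Wrap the gathered material in a minimal Markdown skeleton.
    return "# Machine Learning Blog\n\n" + content + "\n"

def saveblog(blog: str) -> str:
    # Step 3 is plain file I/O -- no LLM involved.
    path = "blog.md"
    with open(path, "w", encoding="utf-8") as f:
        f.write(blog)
    return f"Saved to {path}"
```

Each function returns a string so that the UserProxy can feed the result back into the conversation as the function-call response.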

Because this is a single-agent application, we only need to define an Assistant and a UserProxy, state our goal, and describe the relevant steps in the prompt.


assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config=False
)

user_proxy.register_function(
    function_map={
        "ask": ask,
        "writeblog": writeblog,
        "saveblog": saveblog
    }
)


with Cache.disk():
    await user_proxy.a_initiate_chat(
        assistant,
        message="""
            I'm writing a blog about Machine Learning. Find the answers to the 3 questions below and write an introduction based on them. After preparing these basic materials, write a blog and save it.

            1. What is Machine Learning?
            2. The difference between AI and ML
            3. The history of Machine Learning

            Let's go
    """
    )


We tried running it, and it worked as expected. For the detailed output, please refer to:

Scenario 2 - Building a multi-agent interactive technical blog editor solution

In the scenario above, we successfully built a single AI agent for technical blog writing. We want the solution to be even more intelligent: content search, writing, saving, and translation all completed through AI agent interaction. We can achieve this with different job roles. AutoGen can do this by having LLMs generate code, but the uncertainty of that approach is fairly high; it is more reliable to define the extra methods through Function Calling to guarantee accurate calls. The following is a structural diagram of the role division:

 

[Figure: agsk003.png]

 

Note:

  1. Admin - Defines the various operations through UserProxy, including the most important methods.

  2. Collector KB Assistant - Responsible for downloading the subtitle scripts of technical videos from YouTube, saving them locally, extracting the different knowledge points, vectorizing them, and saving them to the vector database. Here I only handle the video subtitle script; you can also add support for local documents and different types of audio files.

  3. Blog Editor Assistant - When the collection assistant completes its work, it hands over to the blog editor assistant, who writes the blog as required from a simple question outline (setting the title, expanding the content, using Markdown format, etc.) and automatically saves the blog locally when finished.

  4. Translation Assistant - Responsible for translating the blog into different languages. Here it translates into Chinese (this can be expanded to support more languages).

Based on the division of labor above, we need to define different methods to support it. Here, Semantic Kernel can be used to complete the related operations.
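For example, before the Collector KB Assistant can download subtitles, it has to isolate the video id from the watch URL. The small helper below (a hypothetical name, using only the standard library) sketches that first step; the actual transcript download would then go through a transcript library or the YouTube API.

```python
from urllib.parse import urlparse, parse_qs

def youtube_video_id(url: str) -> str:
    # Extract the 11-character id from a standard watch URL,
    # e.g. https://www.youtube.com/watch?v=1qs6QKk0DVc -> 1qs6QKk0DVc
    return parse_qs(urlparse(url).query)["v"][0]
```

The returned id is what a subtitle-fetching call would take as its argument.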

We use AutoGen's group chat mode to complete the blog work. You can clearly see a team at work, which is the charm of the multi-agent approach. Set it up with the following code.


groupchat = autogen.GroupChat(
    agents=[user_proxy, collect_kb_assistant, blog_editor_assistant, translate_assistant],
    messages=[], max_round=30)

manager = autogen.GroupChatManager(groupchat=groupchat, llm_config={'config_list': config_list})


The code for group chat dispatch is as follows:


await user_proxy.a_initiate_chat(
    manager,
    message="""
            Use this link https://www.youtube.com/watch?v=1qs6QKk0DVc as knowledge with the collect knowledge assistant. Find the answers to the 3 questions below, write a blog, and save it to a local file with the blog editor assistant. Then translate this blog to Chinese with the translate assistant.

            1. What is GitHub Copilot ?
            2. How to Install GitHub Copilot ?
            3. Limitations of GitHub Copilot

           Let's go
"""
    )


Unlike the single-agent case, a manager is configured to coordinate the communication among the multiple AI agents. You also need clear instructions to assign the work.

You can view the complete code in this repo.

 

If you want to see the resulting English blog, you can click this link.

If you want to see the resulting Chinese blog, you can click this link.

More

AutoGen makes it easy to define different AI agents and plan how they interact and operate. Semantic Kernel acts more like a middle layer that supports the different ways agents solve tasks, which is a great help in enterprise scenarios. When AutoGen appeared, some people thought it overlapped with Semantic Kernel in many places; in fact, it complements Semantic Kernel rather than replacing it. With the arrival of the Azure OpenAI Service Assistant API, you can expect agents to gain stronger capabilities as the technical framework and APIs mature.

Resources

  1. Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel
  2. Microsoft AutoGen: https://github.com/microsoft/autogen
  3. Microsoft Semantic Kernel Cookbook: https://aka.ms/SemanticKernelCookBook
  4. Get started using Azure OpenAI Assistants: https://learn.microsoft.com/en-us/azure/ai-services/openai/assistants-quickstart
  5. What is an agent? https://learn.microsoft.com/en-us/semantic-kernel/agents
  6. What are Memories? https://learn.microsoft.com/en-us/semantic-kernel/memories/
Last update: Feb 23 2024 02:38 AM