Azure AI services Blog

Tracing your Semantic Kernel Agents with Azure AI Foundry

PabloCastano
Apr 23, 2025

Many of us have encountered questions about monitoring Semantic Kernel agents. As developers, we want to understand several aspects: the prompts the Kernel sends to the AI service, what happens behind the scenes when the Kernel calls the functions we added as plugins, and the token usage during the communication between the AI service and the Kernel. These questions all boil down to how we can observe Semantic Kernel, and we can start answering them with Azure AI Foundry. So let's dive into it!

Adding the Azure AI Inference Connector to the Kernel 

The key step is to replace the chat completion service that we normally add to the Kernel with the Azure AI Inference connector, which is built on the Azure AI Inference client library.

This connector automatically emits traces that can be visualized in the Azure AI Foundry tracing UI, provided you have connected an Azure Application Insights resource to your Azure AI Foundry project. Let's look at the relevant part of the code.

import os
from typing import Any, List

# The original snippet omits most imports; the paths below are the ones assumed
# for this sample (the async Azure AI Inference client plus an Entra ID credential,
# and the Semantic Kernel agent / function-calling types used further down).
from azure.ai.inference.aio import ChatCompletionsClient
from azure.identity.aio import DefaultAzureCredential

from semantic_kernel import Kernel
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.azure_ai_inference import (
    AzureAIInferenceChatPromptExecutionSettings,
    AzureAIInferenceChatCompletion,
)
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.functions import KernelArguments

# SearchService, BingConnector and LogicAppPlugin are the plugin classes used in
# create_agent below; they are assumed to be defined or imported elsewhere in the project.

class Agent:
    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint

    @staticmethod
    def _create_kernel_with_chat_completion(
        agent_name: str, endpoint: str, deployment_name: str
    ) -> Kernel:
        kernel = Kernel()

        chat_completion_service = AzureAIInferenceChatCompletion(
            service_id=agent_name,
            ai_model_id=deployment_name,
            client=ChatCompletionsClient(
                endpoint=f"{str(endpoint)}/openai/deployments/{deployment_name}",
                credential=DefaultAzureCredential(),
                credential_scopes=["https://cognitiveservices.azure.com/.default"],
                api_version=os.environ["AZURE_OPENAI_API_VERSION"],
            ),
        )

        kernel.add_service(chat_completion_service)

        return kernel

    def create_agent(
        self, agent_name: str, deployment_name: str, instructions: str, tools: List[Any] = []
    ) -> ChatCompletionAgent:
        kernel = self._create_kernel_with_chat_completion(
            agent_name, self.endpoint, deployment_name
        )

        # Add tools to the kernel
        for tool in tools:
            if isinstance(tool, SearchService):
                kernel.add_plugin(tool, plugin_name="search")
            elif isinstance(tool, BingConnector):
                kernel.add_plugin(tool, plugin_name="bing")
            elif isinstance(tool, LogicAppPlugin):
                kernel.add_plugin(tool, plugin_name="get_blob_info")
            else:
                raise ValueError(f"Unsupported tool type: {type(tool)}")

        # Set up execution settings
        settings = kernel.get_prompt_execution_settings_from_service_id(
            service_id=agent_name
        )
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto()

        agent = ChatCompletionAgent(
            kernel=kernel,
            name=agent_name,
            instructions=instructions,
            arguments=KernelArguments(settings=settings),
        )

        return agent

The code defines an Agent class that uses Semantic Kernel to create AI agents with customizable instructions and tools, integrating Azure OpenAI deployments with other tools such as Bing and Logic Apps. Notice that we use the Azure AI Inference chat completion connector as our AI service, which is what enables tracing.
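
To make the flow concrete, here is a minimal usage sketch. The endpoint, deployment name, prompt, and the get_response invocation pattern are illustrative assumptions rather than part of the original sample, and may need adjusting to your Semantic Kernel version.

import asyncio

async def main() -> None:
    # Hypothetical values: replace with your own Azure OpenAI endpoint and deployment.
    # The Agent class above also expects AZURE_OPENAI_API_VERSION to be set in the environment.
    factory = Agent(endpoint="https://<your-resource>.openai.azure.com")

    support_agent = factory.create_agent(
        agent_name="support_agent",
        deployment_name="gpt-4o",
        instructions="You answer questions about the internal knowledge base.",
        tools=[],  # e.g. [SearchService(...), BingConnector(api_key=...)]
    )

    # Every model call the agent makes while producing this response is traced
    # by the Azure AI Inference connector.
    response = await support_agent.get_response(messages="How do I rotate my storage keys?")
    print(response)

asyncio.run(main())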

Observing the Kernel using AI Foundry

Voilà! With Azure AI Foundry, we can now effectively monitor and observe the Kernel app.

With the tracing in place, we are now able to observe:

  • The selection strategy (in multi-agent scenarios): Which agent was selected, based on the evaluation of a Kernel Function.
  • The tools employed by the agent: Which tools were called, along with the parameters the agent passed to them.
  • The tokens consumed: How many input/output tokens were consumed by each AI model service invocation.
  • The termination strategy (in multi-agent scenarios): Which agent was selected to finish the current interaction, based on the evaluation of a Kernel Function.
  • The latency: How long each step of the process took.
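
For these traces to actually reach the Foundry portal, the application's OpenTelemetry output also has to be exported to the Application Insights resource connected to the project. Below is a minimal sketch of one way to do this, assuming the azure-monitor-opentelemetry package is installed and the Application Insights connection string is available as an environment variable; Semantic Kernel's experimental GenAI diagnostics switches are also shown, since they control how much detail the Kernel emits.

import os

from azure.monitor.opentelemetry import configure_azure_monitor

# Semantic Kernel's experimental GenAI diagnostics switches. These are typically
# set in the environment before the application starts; the *_SENSITIVE variant
# also records prompts and completions, so enable it only where that is acceptable.
os.environ.setdefault("SEMANTICKERNEL_EXPERIMENTAL_GENAI_ENABLE_OTEL_DIAGNOSTICS", "true")
os.environ.setdefault("SEMANTICKERNEL_EXPERIMENTAL_GENAI_ENABLE_OTEL_DIAGNOSTICS_SENSITIVE", "true")

# Route OpenTelemetry traces to the Application Insights resource that is
# connected to the Azure AI Foundry project, so they show up in the tracing UI.
configure_azure_monitor(
    connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"],
)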

 

Conclusion

The Azure AI Inference connector in Semantic Kernel, built on the Azure AI Inference client library, is what connects the Kernel to AI models deployed in Azure AI Foundry. It supports inference operations such as chat completion and sends traces from the Kernel to the Foundry tracing UI. This helps us as developers monitor and diagnose agent and multi-agent applications, including key aspects such as which tools they employ and how many tokens they consume.

 

Code sample

For more details and code samples, please see this repository.

 

Updated Apr 24, 2025
Version 3.0