autogen
11 TopicsBuilding Custom Chat AI: A Comprehensive Guide for Developers
In today's rapidly evolving digital landscape, the integration of artificial intelligence (AI) into business operations has become a pivotal strategy for companies aiming to enhance their customer engagement and streamline their processes. This article delves into the foundational steps and considerations for developers embarking on the journey of building a custom chat AI for their company website. From understanding the core concepts of AI to selecting the right models and implementing effective prompt engineering techniques, this guide provides a comprehensive overview to help developers navigate the complexities of AI development. Whether you are a beginner or have some experience in the field, the insights shared here will equip you with the knowledge and tools needed to create a robust and efficient chat AI tailored to your business needs. A discussion will be held with Nitya Narasimhan, Senior Cloud Advocate at Microsoft specializing in AI, and Wey Gu, a Chinese AI MVP, to delve into these critical topics. What are the first steps a developer should take when starting to build a custom chat AI for their company website? Nitya: If you are new to AI, start by familiarizing yourself with the core concepts and usage of AI models. A course like Generative AI for Beginnerscan be a great starting point. Next, get hands-on experience with models by trying out GitHub Models, which are free to use with just a GitHub account. This will help you build your intuition for model selection and prompt engineering. If you already have some experience, the initial steps to building a custom chat AI are as follows: Identify the use case and requirements (e.g., typical questions asked and valid responses). Choose a model to start prototyping (test the question with various models and compare results). If your chat AI is grounded in your data, identify the data sources and formats (where and what). Select an AI app template to jumpstart development and customize it with your model and data choices. How does understanding model choice impact the development of a custom chat AI? Nitya: Understanding model choice is crucial for developing a custom chat AI. It involves evaluating models based on three key factors: cost, customization, and performance. Customization: Start by identifying the task you want to execute (e.g., chat, image, embeddings, agents). Filter models that support this capability and validate them with a test prompt to ensure they fit your requirements. This process will narrow down your options from thousands to a few suitable models. Cost: Consider whether the model supports serverless deployments (pay-as-you-go, per token) or managed deployments (subscription-based, per VM). Evaluate costs not just for usage (chat completion) but also for end-to-end development (evaluations, iterative ideation). Performance: Assess models based on latency (e.g., chat completions vs. reasoning models) and the quality and safety of responses. Understand default model characteristics (model card) and perform custom evaluations to ensure quality for your desired prompts dataset. Can you explain the concept of prompt engineering and how it can be applied using GitHub models? Nitya: Prompt engineering involves guiding the model on how to process questions and generate responses to improve quality. Think of developers as teachers and models as students being taught to answer exam questions. Prompt engineering provides a rubric to guide models in giving relevant answers. This includes providing examples, creating personas (e.g., "answer politely using formal language"), defining output formats (e.g., "answer in 1-2 sentences", "reply with results in JSON format"), and configuring model parameters (e.g., temperature, stop-words, top-p, max tokens). When working with GitHub models, you can configure models using the Playground (UI) or move to an IDE with the Azure AI Inference API, offering both low-code and code-first options for prompt engineering. What is retrieval augmented generation (RAG), and how does it enhance the ability to chat with data? Wei: RAG involves grounding user questions in retrieved knowledge from private data sources to ensure responses are relevant to the application scenario. It works by wrapping the initial user prompt in a prompt template to create the final model prompt sent to the model. The RAG workflow includes retrieval of knowledge, augmentation of the prompt, and generation of the response. This dynamic process provides relevant grounding data and instructions to contextualize user questions for app-required responses. What are some practical tips for developers to streamline their end-to-end journey from catalog to cloud? Nitya: Here are three tips to get started: Model Selection: Use GitHub Models with diverse test prompts to build intuition for prompt engineering and model capabilities. Compare models side-by-side. Copilot Development: Start with an Azure AI app template. Deploy it to understand the application and its architecture before customizing it to your needs. Validate your development environment and get familiar with tools. Safety & Evaluation: Explore built-in content safety filters and evaluators in the Azure AI platform to understand metrics and effectiveness of your prompt engineering or RAG strategy. Use tracing and App Insights to monitor performance and cost. What are some common challenges developers might face when building a custom chat AI, and how can they overcome them? Nitya and Wei: There are many challenges we can think of - here are three that are important: App Architecture: Understand the app architecture for your scenario (e.g., RAG, multi-agent). Explore existing AI app templates to build intuition and customize one that fits your requirements. Model Choice: Choose models based on cost, quota availability, and flexibility for future configuration. Use the Azure AI model inference API to abstract provider-specific SDKs and decouple your code from your choice, allowing for easier model swaps later. Observability: Debug issues in app development or execution performance. Use platforms and tools that bring observability to the end-to-end workflow. Activate App Insights and use tracing tools to generate telemetry for insights locally or in production. What resources and samples are available for further exploration into this subject? Wei: Explore Azure AI App Templates, For Beginners Curricula, RAG Chat Workshop, AI Tour Workshops and Generative AI for Beginners. For more workshops and talks, visit https://aka.ms/aitour/repos. Feel free to check out opensource projects like AutoGen, LlamaIndex, LangChain, and CamelAI's documentation. As we conclude this exploration into building a custom chat AI for your company website, it's clear that the journey is both challenging and rewarding. By understanding the core concepts of AI, selecting the right models, and mastering prompt engineering, developers can create a powerful tool that enhances customer engagement and streamlines business operations. The insights and practical tips shared in this article provide a solid foundation for embarking on this journey. Remember, the key to success lies in continuous learning and adaptation. As AI technology evolves, you should also adapt your approach to developing and refining your chat AI. Stay curious, stay innovative, and most importantly, stay committed to delivering the best possible experience for your users.234Views1like0CommentsAI Agents: Planning and Orchestration with the Planning Design Pattern - Part 7
This blog post, Part 7 in a series on AI agents, focuses on the Planning Design Pattern for effective task orchestration. It explains how to define clear goals, decompose complex tasks into manageable subtasks, and leverage structured output (e.g., JSON) for seamless communication between agents. The post includes code snippets demonstrating how to create a planning agent, orchestrate multi-agent workflows, and implement iterative planning for dynamic adaptation. It also links to a practical example notebook (07-autogen.ipynb) and further resources like AutoGen Magnetic One, encouraging readers to explore advanced planning concepts. Links to the previous posts in the series are provided for easy access to foundational AI agent concepts.1.6KViews1like0CommentsHow to use any Python AI agent framework with free GitHub Models
I ❤️ when companies offer free tiers for developer services, since it gives everyone a way to learn new technologies without breaking the bank. Free tiers are especially important for students and people between jobs, when the desire to learn is high but the available cash is low. That's why I'm such a fan of GitHub Models: free, high-quality generative AI models available to anyone with a GitHub account. The available models include the latest OpenAI LLMs (like o3-mini), LLMs from the research community (like Phi and Llama), LLMs from other popular providers (like Mistral and Jamba), multimodal models (like gpt-4o and llama-vision-instruct) and even a few embedding models (from OpenAI and Cohere). With access to such a range of models, you can prototype complex multi-model workflows to improve your productivity or heck, just make something fun for yourself. 🤗 To use GitHub Models, you can start off in no-code mode: open the playground for a model, send a few requests, tweak the parameters, and check out the answers. When you're ready to write code, select "Use this model". A screen will pop up where you can select a programming language (Python/JavaScript/C#/Java/REST) and select an SDK (which varies depending on model). Then you'll get instructions and code for that model, language, and SDK. But here's what's really cool about GitHub Models: you can use them with all the popular Python AI frameworks, even if the framework has no specific integration with GitHub Models. How is that possible? The vast majority of Python AI frameworks support the OpenAI Chat Completions API, since that API became a defacto standard supported by many LLM API providers besides OpenAI itself. GitHub Models also provide OpenAI-compatible endpoints for chat completion models. Therefore, any Python AI framework that supports OpenAI-like models can be used with GitHub Models as well. 🎉 To prove it, I've made a new repository with examples from eight different Python AI agent packages, all working with GitHub Models: python-ai-agent-frameworks-demos. There are examples for AutoGen, LangGraph, Llamaindex, OpenAI Agents SDK, OpenAI standard SDK, PydanticAI, Semantic Kernel, and SmolAgents. You can open that repository in GitHub Codespaces, install the packages, and get the examples running immediately. Now let's walk through the API connection code for GitHub Models for each framework. Even if I missed your favorite framework, I hope my tips here will help you connect any framework to GitHub Models. OpenAI I'll start with openai , the package that started it all! import openai client = openai.OpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") The code above demonstrates the two key parameters we'll need to configure for all frameworks: api_key : When using OpenAI.com, you pass your OpenAI API key here. When using GitHub Models, you pass in a Personal Access Token (PAT). If you open the repository (or any repository) in GitHub Codespaces, a PAT is already stored in the GITHUB_TOKEN environment variable. However, if you're working locally with GitHub Models, you'll need to generate a PAT yourself and store it. PATs expire after a while, so you need to generate new PATs every so often. base_url : This parameter tells the OpenAI client to send all requests to "https://models.inference.ai.azure.com" instead of the OpenAI.com API servers. That's the domain that hosts the OpenAI-compatible endpoint for GitHub Models, so you'll always pass that domain as the base URL. If we're working with the new openai-agents SDK, we use very similar code, but we must use the AsyncOpenAI client from openai instead. Lately, Python AI packages are defaulting to async, because it's so much better for performance. import agents import openai client = openai.AsyncOpenAI( base_url="https://models.inference.ai.azure.com", api_key=os.environ["GITHUB_TOKEN"]) model = agents.OpenAIChatCompletionsModel( model="gpt-4o", openai_client=client) spanish_agent = agents.Agent( name="Spanish agent", instructions="You only speak Spanish.", model=model) PydanticAI Now let's look at all of the packages that make it really easy for us, by allowing us to directly bring in an instance of either OpenAI or AsyncOpenAI . For PydanticAI, we configure an AsyncOpenAI client, then construct an OpenAIModel object from PydanticAI, and pass that model to the agent: import openai import pydantic_ai import pydantic_ai.models.openai client = openai.AsyncOpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") model = pydantic_ai.models.openai.OpenAIModel( "gpt-4o", provider=OpenAIProvider(openai_client=client)) spanish_agent = pydantic_ai.Agent( model, system_prompt="You only speak Spanish.") Semantic Kernel For Semantic Kernel, the code is very similar. We configure an AsyncOpenAI client, then construct an OpenAIChatCompletion object from Semantic Kernel, and add that object to the kernel. import openai import semantic_kernel.connectors.ai.open_ai import semantic_kernel.agents chat_client = openai.AsyncOpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") chat = semantic_kernel.connectors.ai.open_ai.OpenAIChatCompletion( ai_model_id="gpt-4o", async_client=chat_client) kernel.add_service(chat) spanish_agent = semantic_kernel.agents.ChatCompletionAgent( kernel=kernel, name="Spanish agent" instructions="You only speak Spanish") AutoGen Next, we'll check out a few frameworks that have their own wrapper of the OpenAI clients, so we won't be using any classes from openai directly. For AutoGen, we configure both the OpenAI parameters and the model name in the same object, then pass that to each agent: import autogen_ext.models.openai import autogen_agentchat.agents client = autogen_ext.models.openai.OpenAIChatCompletionClient( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") spanish_agent = autogen_agentchat.agents.AssistantAgent( "spanish_agent", model_client=client, system_message="You only speak Spanish") LangGraph For LangGraph, we configure a very similar object, which even has the same parameter names: import langchain_openai import langgraph.graph model = langchain_openai.ChatOpenAI( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com", ) def call_model(state): messages = state["messages"] response = model.invoke(messages) return {"messages": [response]} workflow = langgraph.graph.StateGraph(MessagesState) workflow.add_node("agent", call_model) SmolAgents Once again, for SmolAgents, we configure a similar object, though with slightly different parameter names: import smolagents model = smolagents.OpenAIServerModel( model_id="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com") agent = smolagents.CodeAgent(model=model) Llamaindex I saved Llamaindex for last, as it is the most different. The llama-index package has a different constructor for OpenAI.com versus OpenAI-like servers, so I opted to use that OpenAILike constructor instead. However, I also needed an embeddings model for my example, and the package doesn't have an OpenAIEmbeddingsLike constructor, so I used the standard OpenAIEmbedding constructor. import llama_index.embeddings.openai import llama_index.llms.openai_like import llama_index.core.agent.workflow Settings.llm = llama_index.llms.openai_like.OpenAILike( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com", is_chat_model=True) Settings.embed_model = llama_index.embeddings.openai.OpenAIEmbedding( model="text-embedding-3-small", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com") agent = llama_index.core.agent.workflow.ReActAgent( tools=query_engine_tools, llm=Settings.llm) Choose your models wisely! In all of the examples above, I specified the gpt-4o model. The gpt-4o model is a great choice for agents because it supports function calling, and many agent frameworks only work (or work best) with models that natively support function calling. Fortunately, GitHub Models includes multiple models that support function calling, at least in my basic experiments: gpt-4o gpt-4o-mini o3-mini AI21-Jamba-1.5-Large AI21-Jamba-1.5-Mini Codestral-2501 Cohere-command-r Ministral-3B Mistral-Large-2411 Mistral-Nemo Mistral-small You might find that some models work better than others, especially if you're using agents with multiple tools. With GitHub Models, it's very easy to experiment and see for yourself, by simply changing the model name and re-running the code. Join the AI Agents Hackathon We are currently running a free virtual hackathon from April 8th - 30th, to challenge developers to create agentic applications using Microsoft technologies. You could build an agent entirely using GitHub Models and submit it to the hackathon for a chance to win amazing prizes! You can also join our 30+ streams about building AI agents, including a stream all about prototyping with GitHub Models. Learn more and register at https://aka.ms/agentshack2KViews3likes0CommentsAI Agents: Building Trustworthy Agents- Part 6
This blog post, Part 6 in a series on AI agents, focuses on building trustworthy AI agents. It emphasizes the importance of safety and security in agent design and deployment. The post details a system message framework for creating robust and scalable prompts, outlining a four-step process from meta prompt to iterative refinement. It then explores various threats to AI agents, including task manipulation, unauthorized access, resource overloading, knowledge base poisoning, and cascading errors, providing mitigation strategies for each. The post also highlights the human-in-the-loop approach for enhanced trust and control, providing a code example using AutoGen. Finally, it links to further resources on responsible AI, model evaluation, and risk assessment, along with the previous posts in the series.667Views3likes0CommentsThe Launch of "AI Agents for Beginners": Your Gateway to Building Intelligent Systems
🌱 Getting Started Each lesson covers fundamental aspects of building AI Agents. Whether you're a novice or have some experience, you'll find valuable insights and practical knowledge. We also support multiple languages, so you can learn in your preferred language. To see the available languages, click here. If this is your first time working with Generative AI models, we highly recommend our "Generative AI For Beginners" course, which includes 21 lessons on building with GenAI. Remember to star (🌟) this repository and fork it to run the code! 📋 What You Need The course includes code examples that you can find in the code_samples folder. Feel free to fork this repository to create your own copy. The exercises utilize Azure AI Foundry and GitHub Model Catalogs for interacting with Language Models: Github Models - Free / Limited Azure AI Foundry - Azure Account Required We also leverage the following AI Agent frameworks and services from Microsoft: Azure AI Agent Service Semantic Kernel AutoGen For more information on running the code for this course, visit the Course Setup. 🙏 Want to Help? We welcome contributions from the community! If you have suggestions or spot any errors, please raise an issue or create a pull request. If you encounter any difficulties or have questions about building AI Agents, join our Azure AI Community on Discord. 📂 Each Lesson Includes A written lesson located in the README (Videos Coming March 2025) Python code samples supporting Azure AI Foundry and Github Models (Free) Links to extra resources to continue your learning 🗃️ Lessons Overview Intro to AI Agents and Use Cases Exploring Agentic Frameworks Understanding Agentic Design Patterns Tool Use Design Pattern Agentic RAG Building Trustworthy AI Agents Planning Design Pattern Multi-Agent Design Pattern Metacognition Design Pattern AI Agents in Production 🌐 Multi-Language Support We offer translations in several languages and will updating these on a regular basis. 🚀 Go Fork or Clone this repo and get started on your AI Agents journey 🤖 at https://aka.ms/ai-agents-beginners15KViews3likes4CommentsBuilding AI Agent Applications Series - Using AutoGen to build your AI Agents
In the previous content, we learned about AI Agent. If you didn't read it, please read my previous content - Understanding AI Agents. We have many different frameworks to implement AI Agents. AutoGen from Microsoft is a relatively mature AI Agents framework. Now AutoGen is mainly based on two programming languages .NET and Python. The more mature version is the Python version. The content in this article is mainly based on the Python version https://microsoft.github.io/autogen. If you want to learn the .NET version, you can visit here https://microsoft.github.io/autogen-for-net41KViews1like3CommentsMicrosoft Semantic Kernel and AutoGen: Open Source Frameworks for AI Solutions
Explore Microsoft’s open-source frameworks, Semantic Kernel and AutoGen. Semantic Kernel enables developers to create AI solutions across various domains using a single Large Language Model (LLM). AutoGen, on the other hand, uses AI Agents to perform smart tasks through agent dialogues. Discover how these technologies serve different scenarios and can be used to build powerful AI applications.47KViews6likes1CommentDocAider: Automated Documentation Maintenance for Open-source GitHub Repositories
Code–level documentation of a software system provides explanations of the code functionality and usages. Documentation is crucial for giving clear insights into the code for end–users and future developers. However, creating and updating documentation manually is a demanding task, requiring significant resources and labour. With the advancement of generative AI, there is a potential to reduce human labour in documentation tasks significantly. We propose DocAider, an automation tool powered by GPT–4 that integrates the processes of documentation generation and update. DocAider can generate comprehensive and structured documentation in markdown format and update it in response to any changes made in pull requests. The mission of DocAider is to reduce developers’ burden on maintaining documentation for GitHub repositories.5.1KViews2likes0CommentsBuilding AI Agent Applications Series - Assembling your AI agent with the Semantic Kernel
In the previous series of articles, we learned about the basic concepts of AI agents and how to use AutoGen or Semantic Kernel combined with the Azure OpenAI Service Assistant API to build AI agent applications. For different scenarios and workflows, powerful tools need to be assembled to support the operation of the AI agent. If you only use your own tool chain in the AI agent framework to solve enterprise workflow, it will be very limited. AutoGen supports defining tool chains through Function Calling, and developers can define different methods to assemble extended business work chains. As mentioned before, Semantic Kernel has good business-based plug-in creation, management and engineering capabilities. Through AutoGen + Semantic Kernel, powerful AI agent solutions can be built.5.3KViews1like1Comment