ai
17 TopicsHow to use any Python AI agent framework with free GitHub Models
I ❤️ when companies offer free tiers for developer services, since it gives everyone a way to learn new technologies without breaking the bank. Free tiers are especially important for students and people between jobs, when the desire to learn is high but the available cash is low. That's why I'm such a fan of GitHub Models: free, high-quality generative AI models available to anyone with a GitHub account. The available models include the latest OpenAI LLMs (like o3-mini), LLMs from the research community (like Phi and Llama), LLMs from other popular providers (like Mistral and Jamba), multimodal models (like gpt-4o and llama-vision-instruct) and even a few embedding models (from OpenAI and Cohere). With access to such a range of models, you can prototype complex multi-model workflows to improve your productivity or heck, just make something fun for yourself. 🤗 To use GitHub Models, you can start off in no-code mode: open the playground for a model, send a few requests, tweak the parameters, and check out the answers. When you're ready to write code, select "Use this model". A screen will pop up where you can select a programming language (Python/JavaScript/C#/Java/REST) and select an SDK (which varies depending on model). Then you'll get instructions and code for that model, language, and SDK. But here's what's really cool about GitHub Models: you can use them with all the popular Python AI frameworks, even if the framework has no specific integration with GitHub Models. How is that possible? The vast majority of Python AI frameworks support the OpenAI Chat Completions API, since that API became a defacto standard supported by many LLM API providers besides OpenAI itself. GitHub Models also provide OpenAI-compatible endpoints for chat completion models. Therefore, any Python AI framework that supports OpenAI-like models can be used with GitHub Models as well. 🎉 To prove it, I've made a new repository with examples from eight different Python AI agent packages, all working with GitHub Models: python-ai-agent-frameworks-demos. There are examples for AutoGen, LangGraph, Llamaindex, OpenAI Agents SDK, OpenAI standard SDK, PydanticAI, Semantic Kernel, and SmolAgents. You can open that repository in GitHub Codespaces, install the packages, and get the examples running immediately. Now let's walk through the API connection code for GitHub Models for each framework. Even if I missed your favorite framework, I hope my tips here will help you connect any framework to GitHub Models. OpenAI I'll start with openai , the package that started it all! import openai client = openai.OpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") The code above demonstrates the two key parameters we'll need to configure for all frameworks: api_key : When using OpenAI.com, you pass your OpenAI API key here. When using GitHub Models, you pass in a Personal Access Token (PAT). If you open the repository (or any repository) in GitHub Codespaces, a PAT is already stored in the GITHUB_TOKEN environment variable. However, if you're working locally with GitHub Models, you'll need to generate a PAT yourself and store it. PATs expire after a while, so you need to generate new PATs every so often. base_url : This parameter tells the OpenAI client to send all requests to "https://models.inference.ai.azure.com" instead of the OpenAI.com API servers. That's the domain that hosts the OpenAI-compatible endpoint for GitHub Models, so you'll always pass that domain as the base URL. If we're working with the new openai-agents SDK, we use very similar code, but we must use the AsyncOpenAI client from openai instead. Lately, Python AI packages are defaulting to async, because it's so much better for performance. import agents import openai client = openai.AsyncOpenAI( base_url="https://models.inference.ai.azure.com", api_key=os.environ["GITHUB_TOKEN"]) model = agents.OpenAIChatCompletionsModel( model="gpt-4o", openai_client=client) spanish_agent = agents.Agent( name="Spanish agent", instructions="You only speak Spanish.", model=model) PydanticAI Now let's look at all of the packages that make it really easy for us, by allowing us to directly bring in an instance of either OpenAI or AsyncOpenAI . For PydanticAI, we configure an AsyncOpenAI client, then construct an OpenAIModel object from PydanticAI, and pass that model to the agent: import openai import pydantic_ai import pydantic_ai.models.openai client = openai.AsyncOpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") model = pydantic_ai.models.openai.OpenAIModel( "gpt-4o", provider=OpenAIProvider(openai_client=client)) spanish_agent = pydantic_ai.Agent( model, system_prompt="You only speak Spanish.") Semantic Kernel For Semantic Kernel, the code is very similar. We configure an AsyncOpenAI client, then construct an OpenAIChatCompletion object from Semantic Kernel, and add that object to the kernel. import openai import semantic_kernel.connectors.ai.open_ai import semantic_kernel.agents chat_client = openai.AsyncOpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") chat = semantic_kernel.connectors.ai.open_ai.OpenAIChatCompletion( ai_model_id="gpt-4o", async_client=chat_client) kernel.add_service(chat) spanish_agent = semantic_kernel.agents.ChatCompletionAgent( kernel=kernel, name="Spanish agent" instructions="You only speak Spanish") AutoGen Next, we'll check out a few frameworks that have their own wrapper of the OpenAI clients, so we won't be using any classes from openai directly. For AutoGen, we configure both the OpenAI parameters and the model name in the same object, then pass that to each agent: import autogen_ext.models.openai import autogen_agentchat.agents client = autogen_ext.models.openai.OpenAIChatCompletionClient( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") spanish_agent = autogen_agentchat.agents.AssistantAgent( "spanish_agent", model_client=client, system_message="You only speak Spanish") LangGraph For LangGraph, we configure a very similar object, which even has the same parameter names: import langchain_openai import langgraph.graph model = langchain_openai.ChatOpenAI( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com", ) def call_model(state): messages = state["messages"] response = model.invoke(messages) return {"messages": [response]} workflow = langgraph.graph.StateGraph(MessagesState) workflow.add_node("agent", call_model) SmolAgents Once again, for SmolAgents, we configure a similar object, though with slightly different parameter names: import smolagents model = smolagents.OpenAIServerModel( model_id="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com") agent = smolagents.CodeAgent(model=model) Llamaindex I saved Llamaindex for last, as it is the most different. The llama-index package has a different constructor for OpenAI.com versus OpenAI-like servers, so I opted to use that OpenAILike constructor instead. However, I also needed an embeddings model for my example, and the package doesn't have an OpenAIEmbeddingsLike constructor, so I used the standard OpenAIEmbedding constructor. import llama_index.embeddings.openai import llama_index.llms.openai_like import llama_index.core.agent.workflow Settings.llm = llama_index.llms.openai_like.OpenAILike( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com", is_chat_model=True) Settings.embed_model = llama_index.embeddings.openai.OpenAIEmbedding( model="text-embedding-3-small", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com") agent = llama_index.core.agent.workflow.ReActAgent( tools=query_engine_tools, llm=Settings.llm) Choose your models wisely! In all of the examples above, I specified the gpt-4o model. The gpt-4o model is a great choice for agents because it supports function calling, and many agent frameworks only work (or work best) with models that natively support function calling. Fortunately, GitHub Models includes multiple models that support function calling, at least in my basic experiments: gpt-4o gpt-4o-mini o3-mini AI21-Jamba-1.5-Large AI21-Jamba-1.5-Mini Codestral-2501 Cohere-command-r Ministral-3B Mistral-Large-2411 Mistral-Nemo Mistral-small You might find that some models work better than others, especially if you're using agents with multiple tools. With GitHub Models, it's very easy to experiment and see for yourself, by simply changing the model name and re-running the code. Join the AI Agents Hackathon We are currently running a free virtual hackathon from April 8th - 30th, to challenge developers to create agentic applications using Microsoft technologies. You could build an agent entirely using GitHub Models and submit it to the hackathon for a chance to win amazing prizes! You can also join our 30+ streams about building AI agents, including a stream all about prototyping with GitHub Models. Learn more and register at https://aka.ms/agentshack1.7KViews3likes0CommentsAI Agents: Building Trustworthy Agents- Part 6
This blog post, Part 6 in a series on AI agents, focuses on building trustworthy AI agents. It emphasizes the importance of safety and security in agent design and deployment. The post details a system message framework for creating robust and scalable prompts, outlining a four-step process from meta prompt to iterative refinement. It then explores various threats to AI agents, including task manipulation, unauthorized access, resource overloading, knowledge base poisoning, and cascading errors, providing mitigation strategies for each. The post also highlights the human-in-the-loop approach for enhanced trust and control, providing a code example using AutoGen. Finally, it links to further resources on responsible AI, model evaluation, and risk assessment, along with the previous posts in the series.607Views3likes0CommentsFix Broken Migrations with AI Powered Debugging in VS Code Using GitHub Copilot
Data is at the heart of every application. But evolving your schema is risky business. One broken migration, and your dev or prod environment can go down. We've all experienced it: mismatched columns, orphaned constraints, missing fields, or that dreaded "table already exists" error. But what if debugging migrations didn’t have to be painful? What if you could simply describe the error or broken state, and AI could fix your migration in seconds? In this blog, you’ll learn how to: Use GitHub Copilot to describe and fix broken migrations with natural language Catch schema issues like incorrect foreign keys before they block your workflow Validate and deploy your database changes using GibsonAI CLI Broken migrations are nothing new. Whether you're working on a side project or part of a large team, it’s all too easy to introduce schema issues that can block deployments or corrupt local environments. Traditionally, fixing them means scanning SQL files, reading error logs, and manually tracking down what went wrong. But what if you could skip all that? What if you could simply describe the issue in plain English and AI would fix it for you? That’s exactly what GitHub Copilot let you do, right from within VS Code. What You Need: Visual Studio Code Installed Account in GitHub Sign up with GitHub Copilot GibsonAI CLI installed and logged in Let’s Break (and Fix) a Migration: Here’s a common mistake. Say you create two tables: users and posts. CREATE TABLE users ( id UUID PRIMARY KEY, name TEXT, email TEXT UNIQUE ); CREATE TABLE posts ( id UUID PRIMARY KEY, title TEXT, user_id UUID REFERENCES user(id) ); The problem? The posts table refers to a table called user, but you named it users. This one-word mistake breaks the migration. If you've worked with relational databases, you’ve probably run into this exact thing. Just Ask a GitHub Copilot: Instead of troubleshooting manually, open Copilot Chat and ask: “My migration fails because posts.user_id references a missing user table. Can you fix the foreign key?” Copilot understands what you're asking. It reads the context and suggests the fix: CREATE TABLE posts ( id UUID PRIMARY KEY, title TEXT, user_id UUID REFERENCES users(id) ); It even explains what changed, so you learn along the way. Wait — how does Copilot know what I mean? GitHub Copilot is smart enough to understand your code, your errors, and even what you’re asking in plain English. It doesn’t directly connect to GibsonAI. You’ll use the GibsonAI CLI for that, but Copilot helps you figure things out and fix your code faster. Validating with GibsonAI Once Copilot gives you the fixed migration, it’s time to test it. Run: gibson validate This checks your migration and schema consistency. When you're ready to apply it, just run: gibson deploy GibsonAI handles the rest so no broken chains, no surprises. Why This Works Manual debugging of migrations is frustrating and error prone. GibsonAI with GitHub Copilot: Eliminates guesswork in debugging You don’t need to Google every error Reduces time to fix production schema issues You stay in one tool: VS Code You learn while debugging Whether you're a student learning SQL or a developer on a fast moving team, this setup helps you recover faster and ship safer. Fixing migrations used to be all trial and error, digging through files and hoping nothing broke. It was time-consuming and stressful. Now with GitHub Copilot and GibsonAI, fixing issues is fast and simple. Copilot helps you write and correct migrations. GibsonAI lets you validate and deploy with confidence. So next time your migration fails, don’t panic. Just describe the issue to GitHub Copilot, run a quick check with GibsonAI, and get back to building. Ready to try it yourself? Sign up atgibsonai.com Want to Go Further? If you’re ready to explore more powerful workflows with GibsonAI, here are two great next steps: GibsonAI MCP Server – Enable Copilot Agent Mode to integrate schema intelligence directly into your dev environment. Automatic PR Creation for Schema Changes – The in-depth guide on how to automate pull requests for database updates using GibsonAI. Want to Know More About GitHub Copilot? Explore these resources to get the most out of Copilot: Get Started with GitHub Copilot Introduction to prompt engineering with GitHub Copilot GitHub Copilot Agent Mode GitHub Copilot Customization Use GitHub Copilot Agent Mode to create a Copilot Chat application in 5 minutes Deploy Your First App Using GitHub Copilot for Azure: A Beginner’s Guide That's it, folks! But the best part? You can become part of a thriving community of learners and builders by joining the Microsoft Student Ambassadors Community. Connect with like minded individuals, explore hands-on projects, and stay updated with the latest in cloud and AI. 💬 Join the community on Discord here and explore more benefits on the Microsoft Learn Student Hub.141Views2likes2CommentsCampusSphere: Building the Future of Campus AI with Microsoft's Agentic Framework
Project Overview We are a team of Imperial College Students committed to improving campus life through innovative multi-agent solutions. CampusSphere leverages Microsoft Azure AI capabilities to automate core university campus services. We created an end-to-end solution that allows both students and staff to access a multi-agent framework for room/gym booking, attendance tracking, calendar management, IoT monitoring and more. 🔭 Our Initial Vision: Reimagining Campus Technology When our team at Imperial College London embarked on the CampusSphere project as part of Microsoft's Agentic Campus initiative, we had one clear ambition: to create an intelligent campus ecosystem that would fundamentally change how students, faculty, and staff interact with university services. The inspiration came from a simple observation—despite living in an age of advanced AI, campus technology remained frustratingly fragmented. Students juggled multiple portals for course registration, room booking, dining services, and academic support. Faculty members navigated separate systems for teaching, research, and administrative tasks. The result? Countless hours wasted on mundane navigation tasks that could be better spent on learning, teaching, and innovation. Our vision was ambitious: create a single, intelligent interface that could understand natural language, anticipate user needs, and seamlessly integrate with existing campus infrastructure. We didn't just want to build another campus app—we wanted to demonstrate how Microsoft's agentic AI technologies could create a truly intelligent campus companion. 🧠 Enter CampusSphere CampusSphere is an intelligent campus assistant made up of multiple AI agents, each with a specific domain of expertise — all communicating seamlessly through a centralized architecture. Think of it as a digital concierge for campus life, where your calendar, attendance, IoT data, and facility bookings are coordinated by specialized GPT-powered agents. Here’s what we built: TriageAgent – the brain of the system, using Retrieval-Augmented Generation (RAG) to understand user intent CalendarAgent – handles scheduling, bookings, and reminders AttendanceAgent – tracks check-ins automatically IoTAgent – monitors real-time sensor data from classrooms and labs GymAgent – manages access and reservations for sports facilities 30+ MCP Tools – perform SQL queries, scrape web data, and connect with external APIs All of this is built on Microsoft Azure AI, Semantic Kernel, and Model Context Protocol (MCP) — making it scalable, secure, and lightning fast. 🖥️ The Tech Stack Our Azure-powered architecture showcases a modular and scalable approach to real-time data processing and intelligent agent coordination. The frontend is built using React with a Vite development server, providing a fast and responsive user interface. When users submit a prompt, it travels to a Flask backend server acting as the Triage agent, which intelligently delegates tasks to a FastAPI agent service. This FastAPI service asynchronously communicates with individual agents and handles responses efficiently. Complex queries are routed to MCP Tools, which interact with the CosmosDB-powered Campus Database. Simultaneously, real-time synthetic IoT data is pushed into the database via Azure Function Apps and Azure IoT Hub. Authentication is securely managed: users log in through the frontend, receive a token from the database API server, and use it for authorized access to MCP services, with permissions enforced based on user roles using our custom MCP server implementation. This robust architecture enables seamless integration, real-time data flow, and secure multi-agent collaboration across Azure services. Our system leverages a multi-agent architecture designed to intelligently coordinate task execution across specialized services. At the core is the TriageAgent, which uses Retrieval-Augmented Generation (RAG) to interpret user prompts, enrich them with relevant context, and determine the optimal response path. Based on the nature of the request, it may handle the response directly, seek clarification, or delegate tasks to specific agents via FastAPI. Each specialized agent has a clearly defined role: AttendanceAgent: Interfaces with CosmosDB-backed FastAPI endpoints to check student attendance, using filters like event name, student ID, or date. IoTAgent: Monitors room conditions (e.g., temperature, CO₂ levels) and flags anomalies using real-time data from Azure IoT Hub, processed via FastAPI. CalendarAgent: Handles scheduling, availability checks, and event creation by querying or updating CosmosDB through FastAPI. Future integration with Microsoft Graph API is planned for direct calendar syncing. Gym Slot Agent: Checks available times for gym sessions using dedicated MCP tools. The triage agent serves as the orchestrator, breaking down complex requests (like "Book a gym session") into subtasks. It consults relevant agents (e.g., calendar and gym slot agents), merges results, and then confirms the final action with the user. This distributed and asynchronous workflow reduces backend load and enhances both responsiveness and reliability of the system. 🔮 What’s Next? Integrating CampusSphere with live systems via Microsoft OAuth is crucial for enhancing its capabilities. This integration will grant the agent authenticated access to a wider range of student data, moving beyond synthetic datasets. This expanded access to real-world information will enable deeply personalized advice, such as tailored course selection, scholarship recommendations, event suggestions, and deadline reminders, transforming CampusSphere into a sophisticated, proactive personal assistant. 🤝Meet the Team Behind CampusSphere Our success stemmed from a diverse team of innovators who brought together expertise from multiple domains: Benny Liu - https://www.linkedin.com/in/zong-benny-liu-393a4621b/ Lucas Ng - https://www.linkedin.com/in/lucas-ng-11b317203/ Lu Ju - https://www.linkedin.com/in/lu-ju/ Bruno Duaso - https://www.linkedin.com/in/bruno-duaso-jimeno-744464262/ Martim Coutinho - https://www.linkedin.com/in/martim-pereira-coutinho-116308233/ Krischad Pourpongpan - https://www.linkedin.com/in/krischadpua/ Yixu Pan - https://www.linkedin.com/in/yixu-pan/ Our collaborative approach enabled us to create a sophisticated agentic AI system that demonstrates the powerful potential of Microsoft's AI technologies in educational environments. 🧑💻 Project Repository: GitHub - Imperial-Microsoft-Agentic-Campus/CampusSphere Contribute to Imperial-Microsoft-Agentic-Campus/CampusSphere development by creating an account on GitHub. github.com Have questions about implementing similar solutions at your institution? Connect with our team members on LinkedIn—we're always excited to share knowledge and collaborate on innovative campus technology projects. 📚Get Started with Microsoft's AI Tools Ready to explore the technologies that made CampusSphere possible? Here are essential resources: Microsoft Semantic Kernel: The core framework for building AI agent orchestration systems. Learn how to create, coordinate, and manage multiple AI agents working together seamlessly. AI Agents for Beginners: A comprehensive guide to understanding and building AI agents from the ground up. Perfect for getting started with agentic AI development. Model Context Protocol (MCP): Learn about the protocol that enables secure connections between AI models and external tools and services—essential for building integrated AI systems. Windows AI Toolkit: Microsoft's toolkit for developing AI applications on Windows, providing local AI model development capabilities and deployment tools. Azure Container Apps: Understand how to deploy and scale containerized AI applications in the cloud, perfect for hosting multi-agent systems. Azure Cosmos DB Security: Essential security practices for managing data in AI applications, covering encryption, access control, and compliance.333Views2likes0CommentsLearn How to Build Smarter AI Agents with Microsoft’s MCP Resources Hub
If you've been curious about how to build your own AI agents that can talk to APIs, connect with tools like databases, or even follow documentation you're in the right place. Microsoft has created something called MCP, which stands for Model‑Context‑Protocol. And to help you learn it step by step, they’ve made an amazing MCP Resources Hub on GitHub. In this blog, I’ll Walk you through what MCP is, why it matters, and how to use this hub to get started, even if you're new to AI development. What is MCP (Model‑Context‑Protocol)? Think of MCP like a communication bridge between your AI model and the outside world. Normally, when we chat with AI (like ChatGPT), it only knows what’s in its training data. But with MCP, you can give your AI real-time context from: APIs Documents Databases Websites This makes your AI agent smarter and more useful just like a real developer who looks up things online, checks documentation, and queries databases. What’s Inside the MCP Resources Hub? The MCP Resources Hub is a collection of everything you need to learn MCP: Videos Blogs Code examples Here are some beginner-friendly videos that explain MCP: Title What You'll Learn VS Code Agent Mode Just Changed Everything See how VS Code and MCP build an app with AI connecting to a database and following docs. The Future of AI in VS Code Learn how MCP makes GitHub Copilot smarter with real-time tools. Build MCP Servers using Azure Functions Host your own MCP servers using Azure in C#, .NET, or TypeScript. Use APIs as Tools with MCP See how to use APIs as tools inside your AI agent. Blazor Chat App with MCP + Aspire Create a chat app powered by MCP in .NET Aspire Tip: Start with the VS Code videos if you’re just beginning. Blogs Deep Dives and How-To Guides Microsoft has also written blogs that explain MCP concepts in detail. Some of the best ones include: Build AI agent tools using remote MCP with Azure Functions: Learn how to deploy MCP servers remotely using Azure. Create an MCP Server with Azure AI Agent Service : Enables Developers to create an agent with Azure AI Agent Service and uses the model context protocol (MCP) for consumption of the agents in compatible clients (VS Code, Cursor, Claude Desktop). Vibe coding with GitHub Copilot: Agent mode and MCP support: MCP allows you to equip agent mode with the context and capabilities it needs to help you, like a USB port for intelligence. When you enter a chat prompt in agent mode within VS Code, the model can use different tools to handle tasks like understanding database schema or querying the web. Enhancing AI Integrations with MCP and Azure API Management Enhance AI integrations using MCP and Azure API Management Understanding and Mitigating Security Risks in MCP Implementations Overview of security risks and mitigation strategies for MCP implementations Protecting Against Indirect Injection Attacks in MCP Strategies to prevent indirect injection attacks in MCP implementations Microsoft Copilot Studio MCP Announcement of the Microsoft Copilot Studio MCP lab Getting started with MCP for Beginners 9 part course on MCP Client and Servers Code Repositories Try it Yourself Want to build something with MCP? Microsoft has shared open-source sample code in Python, .NET, and TypeScript: Repo Name Language Description Azure-Samples/remote-mcp-apim-functions-python Python Recommended for Secure remote hosting Sample Python Azure Functions demonstrating remote MCP integration with Azure API Management Azure-Samples/remote-mcp-functions-python Python Sample Python Azure Functions demonstrating remote MCP integration Azure-Samples/remote-mcp-functions-dotnet C# Sample .NET Azure Functions demonstrating remote MCP integration Azure-Samples/remote-mcp-functions-typescript TypeScript Sample TypeScript Azure Functions demonstrating remote MCP integration Microsoft Copilot Studio MCP TypeScript Microsoft Copilot Studio MCP lab You can clone the repo, open it in VS Code, and follow the instructions to run your own MCP server. Using MCP with the AI Toolkit in Visual Studio Code To make your MCP journey even easier, Microsoft provides the AI Toolkit for Visual Studio Code. This toolkit includes: A built-in model catalog Tools to help you deploy and run models locally Seamless integration with MCP agent tools You can install the AI Toolkit extension from the Visual Studio Code Marketplace. Once installed, it helps you: Discover and select models quickly Connect those models to MCP agents Develop and test AI workflows locally before deploying to the cloud You can explore the full documentation here: Overview of the AI Toolkit for Visual Studio Code – Microsoft Learn This is perfect for developers who want to test things on their own system without needing a cloud setup right away. Why Should You Care About MCP? Because MCP: Makes your AI tools more powerful by giving them real-time knowledge Works with GitHub Copilot, Azure, and VS Code tools you may already use Is open-source and beginner-friendly with lots of tutorials and sample code It’s the future of AI development connecting models to the real world. Final Thoughts If you're learning AI or building software agents, don’t miss this valuable MCP Resources Hub. It’s like a starter kit for building smart, connected agents with Microsoft tools. Try one video or repo today. Experiment. Learn by doing and start your journey with the MCP for Beginners curricula.2.7KViews2likes2CommentsCombating Digitally Altered Images: Deepfake Detection
In today's digital age, the rise of deepfake technology poses significant threats to credibility, privacy, and security. This article delves into our Deepfake Detection Project, a robust solution designed to combat the misuse of AI-generated content.854Views2likes0CommentsAutomating PowerPoint Generation with AI: A Learn Live Series Case Study
Introduction A Learn Live is a series of events where over a period of 45 to 60 minutes, a presenter walks attendees through a learning module or pathway. The show/series, takes you through a Microsoft Learn Module, Challenge or a particular sample. Between April 15 to May 13, we will be hosting a Learn Live series on "Master the Skills to Create AI Agents." This premise is necessary for the blog because I was tasked with generating slides for the different presenters. Challenge: generation of the slides The series is based on the learning path: Develop AI agents on Azure and each session tackles one of the learn modules in the path. In addition, Learn Live series usually have a presentation template each speaker is provided with to help run their sessions. Each session has the same format as the learn modules: an introduction, lesson content, an exercise (demo), knowledge check and summary of the module. As the content is already there and the presentation template is provided, it felt repetitive to do create the slides one by one. And that's where AI comes in - automating slide generation for Learn Live modules. Step 1 - Gathering modules data The first step was ensuring I had the data for the learn modules, which involved collecting all the necessary information from the learning path and organizing it in a way that can be easily processed by AI. The learn modules repo is private and I have access to the repo, but I wanted to build a solution that can be used externally as well. So instead of getting the data from the repository, I decided to scrape the learn modules using BeautifulSoup into a word document. I created a python script to extract the data, and it works as follows: Retrieving the HTML – It sends HTTP requests to the start page and each unit page. Parsing Content – Using BeautifulSoup, it extracts elements (headings, paragraphs, lists, etc.) from the page’s main content. Populating a Document – With python-docx, it creates and formats a Word document, adding the scraped content. Handling Duplicates – It ensures unique unit page links by removing duplicates. Polite Scraping – A short delay (using time.sleep) is added between requests to avoid overloading the server. First, I installed the necessary libraries using: pip install requests beautifulsoup4 python-docx. Next, I ran the script below, converting the units of the learn modules to a word document: import requests from bs4 import BeautifulSoup from docx import Document from urllib.parse import urljoin import time headers = {"User-Agent": "Mozilla/5.0"} base_url = "https://learn.microsoft.com/en-us/training/modules/orchestrate-semantic-kernel-multi-agent-solution/" def get_soup(url): response = requests.get(url, headers=headers) return BeautifulSoup(response.content, "html.parser") def extract_module_unit_links(start_url): soup = get_soup(start_url) nav_section = soup.find("ul", {"id": "unit-list"}) if not nav_section: print("❌ Could not find unit navigation.") return [] links = [] for a in nav_section.find_all("a", href=True): href = a["href"] full_url = urljoin(base_url, href) links.append(full_url) return list(dict.fromkeys(links)) # remove duplicates while preserving order def extract_content(soup, doc): main_content = soup.find("main") if not main_content: return for tag in main_content.find_all(["h1", "h2", "h3", "p", "li", "pre", "code"]): text = tag.get_text().strip() if not text: continue if tag.name == "h1": doc.add_heading(text, level=1) elif tag.name == "h2": doc.add_heading(text, level=2) elif tag.name == "h3": doc.add_heading(text, level=3) elif tag.name == "p": doc.add_paragraph(text) elif tag.name == "li": doc.add_paragraph(f"• {text}", style='ListBullet') elif tag.name in ["pre", "code"]: doc.add_paragraph(text, style='Intense Quote') def scrape_full_module(start_url, output_filename="Learn_Module.docx"): doc = Document() # Scrape and add the content from the start page print(f"📄 Scraping start page: {start_url}") start_soup = get_soup(start_url) extract_content(start_soup, doc) all_unit_links = extract_module_unit_links(start_url) if not all_unit_links: print("❌ No unit links found. Exiting.") return print(f"🔗 Found {len(all_unit_links)} unit pages.") for i, url in enumerate(all_unit_links, start=1): print(f"📄 Scraping page {i}: {url}") soup = get_soup(url) extract_content(soup, doc) time.sleep(1) # polite delay doc.save(output_filename) print(f"\n✅ Saved module to: {output_filename}") # 🟡 Replace this with any Learn module start page start_page = "https://learn.microsoft.com/en-us/training/modules/orchestrate-semantic-kernel-multi-agent-solution/" scrape_full_module(start_page, "Orchestrate with SK.docx") Step 2 - Utilizing Microsoft Copilot in PowerPoint To automate the slide generation, I used Microsoft Copilot in PowerPoint. This tool leverages AI to create slides based on the provided data. It simplifies the process and ensures consistency across all presentations. As I already had the slide template, I created a new presentation based on the template. Next, I used copilot in PowerPoint to generate the slides based on the presentation. How did I achieve this? I uploaded the word document generated from the learn modules to OneDrive In PowerPoint, I went over to Copilot and selected ```view prompts```, and selected the prompt: create presentations Next, I added the prompt below and the word document to generate the slides from the file. Create a set of slides based on the content of the document titled "Orchestrate with SK". The slides should cover the following sections: • Introduction • Understand the Semantic Kernel Agent Framework • Design an agent selection strategy • Define a chat termination strategy • Exercise - Develop a multi-agent solution • Knowledge check • Summary Slide Layout: Use the custom color scheme and layout provided in the template. Use Segoe UI family fonts for text and Consolas for code. Include visual elements such as images, charts, and abstract shapes where appropriate. Highlight key points and takeaways. Step 3 - Evaluating and Finalizing Slides Once the slides are generated, if you are happy with how they look, select keep it. The slides were generated based on the sessions I selected and had all the information needed. The next step was to evaluate the generated slides, add the Learn Live introduction, knowledge check and conclusion. The goal is to create high-quality presentations that effectively convey the learning content. What more can you do with Copilot in PowerPoint? Add speaker notes to the slides Use agents within PowerPoint to streamline your workflow. Create your own custom prompts for future use cases Summary - AI for automation In summary, using AI for slide generation can significantly streamline the process and save time. I was able to automate my work and only come in as a reviewer. The script and PowerPoint generation all took about 10 minutes, something that would have previously taken me an hour and I only needed to counter review based on the learn modules. It allowed for the creation of consistent and high-quality presentations, making it easier for presenters to focus on delivering the content. Now, my question to you is, how can you use AI in your day to day and automate any repetitive tasks?912Views1like0Comments