# Combating Digitally Altered Images: Deepfake Detection
In today's digital age, the rise of deepfake technology poses significant threats to credibility, privacy, and security. This article delves into our Deepfake Detection Project, a robust solution designed to combat the misuse of AI-generated content.

# How to use any Python AI agent framework with free GitHub Models
I ❤️ when companies offer free tiers for developer services, since it gives everyone a way to learn new technologies without breaking the bank. Free tiers are especially important for students and people between jobs, when the desire to learn is high but the available cash is low.

That's why I'm such a fan of GitHub Models: free, high-quality generative AI models available to anyone with a GitHub account. The available models include the latest OpenAI LLMs (like o3-mini), LLMs from the research community (like Phi and Llama), LLMs from other popular providers (like Mistral and Jamba), multimodal models (like gpt-4o and llama-vision-instruct), and even a few embedding models (from OpenAI and Cohere). With access to such a range of models, you can prototype complex multi-model workflows to improve your productivity or, heck, just make something fun for yourself. 🤗

To use GitHub Models, you can start off in no-code mode: open the playground for a model, send a few requests, tweak the parameters, and check out the answers. When you're ready to write code, select "Use this model". A screen will pop up where you can select a programming language (Python/JavaScript/C#/Java/REST) and an SDK (which varies depending on the model). Then you'll get instructions and code for that model, language, and SDK.

But here's what's really cool about GitHub Models: you can use them with all the popular Python AI frameworks, even if the framework has no specific integration with GitHub Models. How is that possible? The vast majority of Python AI frameworks support the OpenAI Chat Completions API, since that API became a de facto standard supported by many LLM API providers besides OpenAI itself. GitHub Models also provides OpenAI-compatible endpoints for chat completion models. Therefore, any Python AI framework that supports OpenAI-like models can be used with GitHub Models as well. 🎉

To prove it, I've made a new repository with examples from eight different Python AI agent packages, all working with GitHub Models: python-ai-agent-frameworks-demos. There are examples for AutoGen, LangGraph, Llamaindex, OpenAI Agents SDK, OpenAI standard SDK, PydanticAI, Semantic Kernel, and SmolAgents. You can open that repository in GitHub Codespaces, install the packages, and get the examples running immediately.

Now let's walk through the API connection code for GitHub Models for each framework. Even if I missed your favorite framework, I hope my tips here will help you connect any framework to GitHub Models.

## OpenAI

I'll start with `openai`, the package that started it all!

```python
import os

import openai

client = openai.OpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")
```

The code above demonstrates the two key parameters we'll need to configure for all frameworks:

- `api_key`: When using OpenAI.com, you pass your OpenAI API key here. When using GitHub Models, you pass in a Personal Access Token (PAT). If you open the repository (or any repository) in GitHub Codespaces, a PAT is already stored in the `GITHUB_TOKEN` environment variable. However, if you're working locally with GitHub Models, you'll need to generate a PAT yourself and store it. PATs expire after a while, so you need to generate new PATs every so often.
- `base_url`: This parameter tells the OpenAI client to send all requests to "https://models.inference.ai.azure.com" instead of the OpenAI.com API servers. That's the domain that hosts the OpenAI-compatible endpoint for GitHub Models, so you'll always pass that domain as the base URL.
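With both parameters set, the client works like any other OpenAI client, so a quick test request might look something like this (the prompt is just an arbitrary example of mine):

```python
# Send a test chat completion through the GitHub Models endpoint.
response = client.chat.completions.create(
    model="gpt-4o",  # any chat model hosted by GitHub Models
    messages=[{"role": "user", "content": "Say hello in Spanish."}])

print(response.choices[0].message.content)
```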
## OpenAI Agents SDK

If we're working with the new openai-agents SDK, we use very similar code, but we must use the `AsyncOpenAI` client from `openai` instead. Lately, Python AI packages are defaulting to async, because it's so much better for performance.

```python
import os

import agents
import openai

client = openai.AsyncOpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key=os.environ["GITHUB_TOKEN"])

model = agents.OpenAIChatCompletionsModel(
    model="gpt-4o",
    openai_client=client)

spanish_agent = agents.Agent(
    name="Spanish agent",
    instructions="You only speak Spanish.",
    model=model)
```

## PydanticAI

Now let's look at all of the packages that make it really easy for us, by allowing us to directly bring in an instance of either `OpenAI` or `AsyncOpenAI`. For PydanticAI, we configure an `AsyncOpenAI` client, then construct an `OpenAIModel` object from PydanticAI, and pass that model to the agent:

```python
import os

import openai
import pydantic_ai
import pydantic_ai.models.openai
from pydantic_ai.providers.openai import OpenAIProvider

client = openai.AsyncOpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

model = pydantic_ai.models.openai.OpenAIModel(
    "gpt-4o",
    provider=OpenAIProvider(openai_client=client))

spanish_agent = pydantic_ai.Agent(
    model,
    system_prompt="You only speak Spanish.")
```

## Semantic Kernel

For Semantic Kernel, the code is very similar. We configure an `AsyncOpenAI` client, then construct an `OpenAIChatCompletion` object from Semantic Kernel, and add that object to the kernel.

```python
import os

import openai
import semantic_kernel
import semantic_kernel.agents
import semantic_kernel.connectors.ai.open_ai

chat_client = openai.AsyncOpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

chat = semantic_kernel.connectors.ai.open_ai.OpenAIChatCompletion(
    ai_model_id="gpt-4o",
    async_client=chat_client)

# Create the kernel and register the chat service with it
kernel = semantic_kernel.Kernel()
kernel.add_service(chat)

spanish_agent = semantic_kernel.agents.ChatCompletionAgent(
    kernel=kernel,
    name="Spanish agent",
    instructions="You only speak Spanish")
```

## AutoGen

Next, we'll check out a few frameworks that have their own wrapper of the OpenAI clients, so we won't be using any classes from `openai` directly. For AutoGen, we configure both the OpenAI parameters and the model name in the same object, then pass that to each agent:

```python
import os

import autogen_agentchat.agents
import autogen_ext.models.openai

client = autogen_ext.models.openai.OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

spanish_agent = autogen_agentchat.agents.AssistantAgent(
    "spanish_agent",
    model_client=client,
    system_message="You only speak Spanish")
```

## LangGraph

For LangGraph, we configure a very similar object, which even has the same parameter names:

```python
import os

import langchain_openai
import langgraph.graph

model = langchain_openai.ChatOpenAI(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com",
)

def call_model(state):
    messages = state["messages"]
    response = model.invoke(messages)
    return {"messages": [response]}

workflow = langgraph.graph.StateGraph(langgraph.graph.MessagesState)
workflow.add_node("agent", call_model)
```

## SmolAgents

Once again, for SmolAgents, we configure a similar object, though with slightly different parameter names:

```python
import os

import smolagents

model = smolagents.OpenAIServerModel(
    model_id="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    api_base="https://models.inference.ai.azure.com")

# CodeAgent expects a tools list; pass an empty list if you have none.
agent = smolagents.CodeAgent(tools=[], model=model)
```
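From there, kicking off a task is a single call to the agent's `run` method. A quick test of my own (the question is arbitrary):

```python
# Run the agent on a task; CodeAgent writes and executes Python to answer.
result = agent.run("How many seconds are in a leap year?")
print(result)
```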
## Llamaindex

I saved Llamaindex for last, as it is the most different. The llama-index package has a different constructor for OpenAI.com versus OpenAI-like servers, so I opted to use that `OpenAILike` constructor instead. However, I also needed an embeddings model for my example, and the package doesn't have an `OpenAIEmbeddingsLike` constructor, so I used the standard `OpenAIEmbedding` constructor.

```python
import os

import llama_index.core.agent.workflow
import llama_index.embeddings.openai
import llama_index.llms.openai_like
from llama_index.core import Settings

Settings.llm = llama_index.llms.openai_like.OpenAILike(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    api_base="https://models.inference.ai.azure.com",
    is_chat_model=True)

Settings.embed_model = llama_index.embeddings.openai.OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key=os.environ["GITHUB_TOKEN"],
    api_base="https://models.inference.ai.azure.com")

# query_engine_tools is built earlier in the full example (see the repo).
agent = llama_index.core.agent.workflow.ReActAgent(
    tools=query_engine_tools,
    llm=Settings.llm)
```

## Choose your models wisely!

In all of the examples above, I specified the gpt-4o model. The gpt-4o model is a great choice for agents because it supports function calling, and many agent frameworks only work (or work best) with models that natively support function calling. Fortunately, GitHub Models includes multiple models that support function calling, at least in my basic experiments:

- gpt-4o
- gpt-4o-mini
- o3-mini
- AI21-Jamba-1.5-Large
- AI21-Jamba-1.5-Mini
- Codestral-2501
- Cohere-command-r
- Ministral-3B
- Mistral-Large-2411
- Mistral-Nemo
- Mistral-small

You might find that some models work better than others, especially if you're using agents with multiple tools. With GitHub Models, it's very easy to experiment and see for yourself, by simply changing the model name and re-running the code.
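For a quick comparison, a rough loop like this (my own sketch; the toy weather tool is invented purely for testing) shows which models actually emit a tool call:

```python
import os

import openai

client = openai.OpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

# A toy tool definition, just to see whether each model produces a tool call.
tools = [{
    "type": "function",
    "function": {
        "name": "lookup_weather",
        "description": "Look up the weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

for model_name in ["gpt-4o-mini", "Mistral-small", "Cohere-command-r"]:
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "What's the weather in Nairobi?"}],
        tools=tools)
    print(model_name, "->", response.choices[0].message.tool_calls)
```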
## Join the AI Agents Hackathon

We are currently running a free virtual hackathon from April 8th to 30th, challenging developers to create agentic applications using Microsoft technologies. You could build an agent entirely using GitHub Models and submit it to the hackathon for a chance to win amazing prizes! You can also join our 30+ streams about building AI agents, including a stream all about prototyping with GitHub Models. Learn more and register at https://aka.ms/agentshack

# AI Agents: Building Trustworthy Agents - Part 6

This blog post, Part 6 in a series on AI agents, focuses on building trustworthy AI agents. It emphasizes the importance of safety and security in agent design and deployment. The post details a system message framework for creating robust and scalable prompts, outlining a four-step process from meta prompt to iterative refinement. It then explores various threats to AI agents, including task manipulation, unauthorized access, resource overloading, knowledge base poisoning, and cascading errors, providing mitigation strategies for each. The post also highlights the human-in-the-loop approach for enhanced trust and control, providing a code example using AutoGen. Finally, it links to further resources on responsible AI, model evaluation, and risk assessment, along with the previous posts in the series.
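As a taste of that human-in-the-loop pattern, here's a minimal sketch of mine using the same AutoGen packages shown earlier in this digest — the post's actual example may differ. A user proxy agent sits in the loop so a human reviews the assistant's output before the run can finish:

```python
import asyncio
import os

import autogen_agentchat.agents
import autogen_agentchat.conditions
import autogen_agentchat.teams
import autogen_ext.models.openai

client = autogen_ext.models.openai.OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com")

assistant = autogen_agentchat.agents.AssistantAgent(
    "assistant", model_client=client)

# UserProxyAgent pauses the run and asks a human for input via the console.
user_proxy = autogen_agentchat.agents.UserProxyAgent("user_proxy")

# The conversation ends once the human reviewer types "APPROVE".
termination = autogen_agentchat.conditions.TextMentionTermination("APPROVE")
team = autogen_agentchat.teams.RoundRobinGroupChat(
    [assistant, user_proxy], termination_condition=termination)

asyncio.run(team.run(task="Draft a safety checklist for an AI agent."))
```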
# Automating PowerPoint Generation with AI: A Learn Live Series Case Study

## Introduction

A Learn Live is a series of events where, over a period of 45 to 60 minutes, a presenter walks attendees through a learning module or pathway. The show takes you through a Microsoft Learn module, challenge, or a particular sample. Between April 15 and May 13, we will be hosting a Learn Live series on "Master the Skills to Create AI Agents." This premise is necessary for the blog because I was tasked with generating slides for the different presenters.

## Challenge: generation of the slides

The series is based on the learning path Develop AI agents on Azure, and each session tackles one of the learn modules in the path. In addition, Learn Live series usually have a presentation template each speaker is provided with to help run their sessions. Each session has the same format as the learn modules: an introduction, lesson content, an exercise (demo), knowledge check, and a summary of the module. As the content is already there and the presentation template is provided, it felt repetitive to create the slides one by one. And that's where AI comes in: automating slide generation for Learn Live modules.

## Step 1 - Gathering modules data

The first step was ensuring I had the data for the learn modules, which involved collecting all the necessary information from the learning path and organizing it in a way that can be easily processed by AI. The learn modules repo is private, and while I have access to it, I wanted to build a solution that can be used externally as well. So instead of getting the data from the repository, I decided to scrape the learn modules using BeautifulSoup into a Word document. I created a Python script to extract the data, and it works as follows:

- Retrieving the HTML – It sends HTTP requests to the start page and each unit page.
- Parsing Content – Using BeautifulSoup, it extracts elements (headings, paragraphs, lists, etc.) from the page's main content.
- Populating a Document – With python-docx, it creates and formats a Word document, adding the scraped content.
- Handling Duplicates – It ensures unique unit page links by removing duplicates.
- Polite Scraping – A short delay (using time.sleep) is added between requests to avoid overloading the server.

First, I installed the necessary libraries:

```
pip install requests beautifulsoup4 python-docx
```
Next, I ran the script below, converting the units of the learn modules to a Word document:

```python
import requests
from bs4 import BeautifulSoup
from docx import Document
from urllib.parse import urljoin
import time

headers = {"User-Agent": "Mozilla/5.0"}
base_url = "https://learn.microsoft.com/en-us/training/modules/orchestrate-semantic-kernel-multi-agent-solution/"

def get_soup(url):
    response = requests.get(url, headers=headers)
    return BeautifulSoup(response.content, "html.parser")

def extract_module_unit_links(start_url):
    soup = get_soup(start_url)
    nav_section = soup.find("ul", {"id": "unit-list"})
    if not nav_section:
        print("❌ Could not find unit navigation.")
        return []
    links = []
    for a in nav_section.find_all("a", href=True):
        href = a["href"]
        full_url = urljoin(base_url, href)
        links.append(full_url)
    return list(dict.fromkeys(links))  # remove duplicates while preserving order

def extract_content(soup, doc):
    main_content = soup.find("main")
    if not main_content:
        return
    for tag in main_content.find_all(["h1", "h2", "h3", "p", "li", "pre", "code"]):
        text = tag.get_text().strip()
        if not text:
            continue
        if tag.name == "h1":
            doc.add_heading(text, level=1)
        elif tag.name == "h2":
            doc.add_heading(text, level=2)
        elif tag.name == "h3":
            doc.add_heading(text, level=3)
        elif tag.name == "p":
            doc.add_paragraph(text)
        elif tag.name == "li":
            doc.add_paragraph(f"• {text}", style='ListBullet')
        elif tag.name in ["pre", "code"]:
            doc.add_paragraph(text, style='Intense Quote')

def scrape_full_module(start_url, output_filename="Learn_Module.docx"):
    doc = Document()

    # Scrape and add the content from the start page
    print(f"📄 Scraping start page: {start_url}")
    start_soup = get_soup(start_url)
    extract_content(start_soup, doc)

    all_unit_links = extract_module_unit_links(start_url)
    if not all_unit_links:
        print("❌ No unit links found. Exiting.")
        return
    print(f"🔗 Found {len(all_unit_links)} unit pages.")

    for i, url in enumerate(all_unit_links, start=1):
        print(f"📄 Scraping page {i}: {url}")
        soup = get_soup(url)
        extract_content(soup, doc)
        time.sleep(1)  # polite delay

    doc.save(output_filename)
    print(f"\n✅ Saved module to: {output_filename}")

# 🟡 Replace this with any Learn module start page
start_page = "https://learn.microsoft.com/en-us/training/modules/orchestrate-semantic-kernel-multi-agent-solution/"
scrape_full_module(start_page, "Orchestrate with SK.docx")
```

## Step 2 - Utilizing Microsoft Copilot in PowerPoint

To automate the slide generation, I used Microsoft Copilot in PowerPoint. This tool leverages AI to create slides based on the provided data. It simplifies the process and ensures consistency across all presentations. As I already had the slide template, I created a new presentation based on the template. Next, I used Copilot in PowerPoint to generate the slides based on the presentation. How did I achieve this?

1. I uploaded the Word document generated from the learn modules to OneDrive.
2. In PowerPoint, I went over to Copilot, selected `view prompts`, and chose the prompt: create presentations.
3. Next, I added the prompt below, together with the Word document, to generate the slides from the file.

> Create a set of slides based on the content of the document titled "Orchestrate with SK". The slides should cover the following sections:
> • Introduction
> • Understand the Semantic Kernel Agent Framework
> • Design an agent selection strategy
> • Define a chat termination strategy
> • Exercise - Develop a multi-agent solution
> • Knowledge check
> • Summary
>
> Slide Layout: Use the custom color scheme and layout provided in the template.
> Use Segoe UI family fonts for text and Consolas for code. Include visual elements such as images, charts, and abstract shapes where appropriate. Highlight key points and takeaways.

## Step 3 - Evaluating and Finalizing Slides

Once the slides are generated, if you are happy with how they look, select "Keep it". The slides were generated based on the sections I selected and had all the information needed. The next step was to evaluate the generated slides and add the Learn Live introduction, knowledge check, and conclusion. The goal is to create high-quality presentations that effectively convey the learning content.

## What more can you do with Copilot in PowerPoint?

- Add speaker notes to the slides
- Use agents within PowerPoint to streamline your workflow
- Create your own custom prompts for future use cases

## Summary - AI for automation

In summary, using AI for slide generation can significantly streamline the process and save time. I was able to automate my work and only come in as a reviewer. The script and PowerPoint generation all took about 10 minutes, something that would have previously taken me an hour, and I only needed to review the result against the learn modules. It allowed for the creation of consistent and high-quality presentations, making it easier for presenters to focus on delivering the content. Now, my question to you is: how can you use AI in your day to day, and can you automate any repetitive tasks?