llm
37 TopicsHow to use any Python AI agent framework with free GitHub Models
I ❤️ when companies offer free tiers for developer services, since it gives everyone a way to learn new technologies without breaking the bank. Free tiers are especially important for students and people between jobs, when the desire to learn is high but the available cash is low. That's why I'm such a fan of GitHub Models: free, high-quality generative AI models available to anyone with a GitHub account. The available models include the latest OpenAI LLMs (like o3-mini), LLMs from the research community (like Phi and Llama), LLMs from other popular providers (like Mistral and Jamba), multimodal models (like gpt-4o and llama-vision-instruct) and even a few embedding models (from OpenAI and Cohere). With access to such a range of models, you can prototype complex multi-model workflows to improve your productivity or heck, just make something fun for yourself. 🤗 To use GitHub Models, you can start off in no-code mode: open the playground for a model, send a few requests, tweak the parameters, and check out the answers. When you're ready to write code, select "Use this model". A screen will pop up where you can select a programming language (Python/JavaScript/C#/Java/REST) and select an SDK (which varies depending on model). Then you'll get instructions and code for that model, language, and SDK. But here's what's really cool about GitHub Models: you can use them with all the popular Python AI frameworks, even if the framework has no specific integration with GitHub Models. How is that possible? The vast majority of Python AI frameworks support the OpenAI Chat Completions API, since that API became a defacto standard supported by many LLM API providers besides OpenAI itself. GitHub Models also provide OpenAI-compatible endpoints for chat completion models. Therefore, any Python AI framework that supports OpenAI-like models can be used with GitHub Models as well. 🎉 To prove it, I've made a new repository with examples from eight different Python AI agent packages, all working with GitHub Models: python-ai-agent-frameworks-demos. There are examples for AutoGen, LangGraph, Llamaindex, OpenAI Agents SDK, OpenAI standard SDK, PydanticAI, Semantic Kernel, and SmolAgents. You can open that repository in GitHub Codespaces, install the packages, and get the examples running immediately. Now let's walk through the API connection code for GitHub Models for each framework. Even if I missed your favorite framework, I hope my tips here will help you connect any framework to GitHub Models. OpenAI I'll start with openai , the package that started it all! import openai client = openai.OpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") The code above demonstrates the two key parameters we'll need to configure for all frameworks: api_key : When using OpenAI.com, you pass your OpenAI API key here. When using GitHub Models, you pass in a Personal Access Token (PAT). If you open the repository (or any repository) in GitHub Codespaces, a PAT is already stored in the GITHUB_TOKEN environment variable. However, if you're working locally with GitHub Models, you'll need to generate a PAT yourself and store it. PATs expire after a while, so you need to generate new PATs every so often. base_url : This parameter tells the OpenAI client to send all requests to "https://models.inference.ai.azure.com" instead of the OpenAI.com API servers. That's the domain that hosts the OpenAI-compatible endpoint for GitHub Models, so you'll always pass that domain as the base URL. If we're working with the new openai-agents SDK, we use very similar code, but we must use the AsyncOpenAI client from openai instead. Lately, Python AI packages are defaulting to async, because it's so much better for performance. import agents import openai client = openai.AsyncOpenAI( base_url="https://models.inference.ai.azure.com", api_key=os.environ["GITHUB_TOKEN"]) model = agents.OpenAIChatCompletionsModel( model="gpt-4o", openai_client=client) spanish_agent = agents.Agent( name="Spanish agent", instructions="You only speak Spanish.", model=model) PydanticAI Now let's look at all of the packages that make it really easy for us, by allowing us to directly bring in an instance of either OpenAI or AsyncOpenAI . For PydanticAI, we configure an AsyncOpenAI client, then construct an OpenAIModel object from PydanticAI, and pass that model to the agent: import openai import pydantic_ai import pydantic_ai.models.openai client = openai.AsyncOpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") model = pydantic_ai.models.openai.OpenAIModel( "gpt-4o", provider=OpenAIProvider(openai_client=client)) spanish_agent = pydantic_ai.Agent( model, system_prompt="You only speak Spanish.") Semantic Kernel For Semantic Kernel, the code is very similar. We configure an AsyncOpenAI client, then construct an OpenAIChatCompletion object from Semantic Kernel, and add that object to the kernel. import openai import semantic_kernel.connectors.ai.open_ai import semantic_kernel.agents chat_client = openai.AsyncOpenAI( api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") chat = semantic_kernel.connectors.ai.open_ai.OpenAIChatCompletion( ai_model_id="gpt-4o", async_client=chat_client) kernel.add_service(chat) spanish_agent = semantic_kernel.agents.ChatCompletionAgent( kernel=kernel, name="Spanish agent" instructions="You only speak Spanish") AutoGen Next, we'll check out a few frameworks that have their own wrapper of the OpenAI clients, so we won't be using any classes from openai directly. For AutoGen, we configure both the OpenAI parameters and the model name in the same object, then pass that to each agent: import autogen_ext.models.openai import autogen_agentchat.agents client = autogen_ext.models.openai.OpenAIChatCompletionClient( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com") spanish_agent = autogen_agentchat.agents.AssistantAgent( "spanish_agent", model_client=client, system_message="You only speak Spanish") LangGraph For LangGraph, we configure a very similar object, which even has the same parameter names: import langchain_openai import langgraph.graph model = langchain_openai.ChatOpenAI( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], base_url="https://models.inference.ai.azure.com", ) def call_model(state): messages = state["messages"] response = model.invoke(messages) return {"messages": [response]} workflow = langgraph.graph.StateGraph(MessagesState) workflow.add_node("agent", call_model) SmolAgents Once again, for SmolAgents, we configure a similar object, though with slightly different parameter names: import smolagents model = smolagents.OpenAIServerModel( model_id="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com") agent = smolagents.CodeAgent(model=model) Llamaindex I saved Llamaindex for last, as it is the most different. The llama-index package has a different constructor for OpenAI.com versus OpenAI-like servers, so I opted to use that OpenAILike constructor instead. However, I also needed an embeddings model for my example, and the package doesn't have an OpenAIEmbeddingsLike constructor, so I used the standard OpenAIEmbedding constructor. import llama_index.embeddings.openai import llama_index.llms.openai_like import llama_index.core.agent.workflow Settings.llm = llama_index.llms.openai_like.OpenAILike( model="gpt-4o", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com", is_chat_model=True) Settings.embed_model = llama_index.embeddings.openai.OpenAIEmbedding( model="text-embedding-3-small", api_key=os.environ["GITHUB_TOKEN"], api_base="https://models.inference.ai.azure.com") agent = llama_index.core.agent.workflow.ReActAgent( tools=query_engine_tools, llm=Settings.llm) Choose your models wisely! In all of the examples above, I specified the gpt-4o model. The gpt-4o model is a great choice for agents because it supports function calling, and many agent frameworks only work (or work best) with models that natively support function calling. Fortunately, GitHub Models includes multiple models that support function calling, at least in my basic experiments: gpt-4o gpt-4o-mini o3-mini AI21-Jamba-1.5-Large AI21-Jamba-1.5-Mini Codestral-2501 Cohere-command-r Ministral-3B Mistral-Large-2411 Mistral-Nemo Mistral-small You might find that some models work better than others, especially if you're using agents with multiple tools. With GitHub Models, it's very easy to experiment and see for yourself, by simply changing the model name and re-running the code. Join the AI Agents Hackathon We are currently running a free virtual hackathon from April 8th - 30th, to challenge developers to create agentic applications using Microsoft technologies. You could build an agent entirely using GitHub Models and submit it to the hackathon for a chance to win amazing prizes! You can also join our 30+ streams about building AI agents, including a stream all about prototyping with GitHub Models. Learn more and register at https://aka.ms/agentshack2KViews3likes0CommentsModel Context Protocol (MCP) Server for Azure Database for MySQL
We are excited to introduce a new MCP Server for integrating your AI models with data hosted in Azure Database for MySQL. By utilizing this server, you can effortlessly connect any AI application that supports MCP to your MySQL flexible server (using either MySQL password-based authentication or Microsoft Entra authentication methods), enabling you to provide your business data as meaningful context in a standardized and secure manner.1.8KViews2likes0CommentsIs it a bug or a feature? Using Prompty to automatically track and tag issues.
Introduction You’ve probably noticed a theme in my recent posts: tackling challenges with AI-powered solutions. In my latest project, I needed a fast way to classify and categorize GitHub issues using a predefined set of tags. The tag data was there, but the connections between issues and tags weren’t. To bridge that gap, I combined Azure OpenAI Service, Prompty, and a GitHub to automatically extract and assign the right labels. By automating issue tagging, I was able to: Streamline contributor workflows with consistent, on-time labels that simplify triage Improve repository hygiene by keeping issues well-organized, searchable, and easy to navigate Eliminate repetitive maintenance so the team can focus on community growth and developer empowerment Scale effortlessly as the project expands, turning manual chores into intelligent automation Challenge: 46 issues, no tags The Prompty repository currently hosts 46 relevant, but untagged, issues. To automate labeling, I first defined a complete tag taxonomy. Then I built a solution using: Prompty for prompt templating and function calling Azure OpenAI (gpt-4o-mini) to classify each issue Azure AI Search for retrieval-augmented context (RAG) Python to orchestrate the workflow and integrate with GitHub By the end, you’ll have an autonomous agent that fetches open issues, matches them against your custom taxonomy, and applies labels back on GitHub. Prerequisites: An Azure account with Azure AI Search and Azure OpenAI enabled Python and Prompty installed Clone the repo and install dependencies: pip install -r requirements.txt Step 1: Define the prompt template We’ll use Prompty to structure our LLM instructions. If you haven’t yet, install the Prompty VS Code extension and refer to the Prompty docs to get started. Prompty combines: Tooling to configure and deploy models Runtime for executing prompts and function calls Specification (YAML) for defining prompts, inputs, and outputs Our Prompty is set to use gpt-4o-mini and below is our sample input: sample: title: Including Image in System Message tags: ${file:tags.json} description: An error arises in the flow, coming up starting from the "complete" block. It seems like it is caused by placing a static image in the system prompt, since removing it causes the issue to go away. Please let me know if I can provide additional context. The inputs will be the tags file implemented using RAG, then we will fetch the issue title and description from GitHub once a new issue is posted. Next, in our Prompty file, we gave instructions of how the LLLM should work as follows: system: You are an intelligent GitHub issue tagging assistant. Available tags: ${inputs} {% if tags.tags %} ## Available Tags {% for tag in tags.tags %} name: {{tag.name}} description: {{tag.description}} {% endfor %} {% endif %} Guidelines: 1. Only select tags that exactly match the provided list above 2. If no tags apply, return an empty array [] 3. Return ONLY a valid JSON array of strings, nothing else 4. Do not explain your choices or add any other text Use your understanding of the issue and refer to documentation at https://prompty.ai to match appropriate tags. Tags may refer to: - Issue type (e.g., bug, enhancement, documentation) - Tool or component (e.g., tool:cli, tracer:json-tracer) - Technology or integration (e.g., integration:azure, runtime:python) - Conceptual elements (e.g., asset:template-loading) Return only a valid JSON array of the issue title, description and tags. If the issue does not fit in any of the categories, return an empty array with: ["No tags apply to this issue. Please review the issue and try again."] Example: Issue Title: "App crashes when running in Azure CLI" Issue Body: "Running the generated code in Azure CLI throws a Python runtime error." Tag List: ["bug", "tool:cli", "runtime:python", "integration:azure"] Output: ["bug", "tool:cli", "runtime:python", "integration:azure"] user: Issue Title: {{title}} Issue Description: {{description}} Once the Prompty file was ready, I right clicked on the file and converted it to Prompty code, which provided a Python base code to get started from, instead of building from scratch. Step 2: enrich with context using Azure AI Search To be able to generate labels for our issues, I created a sample of tags, around 20, each with a title and a description of what it does. As a starting point, I started with Azure AI Foundry, where I uploaded the data and created an index. This typically takes about 1hr to successfully complete. Next, I implemented a retrieval function: def query_azure_search(query_text): """Query Azure AI Search for relevant documents and tags.""" search_client = SearchClient( endpoint=SEARCH_SERVICE_ENDPOINT, index_name=SEARCH_INDEX_NAME, credential=AzureKeyCredential(SEARCH_API_KEY) ) # Perform the search results = search_client.search( search_text=query_text, query_type=QueryType.SIMPLE, top=5 # Retrieve top 5 results ) # Extract content and tags from results documents = [doc["content"] for doc in results] tags = [doc.get("tags", []) for doc in results] # Assuming "tags" is a field in the index # Flatten and deduplicate tags unique_tags = list(set(tag for tag_list in tags for tag in tag_list)) return documents, unique_tags Step 3: Orchestrate the Workflow In addition, to adding RAG, I added functions in the basic.py file to: fetch_github_issues: calls the GitHub REST API to list open issues and filters out any that already have labels. run_with_rag: on the issues selected, calls the query_azure_search to append any retrieved docs, tags the issues and parses the JSON output from the prompt to a list for the labels label_issue: patches the issue to apply a list of labels. process_issues: this fetches all unlabelled issues, extracts the rag pipeline to generate the tags, and calls the labels_issue tag to apply the tags scheduler loop: this runs every so often to check if there's a new issue and apply a label Step 4: Validate and Run Ensure all .env variables are set (API keys, endpoints, token). Install dependencies and execute using: python basic.py Create a new GitHub issue and watch as your agent assigns tags in real time. Below is a short demo video here to illustrate the workflow. Next Steps Migrate from PATs to a GitHub App for tighter security Create multi-agent application and add an evaluator agent to review tags before publishing Integrate with GitHub Actions or Azure Pipelines for CI/CD Conclusion and Resources By combining Prompty, Azure AI Search, and Azure OpenAI, you can fully automate GitHub issue triage—improving consistency, saving time, and scaling effortlessly. Adapt this pattern to any classification task in your own workflows! You can learn more using the following resources: Prompty documentation to learn more on Prompty Agents for Beginners course to learn how you can build your own agentAI Sparks: Unleashing Agents with the AI Toolkit
The final episode of our "AI Sparks" series delved deep into the exciting world of AI Agents and their practical implementation. We also covered a fair part of MCP with Microsoft AI Toolkit extension for VS Code. We kicked off by charting the evolutionary path of intelligent conversational systems. Starting with the rudimentary rule-based Basic Chatbots, we then explored the advancements brought by Basic Generative AI Chatbots, which offered contextually aware interactions. Then we explored the Retrieval-Augmented Generation (RAG), highlighting its ability to ground generative models in specific knowledge bases, significantly enhancing accuracy and relevance. The limitations were also discussed for the above mentioned techniques. The session was then centralized to the theme – Agents and Agentic Frameworks. We uncovered the fundamental shift from basic chatbots to autonomous agents capable of planning, decision-making, and executing tasks. We moved forward with detailed discussion on the distinction between Single Agentic systems, where one core agent orchestrates the process, and Multi-Agent Architectures, where multiple specialized agents collaborate to achieve complex goals. A key part of building robust and reliable AI Agents, as we discussed, revolves around carefully considering four critical factors. Firstly, Knowledge-Providing agents with the right context is paramount for them to operate effectively and make informed decisions. Secondly, equipping agents with the necessary Actions by granting them access to the appropriate tools allows them to execute tasks and achieve desired outcomes. Thirdly, Security is non-negotiable; ensuring agents have access only to the data and services they genuinely need is crucial for maintaining privacy and preventing unintended actions. Finally, establishing robust Evaluations mechanisms is essential to verify that agents are completing tasks correctly and meeting the required standards. These four pillars – Knowledge, Actions, Security, and Evaluation – form the bedrock of any successful agentic implementation. To illustrate the transformative power of AI Agents, we explored several interesting use cases and applications. These ranged from intelligent personal assistants capable of managing schedules and automating workflows to sophisticated problem-solving systems in domains like customer service. A significant portion of the session was dedicated to practical implementation through demonstrations. We highlighted key frameworks that are empowering developers to build agentic systems.: Semantic Kernel: We highlighted its modularity and rich set of features for integrating various AI services and tools. Autogen Studio: The focus here was on its capabilities for facilitating the creation and management of multi-agent conversations and workflows. Agent Service: We discussed its role in providing a more streamlined and managed environment for deploying and scaling AI agents. The major point of attraction was that these were demonstrated using the local LLMs which were hosted using AI Toolkit. This showcased the ease with which developers can utilize VS Code AI toolkit to build and experiment with agentic workflows directly within their familiar development environment. Finally, we demystified the concept of Model Context Protocol (MCP) and demonstrated how seamlessly it can be implemented using the Agent Builder within the VS Code AI Toolkit. We demonstrated this with a basic Website development using MCP. This practical demonstration underscored the toolkit's power in simplifying the development of complex solutions that can maintain context and engage in more natural, multi-step interactions. The "AI Sparks" series concluded with a discussion, where attendees had a clearer understanding of the evolution, potential and practicalities of AI Agents. The session underscored that we are on the cusp of a new era of intelligent systems that are not just reactive but actively work alongside us to achieve goals. The tools and frameworks are maturing, and the possibilities for agentic applications are sparking innovation across various industries. It was an exciting journey, and engagement during the final session on AI Sparks around Agents truly highlighted the transformative potential of this field. "AI Sparks" Series Roadmap: The "AI Sparks" series delved deeper into specific topics using AI Toolkit for Visual Studio Code, including: Introduction to AI toolkit and feature walkthrough: Introduction to the AI Toolkit extension for VS Code a powerful way to explore and integrate the latest AI models from OpenAI, Meta, Deepseek, Mistral, and more. Introduction to SLMs and local model with use cases: Explore Small Language Models (SLMs) and how they compare to larger models. Building RAG Applications: Create powerful applications that combine the strengths of LLMs with external knowledge sources. Multimodal Support and Image Analysis: Working with vision models and building multimodal applications. Evaluation and Model Selection: Evaluate model performance and choose the best model for your needs. Agents and Agentic Frameworks: Exploring the cutting edge of AI agents and how they can be used to build more complex and autonomous systems. The full playlist of the series with all the episodes of "AI Sparks" is available at AI Sparks Playlist. Continue the discussion and questions in Microsoft AI Discord Community where we have a dedicated AI-sparks channel. All the code samples can be found on AI_Toolkit_Samples .We look forward to continuing these insightful discussions in future series!332Views2likes0CommentsIntroducing langchain-azure-storage: Azure Storage integrations for LangChain
We're excited to introduce langchain-azure-storage , the first official Azure Storage integration package built by Microsoft for LangChain 1.0. As part of its launch, we've built a new Azure Blob Storage document loader (currently in public preview) that improves upon prior LangChain community implementations. This new loader unifies both blob and container level access, simplifying loader integration. More importantly, it offers enhanced security through default OAuth 2.0 authentication, supports reliably loading millions to billions of documents through efficient memory utilization, and allows pluggable parsing, so you can leverage other document loaders to parse specific file formats. What are LangChain document loaders? A typical Retrieval‑Augmented Generation (RAG) pipeline follows these main steps: Collect source content (PDFs, DOCX, Markdown, CSVs) — often stored in Azure Blob Storage. Parse into text and associated metadata (i.e., represented as LangChain Document objects). Chunk + embed those documents and store in a vector store (e.g., Azure AI Search, Postgres pgvector, etc.). At query time, retrieve the most relevant chunks and feed them to an LLM as grounded context. LangChain document loaders make steps 1–2 turnkey and consistent so the rest of the stack (splitters, vector stores, retrievers) “just works”. See this LangChain RAG tutorial for a full example of these steps when building a RAG application in LangChain. How can the Azure Blob Storage document loader help? The langchain-azure-storage package offers the AzureBlobStorageLoader , a document loader that simplifies retrieving documents stored in Azure Blob Storage for use in a LangChain RAG application. Key benefits of the AzureBlobStorageLoader include: Flexible loading of Azure Storage blobs to LangChain Document objects. You can load blobs as documents from an entire container, a specific prefix within a container, or by blob names. Each document loaded corresponds 1:1 to a blob in the container. Lazy loading support for improved memory efficiency when dealing with large document sets. Documents can now be loaded one-at-a-time as you iterate over them instead of all at once. Automatically uses DefaultAzureCredential to enable seamless OAuth 2.0 authentication across various environments, from local development to Azure-hosted services. You can also explicitly pass your own credential (e.g., ManagedIdentityCredential , SAS token). Pluggable parsing. Easily customize how documents are parsed by providing your own LangChain document loader to parse downloaded blob content. Using the Azure Blob Storage document loader Installation To install the langchain-azure-storage package, run: pip install langchain-azure-storage Loading documents from a container To load all blobs from an Azure Blob Storage container as LangChain Document objects, instantiate the AzureBlobStorageLoader with the Azure Storage account URL and container name: from langchain_azure_storage.document_loaders import AzureBlobStorageLoader loader = AzureBlobStorageLoader( "https://<your-storage-account>.blob.core.windows.net/", "<your-container-name>" ) # lazy_load() yields one Document per blob for all blobs in the container for doc in loader.lazy_load(): print(doc.metadata["source"]) # The "source" metadata contains the full URL of the blob print(doc.page_content) # The page_content contains the blob's content decoded as UTF-8 text Loading documents by blob names To only load specific blobs as LangChain Document objects, you can additionally provide a list of blob names: from langchain_azure_storage.document_loaders import AzureBlobStorageLoader loader = AzureBlobStorageLoader( "https://<your-storage-account>.blob.core.windows.net/", "<your-container-name>", ["<blob-name-1>", "<blob-name-2>"] ) # lazy_load() yields one Document per blob for only the specified blobs for doc in loader.lazy_load(): print(doc.metadata["source"]) # The "source" metadata contains the full URL of the blob print(doc.page_content) # The page_content contains the blob's content decoded as UTF-8 text Pluggable parsing By default, loaded Document objects contain the blob's UTF-8 decoded content. To parse non-UTF-8 content (e.g., PDFs, DOCX, etc.) or chunk blob content into smaller documents, provide a LangChain document loader via the loader_factory parameter. When loader_factory is provided, the AzureBlobStorageLoader processes each blob with the following steps: Downloads the blob to a new temporary file Passes the temporary file path to the loader_factory callable to instantiate a document loader Uses that loader to parse the file and yield Document objects Cleans up the temporary file For example, below shows parsing PDF documents with the PyPDFLoader from the langchain-community package: from langchain_azure_storage.document_loaders import AzureBlobStorageLoader from langchain_community.document_loaders import PyPDFLoader # Requires langchain-community and pypdf packages loader = AzureBlobStorageLoader( "https://<your-storage-account>.blob.core.windows.net/", "<your-container-name>", prefix="pdfs/", # Only load blobs that start with "pdfs/" loader_factory=PyPDFLoader # PyPDFLoader will parse each blob as a PDF ) # Each blob is downloaded to a temporary file and parsed by PyPDFLoader instance for doc in loader.lazy_load(): print(doc.page_content) # Content parsed by PyPDFLoader (yields one Document per page in the PDF) This file path-based interface allows you to use any LangChain document loader that accepts a local file path as input, giving you access to a wide range of parsers for different file formats. Migrating from community document loaders to langchain-azure-storage If you're currently using AzureBlobStorageContainerLoader or AzureBlobStorageFileLoader from the langchain-community package, the new AzureBlobStorageLoader provides an improved alternative. This section provides step-by-step guidance for migrating to the new loader. Steps to migrate To migrate to the new Azure Storage document loader, make the following changes: Depend on the langchain-azure-storage package Update import statements from langchain_community.document_loaders to langchain_azure_storage.document_loaders . Change class names from AzureBlobStorageFileLoader and AzureBlobStorageContainerLoader to AzureBlobStorageLoader . Update document loader constructor calls to: Use an account URL instead of a connection string. Specify UnstructuredLoader as the loader_factory to continue to use Unstructured for parsing documents. Enable Microsoft Entra ID authentication in environment (e.g., run az login or configure managed identity) instead of using connection string authentication. Migration samples Below shows code snippets of what usage patterns look like before and after migrating from langchain-community to langchain-azure-storage : Before migration from langchain_community.document_loaders import AzureBlobStorageContainerLoader, AzureBlobStorageFileLoader container_loader = AzureBlobStorageContainerLoader( "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<account-key>;EndpointSuffix=core.windows.net", "<container>", ) file_loader = AzureBlobStorageFileLoader( "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<account-key>;EndpointSuffix=core.windows.net", "<container>", "<blob>" ) After migration from langchain_azure_storage.document_loaders import AzureBlobStorageLoader from langchain_unstructured import UnstructuredLoader # Requires langchain-unstructured and unstructured packages container_loader = AzureBlobStorageLoader( "https://<account>.blob.core.windows.net", "<container>", loader_factory=UnstructuredLoader # Only needed if continuing to use Unstructured for parsing ) file_loader = AzureBlobStorageLoader( "https://<account>.blob.core.windows.net", "<container>", "<blob>", loader_factory=UnstructuredLoader # Only needed if continuing to use Unstructured for parsing ) What's next? We're excited for you to try the new Azure Blob Storage document loader and would love to hear your feedback! Here are some ways you can help shape the future of langchain-azure-storage : Show support for interface stabilization - The document loader is currently in public preview and the interface may change in future versions based on feedback. If you'd like to see the current interface marked as stable, upvote the proposal PR to show your support. Report issues or suggest improvements - Found a bug or have an idea to make the document loaders better? File an issue on our GitHub repository. Propose new LangChain integrations - Interested in other ways to use Azure Storage with LangChain (e.g., checkpointing for agents, persistent memory stores, retriever implementations)? Create a feature request or write to us to let us know. Your input is invaluable in making langchain-azure-storage better for the entire community! Resources langchain-azure GitHub repository langchain-azure-storage PyPI package AzureBlobStorageLoader usage guide AzureBlobStorageLoader documentation referenceOrchestrating Multi-Agent Intelligence: MCP-Driven Patterns in Agent Framework
Building reliable AI systems requires modular, stateful coordination and deterministic workflows that enable agents to collaborate seamlessly. The Microsoft Agent Framework provides these foundations, with memory, tracing, and orchestration built in. This implementation demonstrates four multi-agentic patterns — Single Agent, Handoff, Reflection, and Magentic Orchestration — showcasing different interaction models and collaboration strategies. From lightweight domain routing to collaborative planning and self-reflection, these patterns highlight the framework’s flexibility. At the core is Model Context Protocol (MCP), connecting agents, tools, and memory through a shared context interface. Persistent session state, conversation thread history, and checkpoint support are handled via Cosmos DB when configured, with an in-memory dictionary as a default fallback. This setup enables dynamic pattern swapping, performance comparison, and traceable multi-agent interactions — all within a unified, modular runtime. Business Scenario: Contoso Customer Support Chatbot Contoso’s chatbot handles multi-domain customer inquiries like billing anomalies, promotion eligibility, account locks, and data usage questions. These require combining structured data (billing, CRM, security logs, promotions) with unstructured policy documents processed via vector embeddings. Using MCP, the system orchestrates tool calls to fetch real-time structured data and relevant policy content, ensuring policy-aligned, auditable responses without exposing raw databases. This enables the assistant to explain anomalies, recommend actions, confirm eligibility, guide account recovery, and surface risk indicators—reducing handle time and improving first-contact resolution while supporting richer multi-agent reasoning. Architecture & Core Concepts The Contoso chatbot leverages the Microsoft Agent Framework to deliver a modular, stateful, and workflow-driven architecture. At its core, the system consists of: Base Agent: All agent patterns—single agent, reflection, handoff and magentic orchestration—inherit from a common base class, ensuring consistent interfaces for message handling, tool invocation, and state management. Backend: A FastAPI backend manages session routing, agent execution, and workflow orchestration. Frontend: A React-based UI (or Streamlit alternative) streams responses in real-time and visualizes agent reasoning and tool calls. Modular Runtime and Pattern Swapping One of the most powerful aspects of this implementation is its modular runtime design. Each agentic pattern—Single, Reflection, Handoff, and Magnetic—plugs into a shared execution pipeline defined by the base agent and MCP integration. By simply updating the .env configuration (e.g., agent_module=handoff), developers can swap in and out entire coordination strategies without touching the backend, frontend, or memory layers. This makes it easy to compare agent styles side by side, benchmark reasoning behaviors, and experiment with orchestration logic—all while maintaining a consistent, deterministic runtime. The same MCP connectors, FastAPI backend, and Cosmos/in-memory state management work seamlessly across every pattern, enabling rapid iteration and reliable evaluation. # Dynamic agent pattern loading agent_module_path = os.getenv("AGENT_MODULE") agent_module = __import__(agent_module_path, fromlist=["Agent"]) Agent = getattr(agent_module, "Agent") # Common MCP setup across all patterns async def _create_tools(self, headers: Dict[str, str]) -> List[MCPStreamableHTTPTool] | None: if not self.mcp_server_uri: return None return [MCPStreamableHTTPTool( name="mcp-streamable", url=self.mcp_server_uri, headers=headers, timeout=30, request_timeout=30, )] Memory & State Management State management is critical for multi-turn conversations and cross-agent workflows. The system supports two out-of-the-box options: Persistent Storage (Cosmos DB) Acts as the durable, enterprise-ready backend. Stores serialized conversation threads and workflow checkpoints keyed by tenant and session ID. Ensures data durability and auditability across restarts. In-Memory Session Store Default fallback when Cosmos DB credentials are not configured. Maintains ephemeral state per session for fast prototyping or lightweight use cases. All patterns leverage the same thread-based state abstraction, enabling: Session isolation: Each user session maintains its own state and history. Checkpointing: Multi-agent workflows can snapshot shared and executor-local state at any point, supporting pause/resume and fault recovery. Model Context Protocol (MCP): Acts as the connector between agents and tools, standardizing how data is fetched and results are returned to agents, whether querying structured databases or unstructured knowledge sources. Core Principles Across all patterns, the framework emphasizes: Modularity: Components are interchangeable—agents, tools, and state stores can be swapped without disrupting the system. Stateful Coordination: Multi-agent workflows coordinate through shared and local state, enabling complex reasoning without losing context. Deterministic Workflows: While agents operate autonomously, the workflow layer ensures predictable, auditable execution of multi-agent tasks. Unified Execution: From single-agent Q&A to complex Magentic orchestrations, every agent follows the same execution lifecycle and integrates seamlessly with MCP and the state store. Multi-Agent Patterns: Workflow and Coordination With the architecture and core concepts established, we can now explore the agentic patterns implemented in the Contoso chatbot. Each pattern builds on the base agent and MCP integration but differs in how agents orchestrate tasks and communicate with one another to handle multi-domain customer queries. In the sections that follow, we take a deeper dive into each pattern’s workflow and examine the under-the-hood communication flows between agents: Single Agent – A simple, single-domain agent handling straightforward queries. Reflection Agent – Allows agents to introspect and refine their outputs. Handoff Pattern – Routes conversations intelligently to specialized agents across domains. Magentic Orchestration – Coordinates multiple specialist agents for complex, parallel tasks. For each pattern, the focus will be on how agents communicate and coordinate, showing the practical orchestration mechanisms in action. Single Intelligent Agent The Single Agent Pattern represents the simplest orchestration style within the framework. Here, a single autonomous agent handles all reasoning, decision-making, and tool interactions directly — without delegation or multi-agent coordination. When a user submits a request, the single agent processes the query using all tools, memory, and data sources available through the Model Context Protocol (MCP). It performs retrieval, reasoning, and response composition in a single, cohesive loop. Communication Flow: User Input → Agent: The user submits a question or command. Agent → MCP Tools: The agent invokes one or more tools (e.g., vector retrieval, structured queries, or API calls) to gather relevant context and data. Agent → User: The agent synthesizes the tool outputs, applies reasoning, and generates the final response to the user. Session Memory: Throughout the exchange, the agent stores conversation history and extracted entities in the configured memory store (in-memory or Cosmos DB). Key Communication Principles: Single Responsibility: One agent performs both reasoning and action, ensuring fast response times and simpler state management. Direct Tool Invocation: The agent has direct access to all registered tools through MCP, enabling flexible retrieval and action chaining. Stateful Execution: The session memory preserves dialogue context, allowing the agent to maintain continuity across user turns. Deterministic Behavior: The workflow is fully predictable — input, reasoning, tool call, and output occur in a linear sequence. Reflection pattern The Reflection Pattern introduces a lightweight, two-agent communication loop designed to improve the quality and reliability of responses through structured self-review. In this setup, a Primary Agent first generates an initial response to the user’s query. This draft is then passed to a Reviewer Agent, whose role is to critique and refine the response—identifying gaps, inaccuracies, or missed context. Finally, the Primary Agent incorporates this feedback and produces a polished final answer for the user. This process introduces one round of reflection and improvement without adding excessive latency, balancing quality with responsiveness. Communication Flow: User Input → Primary Agent: The user submits a query. Primary Agent → Reviewer Agent: The primary generates an initial draft and passes it to the reviewer. Reviewer Agent → Primary Agent: The reviewer provides feedback or suggested improvements. Primary Agent → User: The primary revises its response and sends the refined version back to the user. Key Communication Principles: Two-Stage Dialogue: Structured interaction between Primary and Reviewer ensures each output undergoes quality assurance. Focused Review: The Reviewer doesn’t recreate answers—it critiques and enhances, reducing redundancy. Stateful Context: Both agents operate over the same shared memory, ensuring consistency between draft and revision. Deterministic Flow: A single reflection round guarantees predictable latency while still improving answer quality. Transparent Traceability: Each step—initial draft, feedback, and final output—is logged, allowing developers to audit reasoning or assess quality improvements over time. In practice, this pattern enables the system to reason about its own output before responding, yielding clearer, more accurate, and policy-aligned answers without requiring multiple independent retries. Handoff Pattern When a user request arrives, the system first routes it through an Intent Classifier (or triage agent) to determine which domain specialist should handle the conversation. Once identified, control is handed off directly to that Specialist Agent, which uses its own tools, domain knowledge, and state context to respond. This specialist continues to handle the user interaction as long as the conversation stays within its domain. If the user’s intent shifts — for example, moving from billing to security — the conversation is routed back to the Intent Classifier, which re-assigns it to the correct specialist agent. This pattern reduces latency and maintains continuity by minimizing unnecessary routing. Each handoff is tracked through the shared state store, ensuring seamless context carry-over and full traceability of decisions. Key Communication Principles: Dynamic Routing: The Intent Classifier routes user input to the right specialist domain. Domain Persistence: The specialist remains active while the user stays within its domain. Context Continuity: Conversation history and entities persist across agents through the shared state store. Traceable Handoffs: Every routing decision is logged for observability and auditability. Low Latency: Responses are faster since domain-appropriate agents handle queries directly. In practice, this means a user could begin a conversation about billing, continue seamlessly, and only be re-routed when switching topics — without losing any conversational context or history. Magentic Pattern The Magentic Pattern is designed for open-ended, multi-faceted tasks that require multiple agents to collaborate. It introduces a Manager (Planner) Agent, which interprets the user’s goal, breaks it into subtasks, and orchestrates multiple Specialist Agents to execute those subtasks. The Manager creates and maintains a Task Ledger, which tracks the status, dependencies, and results of each specialist’s work. As specialists perform their tool calls or reasoning, the Manager monitors their progress, gathers intermediate outputs, and can dynamically re-plan, dispatch additional tasks, or adjust the overall workflow. When all subtasks are complete, the Manager synthesizes the combined results into a coherent final response for the user. Key Communication Principles: Centralized Orchestration: The Manager coordinates all agent interactions and workflow logic. Parallel and Sequential Execution: Specialists can work simultaneously or in sequence based on task dependencies. Task Ledger: Acts as a transparent record of all task assignments, updates, and completions. Dynamic Re-planning: The Manager can modify or extend workflows in real time based on intermediate findings. Shared Memory: All agents access the same state store for consistent context and result sharing. Unified Output: The Manager consolidates results into one response, ensuring coherence across multi-agent reasoning. In practice, Magentic orchestration enables complex reasoning where the system might combine insights from multiple agents — e.g., billing, product, and security — and present a unified recommendation or resolution to the user. Choosing the Right Agent for Your Use Case Selecting the appropriate agent pattern hinges on the complexity of the task and the level of coordination required. As use cases evolve from straightforward queries to intricate, multi-step processes, the need for specialized orchestration increases. Below is a decision matrix to guide your choice: Feature / Requirement Single Agent Reflection Agent Handoff Pattern Magentic Orchestration Handles simple, domain-bound tasks ✔ ✔ ✖ ✖ Supports review / quality assurance ✖ ✔ ✖ ✔ Multi-domain routing ✖ ✖ ✔ ✔ Open-ended / complex workflows ✖ ✖ ✖ ✔ Parallel agent collaboration ✖ ✖ ✖ ✔ Direct tool access ✔ ✔ ✔ ✔ Low latency / fast response ✔ ✔ ✔ ✖ Easy to implement / low orchestration ✔ ✔ ✖ ✖ Dive Deeper: Explore, Build, and Innovate We've explored various agent patterns, from Single Agent to Magentic Orchestration, each tailored to different use cases and complexities. To see these patterns in action, we invite you to explore our Github repo. Clone the repo, experiment with the examples, and adapt them to your own scenarios. Additionally, beyond the patterns discussed here, the repository also features a Human-in-the-Loop (HITL) workflow designed for fraud detection. This workflow integrates human oversight into AI decision-making, ensuring higher accuracy and reliability. For an in-depth look at this approach, we recommend reading our detailed blog post: Building Human-in-the-loop AI Workflows with Microsoft Agent Framework | Microsoft Community Hub Engage with these resources, and start building intelligent, reliable, and scalable AI systems today! This repository and content is developed and maintained by James Nguyen, Nicole Serafino, Kranthi Kumar Manchikanti, Heena Ugale, and Tim Sullivan.Study Buddy: Learning Data Science and Machine Learning with an AI Sidekick
If you've ever wished for a friendly companion to guide you through the world of data science and machine learning, you're not alone. As part of the "For Beginners" curriculum, I recently built a Study Buddy Agent, an AI-powered assistant designed to help learners explore data science interactively, intuitively, and joyfully. Why a Study Buddy? Learning something new can be overwhelming, especially when you're navigating complex topics like machine learning, statistics, or Python programming. The Study Buddy Agent is here to change that. It brings the curriculum to life by answering questions, offering explanations, and nudging learners toward deeper understanding, all in a conversational format. Think of it as your AI-powered lab partner: always available, never judgmental, and endlessly curious. Built with chatmodes, Powered by Purpose The agent lives inside a .chatmodes file in the https://github.com/microsoft/Data-Science-For-Beginners/blob/main/.github/chatmodes/study-mode.chatmode.md. This file defines how the agent behaves, what tone it uses, and how it interacts with learners. I designed it to be friendly, encouraging, and beginner-first—just like the curriculum itself. It’s not just about answering questions. The Study Buddy is trained to: Reinforce key concepts from the curriculum Offer hints and nudges when learners get stuck Encourage exploration and experimentation Celebrate progress and milestones What’s Under the Hood? The agent uses GitHub Copilot's chatmode, which allows developers to define custom behaviors for AI agents. By aligning the agent’s responses with the curriculum’s learning objectives, we ensure that learners stay on track while enjoying the flexibility of conversational learning. How You Can Use It YouTube Video here: Study Buddy - Data Science AI Sidekick Clone the repo: Head to the https://github.com/microsoft/Data-Science-For-Beginners and clone it locally or use Codespaces. Open the GitHub Copilot Chat, and select Study Buddy: This will activate the Study Buddy. Start chatting: Ask questions, explore topics, and let the agent guide you. What’s Next? This is just the beginning. I’m exploring ways to: Expand the agent to other beginner curriculums (Web Dev, AI, IoT) Integrate feedback loops so learners can shape the agent’s evolution Final Thoughts In my role, I believe learning should be inclusive, empowering, and fun. The Study Buddy Agent is a small step toward that vision, a way to make data science feel less like a mountain and more like a hike with a good friend. Try it out, share your feedback, and let’s keep building tools that make learning magical. Join us on Discord to share your feedback.From Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs
As AI engineers, we’ve spent years optimizing models for the cloud, scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint it’s now a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance. Why Edge AI, Why Now? Edge AI isn’t just about running models locally, it’s about rethinking the entire lifecycle: - Latency: Decisions in milliseconds, not round-trips to the cloud. - Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance. - Resilience: Offline-first apps that don’t break when the network does. - Cost: Reduced cloud compute and bandwidth overhead. With Windows AI PCs powered by Intel and Qualcomm NPUs and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency. What You’ll Learn in Edge AI for Beginners The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment. Multi-Language Support This content is available in over 48 languages, so you can read and study in your native language. What You'll Master This course takes you from fundamental concepts to production-ready implementations, covering: Small Language Models (SLMs) optimized for edge deployment Hardware-aware optimization across diverse platforms Real-time inference with privacy-preserving capabilities Production deployment strategies for enterprise applications Why EdgeAI Matters Edge AI represents a paradigm shift that addresses critical modern challenges: Privacy & Security: Process sensitive data locally without cloud exposure Real-time Performance: Eliminate network latency for time-critical applications Cost Efficiency: Reduce bandwidth and cloud computing expenses Resilient Operations: Maintain functionality during network outages Regulatory Compliance: Meet data sovereignty requirements Edge AI Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making. Core Principles: On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs) Offline capability: Functions without persistent internet connectivity Low latency: Immediate responses suited for real-time systems Data sovereignty: Keeps sensitive data local, improving security and compliance Small Language Models (SLMs) SLMs like Phi-4, Mistral-7B, Qwen and Gemma are optimized versions of larger LLMs, trained or distilled for: Reduced memory footprint: Efficient use of limited edge device memory Lower compute demand: Optimized for CPU and edge GPU performance Faster startup times: Quick initialization for responsive applications They unlock powerful NLP capabilities while meeting the constraints of: Embedded systems: IoT devices and industrial controllers Mobile devices: Smartphones and tablets with offline capabilities IoT Devices: Sensors and smart devices with limited resources Edge servers: Local processing units with limited GPU resources Personal Computers: Desktop and laptop deployment scenarios Course Modules & Navigation Course duration. 10 hours of content Module Topic Focus Area Key Content Level Duration 📖 00 Introduction to EdgeAI Foundation & Context EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives Beginner 1-2 hrs 📚 01 EdgeAI Fundamentals Cloud vs Edge AI comparison EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment Beginner 3-4 hrs 🧠 02 SLM Model Foundations Model families & architecture Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica Beginner 4-5 hrs 🚀 03 SLM Deployment Practice Local & cloud deployment Advanced Learning • Local Environment • Cloud Deployment Intermediate 4-5 hrs ⚙️ 04 Model Optimization Toolkit Cross-platform optimization Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis Intermediate 5-6 hrs 🔧 05 SLMOps Production Production operations SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment Advanced 5-6 hrs 🤖 06 AI Agents & Function Calling Agent frameworks & MCP Agent Introduction • Function Calling • Model Context Protocol Advanced 4-5 hrs 💻 07 Platform Implementation Cross-platform samples AI Toolkit • Foundry Local • Windows Development Advanced 3-4 hrs 🏭 08 Foundry Local Toolkit Production-ready samples Sample applications (see details below) Expert 8-10 hrs Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing. Developer Highlights - 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration. - 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU. - 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps. - 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference. Local AI: Beyond the Edge Local AI isn’t just about inference, it’s about autonomy. Imagine agents that: - Learn from local context - Adapt to user behavior - Respect privacy by design With tools like Agent Framework, Azure AI Foundry and Windows Copilot Studio, and Foundry Local developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency. Try It Yourself Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or IoT devices Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.¡Curso oficial y gratuito de GenAI y Python! 🚀
¿Quieres aprender a usar modelos de IA generativa en tus aplicaciones de Python?Estamos organizando una serie de nueve transmisiones en vivo, en inglés y español, totalmente dedicadas a la IA generativa. Vamos a cubrir modelos de lenguaje (LLMs), modelos de embeddings, modelos de visión, y también técnicas como RAG, function calling y structured outputs. Además, te mostraremos cómo construir Agentes y servidores MCP, y hablaremos sobre seguridad en IA y evaluaciones, para asegurarnos de que tus modelos y aplicaciones generen resultados seguros. 🔗 Regístrate para toda la serie. Además de las transmisiones en vivo, puedes unirte a nuestras office hours semanales en el AI Foundry Discord de para hacer preguntas que no se respondan durante el chat. ¡Nos vemos en los streams! 👋🏻 Here’s your HTML converted into clean, readable text format (perfect for a newsletter, blog post, or social media caption): Modelos de Lenguaje 📅 7 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor ¡Únete a la primera sesión de nuestra serie de Python + IA! En esta sesión, hablaremos sobre los Modelos de Lenguaje (LLMs), los modelos que impulsan ChatGPT y GitHub Copilot. Usaremos Python para interactuar con LLMs utilizando paquetes como el SDK de OpenAI y Langchain. Experimentaremos con prompt engineering y ejemplos few-shot para mejorar los resultados. También construiremos una aplicación full stack impulsada por LLMs y explicaremos la importancia de la concurrencia y el streaming en apps de IA orientadas al usuario. 👉 Si querés seguir los ejemplos en vivo, asegurate de tener una cuenta de GitHub. Embeddings Vectoriales 📅 8 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor En la segunda sesión de Python + IA, exploraremos los embeddings vectoriales, una forma de codificar texto o imágenes como arrays de números decimales. Estos modelos permiten realizar búsquedas por similitud en distintos tipos de contenido. Usaremos modelos como la serie text-embedding-3 de OpenAI, visualizaremos resultados en Python y compararemos métricas de distancia. También veremos cómo aplicar cuantización y cómo usar modelos multimodales de embedding. 👉 Si querés seguir los ejemplos en vivo, asegurate de tener una cuenta de GitHub. Recuperación-Aumentada Generación (RAG) 📅 9 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor En la tercera sesión, exploraremos RAG, una técnica que envía contexto al LLM para obtener respuestas más precisas dentro de un dominio específico. Usaremos distintas fuentes de datos —CSVs, páginas web, documentos, bases de datos— y construiremos una app RAG full-stack con Azure AI Search. Modelos de Visión 📅 14 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor ¡La cuarta sesión trata sobre modelos de visión como GPT-4o y 4o-mini! Estos modelos pueden procesar texto e imágenes, generando descripciones, extrayendo datos, respondiendo preguntas o clasificando contenido. Usaremos Python para enviar imágenes a los modelos, crear una app de chat con imágenes e integrarlos en flujos RAG. 👉 Si querés seguir los ejemplos en vivo, asegurate de tener una cuenta de GitHub. Salidas Estructuradas 📅 15 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor En la quinta sesión aprenderemos a hacer que los LLMs generen respuestas estructuradas según un esquema. Exploraremos el modo structured outputs de OpenAI y cómo aplicarlo para extracción de entidades, clasificación y flujos con agentes. 👉 Si querés seguir los ejemplos en vivo, asegurate de tener una cuenta de GitHub. Calidad y Seguridad 📅 16 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor En la sexta sesión hablaremos sobre cómo usar IA de manera segura y evaluar la calidad de las salidas. Mostraremos cómo configurar Azure AI Content Safety, manejar errores en código Python y evaluar resultados con el SDK de Evaluación de Azure AI. Tool Calling 📅 21 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor En la última semana de la serie, nos enfocamos en tool calling (function calling), la base para construir agentes de IA. Aprenderemos a definir herramientas en Python o JSON, manejar respuestas de los modelos y habilitar llamadas paralelas y múltiples iteraciones. 👉 Si querés seguir los ejemplos en vivo, asegurate de tener una cuenta de GitHub. Agentes de IA 📅 22 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor ¡En la penúltima sesión construiremos agentes de IA! Usaremos frameworks como Langgraph, Semantic Kernel, Autogen, y Pydantic AI. Empezaremos con ejemplos simples y avanzaremos a arquitecturas más complejas como round-robin, supervisor, graphs y ReAct. Model Context Protocol (MCP) 📅 23 de octubre, 2025 | 10:00 PM - 11:00 PM (UTC) 🔗 Regístrate para la transmisión en Reactor Cerramos la serie con Model Context Protocol (MCP), la tecnología abierta más candente de 2025. Aprenderás a usar FastMCP para crear un servidor MCP local y conectarlo a chatbots como GitHub Copilot. También veremos cómo integrar MCP con frameworks de agentes como Langgraph, Semantic Kernel y Pydantic AI. Y, por supuesto, hablaremos sobre los riesgos de seguridad y las mejores prácticas para desarrolladores. ¿Querés que lo reformatee para publicación en Markdown (para blogs o repos) o en texto plano con emojis y separadores estilo redes sociales?Swagger Auto-Generation on MCP Server
Would you like to generate a swagger.json directly on an MCP server on-the-fly? In many use cases, using remote MCP servers is not uncommon. In particular, if you're using Azure API Management (APIM), Azure API Center (APIC) or Copilot Studio in Power Platform, integrating with remote MCP servers is inevitable.