rss.livelink.threads-in-node

Hands-on Session: From idea to interactive lesson with Microsoft Learning Zone

MikeTholfsen — Wed, 29 Apr 2026 15:15:15 GMT

Join us on Tuesday, May 12th at 8:00 AM Pacific for a hands-on professional development session introducing Learning Zone - a new app that helps you create interactive, classroom-ready lessons in minutes. In this 45-minute webinar, the Product Management team will guide you through core capabilities and the latest updates. You can follow along using your own Microsoft 365 Education account.

What we will cover:

✅ Getting started with Learning Zone: Access Learning Zone and get set up

✅ Experience as a student: Join a session and see how it works from the student perspective

✅ Building your first interactive lesson: Create your first interactive lesson (in minutes!)

✅ Assigning to your class: Send lessons via link, short code, Teams Assignments, or your LMS

✅ Exploring the ready-to-learn library: Bring immediate value to your students through a variety of lessons by trusted of partners.

Important note: Lesson generation is currently available only on Copilot+ PCs with any Microsoft 365 Education license (supported in English and Spanish). No Copilot+ PC? No problem. You’ll still get to try out the student experience, learn how to use the lesson library, assign interactive lessons, review insights, and integrate Learning Zone into your existing workflows.

📅 Date: Tuesday, May 12th
⏰ Time: 8:00 AM Pacific

Register: https://aka.ms/LZwebinarMay26

We look forward to having you attend the event!

microsoft execel

stuartabaho — Wed, 29 Apr 2026 12:42:19 GMT

please members am new here any one can help me in microsoft word and execel

Build AI RAG Apps with LangChain, Azure DocumentDB and Microsoft Foundry: Step-by-Step Guide

JohnAziz — Mon, 27 Apr 2026 08:41:21 GMT

Scenario

Imagine you are building your company’s RAG chat application using Microsoft Foundry - Azure OpenAI and orchestrating the flow with LangChain. The chat experience works, but now it needs to be grounded in your company’s data. You generate embeddings and want to store and query them without adding another database or complex sync pipeline. Instead of stitching services together, you use Azure DocumentDB (with MongoDB compatibility) with built-in vector search to store your JSON data and embeddings in one place. You deploy the app to Azure App Service and quickly compare vector search alone versus a full RAG pipeline, sharing it with your team for testing.

What will you learn?

In this blog, you'll learn to:

Create an Azure DocumentDB (with MongoDB compatibility) resource.
Create an embeddings and a chat deployment in Microsoft Foundry Azure OpenAI portal.
Create an Azure App Service website with continuous deployment from GitHub.
Configure Azure App Service application settings to enable communication between Azure resources.
Configure GitHub workflow to work successfully.

What is the main objective?

Build AI Powered RAG Application using LangChain, Microsoft Foundry Azure OpenAI, and Azure DocumentDB (with MongoDB compatibility): Step-by-Step Guide

Prerequisites

An Azure subscription.
- If you don’t already have one, you can sign up for an Azure free account.
- For students, you can use the free Azure for Students offer which doesn’t require a credit card only your school email.
A GitHub account.

Summary of the steps:

Step 1: Create an Azure DocumentDB (with MongoDB compatibility) resource
Step 2: Create a Microsoft Foundry - Azure OpenAI resource and Deploy chat and embedding Models
Step 3: Create an Azure App Service and Deploy the RAG Chat Application

Step 1: Create an Azure DocumentDB (with MongoDB compatibility) resource

In this step, you'll:

Open the Azure Portal.
Create an Azure DocumentDB (with MongoDB compatibility) resource.

Open the Azure Portal

1. Visit the Azure Portal https://portal.azure.com in your browser and sign in.

Now you are inside the Azure portal!

Create a new Azure DocumentDB (with MongoDB compatibility) resource

In this step, you create an Azure DocumentDB (with MongoDB compatibility) resource to store your data, vector embedding, and perform vector search.

1. Type documentdb in the search bar at the top of the portal page and select Azure DocumentDB (with MongoDB compatibility) from the available options.

2. Select Create from the toolbar to start provisioning your new cluster.

3. Add the following information to create a resource:

What	Value
Subscription	Use your preferred subscription. It's advised to use the same subscription across all the resources that communicate with each other on Azure.
Resource group	Select Create new to create a new resource group. Enter a unique name for the resource group.
Cluster name	Enter a globally unique name.
Location	Select a region close to you for the best response time. For example, Select UK South.
MongoDB version	Select the latest available version of MongoDB

4. Select Configure to configure your cluster tier.

5. Add the following information to configure the cluster tier. You can scale it up later:

What	Value
Cluster tier	Select M25 tier, 2 (Burstable) vCores.
Storage	Select 32 GiB.

6. Select Save.

7. Enter the cluster Admin Username and Password and store them in a secure location.

8. Select Next to configure the networking settings.

9. Select Allow Public Access from Azure services and resources within the Azure to this cluster.

10. Select Add current IP address to the firewall rules to allow local access to the cluster.

11. Select Review + create.

12. Confirm your configuration settings and select Create to start provisioning the resource.

Note: The cluster creation can take up to 10 minutes. It's recommended to move on with the rest of the steps and get back to it later.

Step 2: Create a Microsoft Foundry - Azure OpenAI resource and Deploy chat and embedding Models

In this step, you'll:

Create a Microsoft Foundry Azure OpenAI resource.
Create chat and embedding model deployments.

Create an Azure OpenAI resource

In this step, you create an Azure OpenAI Service resource that enables you to interact with different large language models (LLMs).

1. Type openai in the search bar at the top of the portal page and select Azure OpenAI from the available options.

2. Select Create from the toolbar then select Azure OpenAI to provision a new Azure OpenAI resource.

3. Add the following information to create a resource:

What	Value
Subscription	Use the same subscription you used to apply for Azure OpenAI access.
Resource group	Use the resource group you created in the previous step.
Region	Select a region close to you for the best response time. For example, Select UK South.
Name	Enter a globally unique name.
Pricing tier	Select S0. Currently, this is the only available pricing tier.

4. Now that the basic information is added, select Next to confirm your details and proceed to the next page.

5. Select Next to confirm your network details.

6. Select Next to confirm your tag details.

7. Confirm your configuration settings and select Create to start provisioning the resource. Wait for the deployment to finish.

8. After the deployment finishes, select Go to resource to inspect your created resource. Here, you can manage your resource and find important information like the endpoint URL and API keys.

Create chat and embedding model deployments

In this step, you create an Azure OpenAI embedding model deployment and a chat model deployment. Creating a deployment on your previously provisioned resource allows you to generate text embeddings (i.e. numerical representation for text) and have a natural language conversation with your data.

1. Select Go to Foundry portal from the toolbar to open the studio.

2. Select Deployments from the Shared resources left side menu to go to the deployments tab.

3. Select + Deploy model from the toolbar then select Deploy base model from the options. A Deploy model window opens.

4. Type gpt-4o-mini to search for the model then select it then select Use model.

5. Select Continue with existing setup to proceed to next step.

6. Refresh page and repeat previous steps to select the model then select Confirm.

7. Review selected options then select Deploy.

8. Select + Deploy model from the toolbar then select Deploy base model from the options. A Deploy model window opens.

9. Type text-embedding-3-small to search for the model then select it then select Confirm.

10. Review selected options then select Deploy.

Step 3: Create an Azure App Service and Deploy the RAG Chat Application

In this step, you'll:

Fork the sample repository on GitHub.
Create an Azure App Service resource with a deployment from GitHub.
Modify Azure App Service Application settings in the Azure portal.
Configure the workflow to deploy your application from GitHub.
Test the website before and after adding the data.

Fork the Sample Repository on GitHub

In this step, you create a copy from the source code on your GitHub account to be able to edit it and use it later.

1. Visit the sample github.com/Azure-Samples/Cosmic-Food-RAG-app in your browser and sign in.

2. Select Fork from the top of the sample page.

3. Select an owner for the fork then, select Create fork.

Create an Azure App Service resource with a deployment from GitHub

In this step, you create an Azure App service resource and connect it with your GitHub account to deploy a Python application.

1. Type app service in the search bar at the top of the portal page and select App Services from the available options.

2. Select Create Web App from the toolbar to start provisioning a new web application.

3. Add the following information to fill in the basic configuration of the application:

What	Value
Subscription	Use the same subscription you used to apply for Azure OpenAI access.
Resource group	Use the same resource group you created before.
Name	Enter a unique name for your website. For example, cosmic-food-rag.
Publish?	Select Code. This option specifies whether your deployment consists of code or a container.
Runtime stack	Select Python 3.12.
Operating System	Select Linux.
Region	Select UK South. This is the region where the rest of the resources you created reside.

4. Add the following information to create the app service plan. You can scale it up later:

What	Value
Linux Plan	Select a pre-existing plan or create a new plan.
Pricing Plan	Select Basic B1.

5. Select Deployment from the toolbar to move to the deployment configuration tab.

6. Add the following information to enable continuous deployment from GitHub:

What	Value
Continuous deployment	Select Enable.
GitHub account	Select your GitHub account.
Organization	Select your organization. If you are using your personal account then select it.
Repository	Select Cosmic-Food-RAG-app.
Branch	Select main.

7. Select Review + create.

8. Confirm your configuration settings and select Create to start provisioning the resource. Wait for the deployment to finish.

9. After the deployment finishes, select Go to resource to inspect your created resource. Here, you can manage your resource and find important information like the application settings and logs.

Modify Azure App service Application settings in the Azure portal

In this step, you configure the Application settings to make the website able to communicate with other cloud resources.

1. In the Web App resource, select Environment variables from the left side menu.

2. Select + Add to add new environment variables to the function configuration.

3. Add the following names and values one by one and select Ok. Make sure to add your own values.

These application settings are for the Azure OpenAI resources that you created:

What	Value
OPENAI_API_VERSION	2024-10-21
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME	gpt-4o-mini
AZURE_OPENAI_CHAT_MODEL_NAME	gpt-4o-mini
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME	text-embedding-3-small
AZURE_OPENAI_EMBEDDINGS_MODEL_NAME	text-embedding-3-small
AZURE_OPENAI_EMBEDDINGS_DIMENSIONS	1536
AZURE_OPENAI_DEPLOYMENT_NAME	<azureOpenAiResourceName>
AZURE_OPENAI_ENDPOINT	https://<azureOpenAiResourceName>.openai.azure.com/
AZURE_OPENAI_API_KEY	<azureOpenAiResourceKey>

You can get the Azure OpenAI key from the Azure OpenAI resource page.

Select Keys and Endpoint from the Resource Management section and copy any of the available keys.

These application settings are for Azure DocumentDB (with MongoDB compatibility):

AZURE_COSMOS_USERNAME	<documentUsername>
AZURE_COSMOS_PASSWORD	<documentPassword>
AZURE_COSMOS_CONNECTION_STRING	mongodb+srv://<user>:<password>@<clusterName>.global.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000

You can get the DocumentDB connection string from the Azure DocumentDB (with MongoDB compatibility) resource page.

Select Connection strings and copy the connection string. Make sure to replace the user and password with the ones you created.

These application settings are new and are used for resources that will be created when the application starts you can use any value for them:

AZURE_COSMOS_DATABASE_NAME	<documentDatabaseName> ex. CosmicDB
AZURE_COSMOS_COLLECTION_NAME	<documentContainerName> ex. CosmicFoodCollection
AZURE_COSMOS_INDEX_NAME	<documentIndexName> ex. CosmicIndex

4. Select Apply to save your newly added environment variables.

5. Select Configuration then Stack settings to edit the application startup command.

6. Type entrypoint.sh in the startup command field then select Apply.

Configure the Workflow to deploy your application from GitHub

In this step, you modify the GitHub deployment workflow to point to the folder that contains the application.

1. Visit your forked repository on GitHub and notice the failing workflow.

2. Open the workflow file .github/workflows/main_cosmic-food-rag.yml.

3. Open the file and select the pen icon to edit it.

4. Modify line 41 from . to src/.

5. Remove the optional Local Build Section since the application already has tests that cover this part.

6. Add this section to Install Node 22 and build the static frontend.

7. Select Commit changes, and review your commit message and description. Select Commit changes.

The final workflow file should look like this:

# Docs for the Azure Web Apps Deploy action: https://github.com/Azure/webapps-deploy # More GitHub Actions for Azure: https://github.com/Azure/actions # More info on Python, GitHub Actions, and Azure App Service: https://aka.ms/python-webapps-actions name: Build and deploy Python app to Azure Web App - cosmic-food-rag on: push: branches: - main workflow_dispatch: jobs: build: runs-on: ubuntu-latest permissions: contents: read #This is required for actions/checkout steps: - uses: actions/checkout@v4 - name: Set up Node 22 uses: actions/setup-node@v6 with: node-version: 22 - name: Install Node Packages & Build Static Site run: cd frontend && npm install && npm run build # By default, when you enable GitHub CI/CD integration through the Azure portal, the platform automatically sets the SCM_DO_BUILD_DURING_DEPLOYMENT application setting to true. This triggers the use of Oryx, a build engine that handles application compilation and dependency installation (e.g., pip install) directly on the platform during deployment. Hence, we exclude the antenv virtual environment directory from the deployment artifact to reduce the payload size. - name: Upload artifact for deployment jobs uses: actions/upload-artifact@v4 with: name: python-app path: | src/ !antenv/ # 🚫 Opting Out of Oryx Build # If you prefer to disable the Oryx build process during deployment, follow these steps: # 1. Remove the SCM_DO_BUILD_DURING_DEPLOYMENT app setting from your Azure App Service Environment variables. # 2. Refer to sample workflows for alternative deployment strategies: https://github.com/Azure/actions-workflow-samples/tree/master/AppService deploy: runs-on: ubuntu-latest needs: build permissions: id-token: write #This is required for requesting the JWT contents: read #This is required for actions/checkout steps: - name: Download artifact from build job uses: actions/download-artifact@v4 with: name: python-app - name: Login to Azure uses: azure/login@v2 with: client-id: ${{ secrets.AZUREAPPSERVICE_CLIENTID_5672547ED09F46D59DD431ACF5A29F28 }} tenant-id: ${{ secrets.AZUREAPPSERVICE_TENANTID_0059913572C8467882D3999D0E0DD5B8 }} subscription-id: ${{ secrets.AZUREAPPSERVICE_SUBSCRIPTIONID_7C42E3352C5D47F084CB0CD14F549D27 }} - name: 'Deploy to Azure Web App' uses: azure/webapps-deploy@v3 id: deploy-to-webapp with: app-name: 'cosmic-food-rag' slot-name: 'Production'

8. Select Actions to review the workflow run status.

Test the website before and After adding the data

In this step, you test the application before adding the data, add the data, and test again.

1. Select the workflow name to open it and get the website URL.

2. Select any of the suggested messages or type your own and it should respond with No results found.

3. Navigate to your Azure App Service resource page and select SSH then select Go to open a new SSH page.

4. In the SSH terminal, run these commands:

uv sync --active

uv run --active ./scripts/add_data.py --file="./data/food_items.json"

5. Navigate back to the live website and type in the chat message Do you have any vegan food dishes? and it should respond with the correct answer now.

Congratulations!! You successfully built the full application.

Clean Up

Once you finish experimenting on Microsoft Azure you might want to delete the resources to not consume any more money from your subscription.

You can delete the resource group and it will delete everything inside it or delete the resources one by one that's totally up to you.

Conclusion

Congratulations! You've learned how to create an Azure DocumentDB (with MongoDB compatibility) cluster, how to create a Microsoft Foundry - Azure OpenAI resource, how to deploy an embedding model and a chat model from the Foundry portal, how to create an Azure App Service and configure continuous deployment with GitHub, and how to modify application settings to enable the communication across Azure resources. By using these technologies, you can build a RAG chat application with the option to perform vector search too over your own data and provide grounded (relevant) responses.

Next steps

Documentation

Training Content

Develop generative AI apps in Azure

Found this useful? Share it with others and follow me to get updates on:

Twitter (twitter.com/john00isaac)
LinkedIn (linkedin.com/in/john0isaac)

Feel free to share your comments and/or inquiries in the comment section below..
See you in future demos!

From Demo to Production: Building Microsoft Foundry Hosted Agents with .NET

Lee_Stott — Wed, 22 Apr 2026 17:30:00 GMT

The Gap Between a Demo and a Production Agent

Let's be honest. Getting an AI agent to work in a demo takes an afternoon. Getting it to work reliably in production, tested, containerised, deployed, observable, and maintainable by a team. is a different problem entirely.

Most tutorials stop at the point where the agent prints a response in a terminal. They don't show you how to structure your code, cover your tools with tests, wire up CI, or deploy to a managed runtime with a proper lifecycle. That gap between prototype and production is where developer teams lose weeks.

Microsoft Foundry Hosted Agents close that gap with a managed container runtime for your own custom agent code. And the Hosted Agents Workshop for .NET gives you a complete, copy-paste-friendly path through the entire journey. from local run to deployed agent to chat UI, in six structured labs using .NET 10.

This post walks you through what the workshop delivers, what you will build, and why the patterns it teaches matter far beyond the workshop itself.

What Is a Microsoft Foundry Hosted Agent?

Microsoft Foundry supports two distinct agent types, and understanding the difference is the first decision you will make as an agent developer.

Prompt agents are lightweight agents backed by a model deployment and a system prompt. No custom code required. Ideal for simple Q&A, summarisation, or chat scenarios where the model's built-in reasoning is sufficient.
Hosted agents are container-based agents that run your own code .NET, Python, or any framework you choose inside Foundry's managed runtime. You control the logic, the tools, the data access, and the orchestration.

When your scenario requires custom tool integrations, deterministic business logic, multi-step workflow orchestration, or private API access, a hosted agent is the right choice. The Foundry runtime handles the managed infrastructure; you own the code.

For the official deployment reference, see Deploy a hosted agent to Foundry Agent Service on Microsoft Learn.

What the Workshop Delivers

The Hosted Agents Workshop for .NET is a beginner-friendly, hands-on workshop that takes you through the full development and deployment path for a real hosted agent. It is structured around a concrete scenario: a Hosted Agent Readiness Coach that helps delivery teams answer questions like:

Should this use case start as a prompt agent or a hosted agent?
What should a pilot launch checklist include?
How should a team troubleshoot common early setup problems?

The scenario is purposefully practical. It is not a toy chatbot. It is the kind of tool a real team would build and hand to other engineers, which means it needs to be testable, deployable, and extensible.

The workshop covers:

Local development and validation with .NET 10
Copilot-assisted coding with repo-specific instructions
Deterministic tool implementation with xUnit test coverage
CI pipeline validation with GitHub Actions
Secure deployment to Azure Container Registry and Microsoft Foundry
Chat UI integration using Blazor

What You Will Build

By the end of the workshop, you will have a code-based hosted agent that exposes an OpenAI Responses-compatible /responses endpoint on port 8088.

The agent is backed by three deterministic local tools, implemented in WorkshopLab.Core:

RecommendImplementationShape — analyses a scenario and recommends hosted or prompt agent based on its requirements
BuildLaunchChecklist — generates a pilot launch checklist for a given use case
TroubleshootHostedAgent — returns structured troubleshooting guidance for common setup problems

These tools are deterministic by design, no LLM call required to return a result. That choice makes them fast, predictable, and fully testable, which is the right architecture for business logic in a production agent.

The end-to-end architecture looks like this:

The Hands-On Journey: Lab by Lab

The workshop follows a deliberate build → validate → ship progression. Each lab has a clear outcome. You do not move forward until the previous checkpoint passes.

Lab 0 — Setup and Local Run

Open the repo in VS Code or a GitHub Codespace, configure your Microsoft Foundry project endpoint and model deployment name, then run the agent locally. By the end of Lab 0, your agent is listening on http://localhost:8088/responses and responding to test requests.

dotnet restore
dotnet build
dotnet run --project src/WorkshopLab.AgentHost

Test it with a single PowerShell call:

Invoke-RestMethod -Method Post `
    -Uri "http://localhost:8088/responses" `
    -ContentType "application/json" `
    -Body '{"input":"Should we start with a hosted agent or a prompt agent?"}'

Lab 0 instructions →

Lab 1 — Copilot Customisation

Configure repo-specific GitHub Copilot instructions so that Copilot understands the hosted-agent patterns used in this project. You will also add a Copilot review skill tailored to hosted agent code reviews. This step means every code suggestion you receive from Copilot is contextualised to the workshop scenario rather than giving generic .NET advice.

Lab 1 instructions →

Lab 2 — Tool Implementation

Extend one of the deterministic tools in WorkshopLab.Core with a real feature change. The suggested change adds a stronger recommendation path to RecommendImplementationShape for scenarios that require all three hosted-agent strengths simultaneously.

// In RecommendImplementationShape — add before the final return:
if (requiresCode && requiresTools && requiresWorkflow)
{
    return string.Join(Environment.NewLine,
    [
        $"Recommended implementation: Hosted agent (full-stack)",
        $"Scenario goal: {goal}",
        "Why: the scenario requires custom code, external tool access, and " +
        "multi-step orchestration — all three hosted-agent strengths.",
        "Suggested next step: start with a code-based hosted agent, register " +
        "local tools for each integration, and add a workflow layer."
    ]);
}

You then write an xUnit test to cover it, run dotnet test, and validate the change against a live /responses call. This is the workshop's most important teaching moment: every tool change is covered by a test before it ships.

Lab 2 instructions →

Lab 3 — CI Validation

Wire up a GitHub Actions workflow that builds the solution, runs the test suite, and validates that the agent container builds cleanly. No manual steps — if a change breaks the build or a test, CI catches it before any deployment happens.

Lab 3 instructions →

Lab 4 — Deployment to Microsoft Foundry

Use the Azure Developer CLI (azd) to provision an Azure Container Registry, publish the agent image, and deploy the hosted agent to Microsoft Foundry. The workshop separates provisioning from deployment deliberately: azd owns the Azure resources; the Foundry control plane deployment is an explicit, intentional final step that depends on your real project endpoint and agent.yaml manifest values.

Lab 4 instructions →

Lab 5 — Chat UI Integration

Connect a Blazor chat UI to the deployed hosted agent and validate end-to-end responses. By the end of Lab 5, you have a fully functioning agent accessible through a real UI, calling your deterministic tools via the Foundry control plane.

Lab 5 instructions →

Key Concepts to Take Away

The workshop teaches concrete patterns that apply well beyond this specific scenario.

Code-first agent design

Prompt-only agents are fast to build but hard to test and reason about at scale. A hosted agent with code-backed tools gives you something you can unit test, refactor, and version-control like any other software.

Deterministic tools and testability

The workshop explicitly avoids LLM calls inside tool implementations. Deterministic tools return predictable outputs for a given input, which means you can write fast, reliable unit tests for them. This is the right pattern for business logic. Reserve LLM calls for the reasoning layer, not the execution layer.

CI/CD for agent systems

AI agents are software. They deserve the same build-test-deploy discipline as any other service. Lab 3 makes this concrete: you cannot ship without passing CI, and CI validates the container as well as the unit tests.

Deployment separation

The workshop's split between azd provisioning and Foundry control-plane deployment is not arbitrary. It reflects the real operational boundary: your Azure resources are long-lived infrastructure; your agent deployment is a lifecycle event tied to your project's specific endpoint and manifest. Keeping them separate reduces accidents and makes rollbacks easier.

Observability and the validation mindset

Every lab ends with an explicit checkpoint. The culture the workshop builds is: prove it works before moving on. That mindset is more valuable than any specific tool or command in the labs.

Why Hosted Agents Are Worth the Investment

The managed runtime in Microsoft Foundry removes the infrastructure overhead that makes custom agent deployment painful. You do not manage Kubernetes clusters, configure ingress rules, or handle TLS termination. Foundry handles the hosting; you handle the code.

This matters most for teams making the transition from demo to production. A prompt agent is an afternoon's work. A hosted agent with proper CI, tested tools, and a deployment pipeline is a week's work done properly once, instead of several weeks of firefighting done poorly repeatedly.

The Foundry agent lifecycle —> create, update, version, deploy —>also gives you the controls you need to manage agents in a real environment: staged rollouts, rollback capability, and clear separation between agent versions. For the full deployment guide, see Deploy a hosted agent to Foundry Agent Service.

From Workshop to Real Project

This workshop is not just a learning exercise. The repository structure, the tooling choices, and the CI/CD patterns are a reference implementation.

The patterns you can lift directly into a production project include:

The WorkshopLab.Core / WorkshopLab.AgentHost separation between business logic and agent hosting
The agent.yaml manifest pattern for declarative Foundry deployment
The GitHub Actions workflow structure for build, test, and container validation
The azd + ACR pattern for image publishing without requiring Docker Desktop locally
The Blazor chat UI as a starting point for internal tooling or developer-facing applications

The scenario, a readiness coach for hosted agents. This is also something teams evaluating Microsoft Foundry will find genuinely useful. It answers exactly the questions that come up when onboarding a new team to the platform.

Common Mistakes When Building Hosted Agents

Having run workshops and spoken with developer teams building on Foundry, a few patterns come up repeatedly:

Skipping local validation before containerising. Always validate the /responses endpoint locally first. Debugging inside a container is slower and harder than debugging locally.
Putting business logic inside the LLM call. If the answer to a user query can be determined by code, use code. Reserve the model for reasoning, synthesis, and natural language output.
Treating CI as optional. Agent code changes break things just like any other code change. If you do not have CI catching regressions, you will ship them.
Conflating provisioning and deployment. Recreating Azure resources on every deploy is slow and error-prone. Provision once with azd; deploy agent versions as needed through the Foundry control plane.
Not having a rollback plan. The Foundry agent lifecycle supports versioning. Use it. Know how to roll back to a previous version before you deploy to production.

Get Started

The workshop is open source, beginner-friendly, and designed to be completed in a single day. You need a .NET 10 SDK, an Azure subscription, access to a Microsoft Foundry project, and a GitHub account.

Clone the repository, follow the labs in order, and by the end you will have a production-ready reference implementation that your team can extend and adapt for real scenarios.

Clone the workshop repository →

Here is the quick start to prove the solution works locally before you begin the full lab sequence:

git clone https://github.com/microsoft/Hosted_Agents_Workshop_dotNET.git
cd Hosted_Agents_Workshop_dotNET

# Set your Foundry project endpoint and model deployment
$env:AZURE_AI_PROJECT_ENDPOINT = "https://<resource>.services.ai.azure.com/api/projects/<project>"
$env:MODEL_DEPLOYMENT_NAME     = "gpt-4.1-mini"

# Build and run
dotnet restore
dotnet build
dotnet run --project src/WorkshopLab.AgentHost

Then send your first request:

Invoke-RestMethod -Method Post `
    -Uri "http://localhost:8088/responses" `
    -ContentType "application/json" `
    -Body '{"input":"Should we start with a hosted agent or a prompt agent?"}'

When the agent answers as a Hosted Agent Readiness Coach, you are ready to begin the labs.

Key Takeaways

Hosted agents in Microsoft Foundry let you run custom .NET code in a managed container runtime — you own the logic, Foundry owns the infrastructure.
Deterministic tools are the right pattern for business logic in production agents: fast, testable, and predictable.
CI/CD is not optional for agent systems. Build it in from the start, not as an afterthought.
Separate your provisioning (azd) from your deployment (Foundry control plane) — it reduces accidents and simplifies rollbacks.
The workshop is a reference implementation, not just a tutorial. The patterns are production-grade and ready to adapt.

References

Building an Auditable Security Layer for Agentic AI

hazem — Wed, 22 Apr 2026 08:06:05 GMT

Most agent failures do not look like breaches.

They look like a normal chat, a normal answer, and a normal tool call. Until the next morning, when a single question collapses the whole story: who authorized that action.

You think you deployed an agent. In reality, you deployed an unbounded automation pipeline that happens to speak English.

I’m Hazem Ali — Microsoft AI MVP, Distinguished AI & ML Architect, Founder & CEO at Skytells. For over 20 years, I’ve built secure, scalable enterprise AI across cloud and edge, with a focus on agent security and sovereign, governed AI architectures. My work on these systems is widely referenced by practitioners across multiple regions.

Hazem Ali honored to receive an official speaker invitation under the patronage of H.H. Sheikh Dr. Sultan bin Muhammad Al Qasimi, Member of the UAE Supreme Council and Ruler of Sharjah, to speak at the Sharjah International Conference on Linguistic Intelligence (SICLI), organized by the American University of Sharjah (AUS) and the Emirates Scholar Center for Research and Studies.

This piece is a collaboration with Hammad Atta a Practice Lead – AI Security & Cloud Strategy and Dr. Yasir Mehmood , Dr Muhammad Zeeshan Baig, Dr. Muhammad Aatif, Dr. MUHAMMAD AZIZ UL HAQ. We align on one core idea: agent security is not about making the model behave. It is about building enforceable boundaries around the model and proving every privileged step.

This article is meant to sit next to my earlier Tech Community piece, Zero-Trust Agent Architecture: How To Actually Secure Your Agents, and go one level deeper into the mechanics you can implement on Azure today.

Let me break it down.

The Principle: The model is not your boundary

Let me break it down in the way I’d explain it in a design review.

A boundary is something that still holds when the component on the other side is adversarial, confused, or simply wrong. An LLM is none of those reliably. In an agent, the model is not just a generator. It becomes a planner and scheduler.

It decides when to retrieve, which tool to call, how to shape arguments, and when to loop.

That means your real attack surface is not “bad output.” It is the control-flow graph the model is allowed to traverse.

So if your “security” lives inside the prompt, you are putting policy in the same token stream the attacker can influence. That is not a boundary. That is a suggestion.

The only stable design is to treat the model like an untrusted proposer and the runtime like the verifier.

Here is the chain I use. Each gate is external to the model and survives manipulation.

Context Gate: Everything that enters the model is treated as executable influence, not “text.”
Capability Gate: Tools are invoked as constrained capabilities, not free-form function calls.
Evidence Gate: Every privileged step produces a verifiable artifact, not a story.
Retrieval Control Plane: What the agent can see is governed by labels and identity, not prompt etiquette.
Detection Layer: Drift and probing become alerts, not surprises.

Figure: Model proposes. Runtime verifies. Input + retrieved context → shields → model → tool gateway → signed intent → governed retrieval → SOC telemetry.

Now the rare part, the part most people miss: the boundary is not “block or allow.” The boundary is stateful. Once the runtime sees a suspicious signal, the entire session must transition into a degraded capability state, and every downstream gate must enforce that state.

1. Treat context as executable influence, and preserve provenance

If you do RAG, your documents are not “supporting info.” They are an input channel. That makes the biggest prompt-injection risk not the user. It is your documents.

Microsoft’s Prompt Shields covers user prompt attacks (scanned at the user input intervention point) and document attacks (scanned at the user input and tool response intervention points). When enabled, each request returns annotation results with detected and filtered values that your runtime can translate into a policy decision: block, degrade, or allow.

Provenance Collapse.

Most teams concatenate prompt + policy + retrieved chunks into one blob. The moment you do that, you lose the one thing you need for a defensible boundary: you can no longer reliably tell which tokens came from where. That is how “context” becomes “authority.”

For indirect/document attacks,

Microsoft guidance recommends delimiting context documents inside the prompt using """<documents> ... </documents>""" to improve indirect attack detection.

That delimiter is not formatting. It is a provenance marker that improves indirect attack detection through Prompt Shields.

Minimal, practical pattern:

// Provenance-preserving prompt construction for indirect/document attack detection function buildPrompt(system: string, user: string, retrievedDocs: string[]): string { const docs = retrievedDocs.map((d) => `- ${d}`).join("\n"); return [ system, "", `User: ${user}`, "", `""" <documents>\n${docs}\n</documents> """`, ].join("\n"); }

Then treat Prompt Shields output as a session security event, not a banner:

type RiskState = "NORMAL" | "SUSPECT" | "BLOCK"; type FilterPolicy = "BLOCK_ON_FILTERED" | "DEGRADE_ON_FILTERED"; function computeRiskState( shields: { detected: boolean; filtered?: boolean }, labels: string[], policy: FilterPolicy = "DEGRADE_ON_FILTERED", ): RiskState { // detected => hard stop if (shields.detected) return "BLOCK"; // filtered is an annotation signal: block or degrade by policy if (shields.filtered) { return policy === "BLOCK_ON_FILTERED" ? "BLOCK" : "SUSPECT"; } // example: sensitivity-based degradation independent of shield hits const sensitive = labels.some((l) => ["Confidential", "HighlyConfidential", "Regulated"].includes(l), ); return sensitive ? "SUSPECT" : "NORMAL"; }

When the signal is clear, you block and log. When it is suspicious, you do not warn. You downgrade authority.

QSAF Alignment:

Prompt Injection Protection (Domain 1):

QSAF-PI-001 (static pattern blacklist), QSAF-PI-002 (dynamic LLM analysis), QSAF-PI-003 (semantic embedding comparison)

All addressed by Prompt Shields and provenance marking.

Context Manipulation (Domain 2): QSAF-RC-004 (context drift), QSAF-RC-007 (nested prompt injection) – mitigated by stateful risk calculation.

2. Tools are capabilities with constraints, not functions

When the model proposes a tool call, your runtime should re-derive what is allowed from identity plus risk state, then enforce it at the gateway.

type ToolRequest = { tool: string; args: unknown; }; type Capabilities = { allowWrite: boolean; allowedTools: Set<string>; }; function deriveCapabilities(risk: RiskState, roles: string[]): Capabilities { const baseAllowed = new Set(["search_kb", "get_profile", "summarize"]); const isAdmin = roles.includes("Admin"); if (risk === "SUSPECT") { return { allowWrite: false, allowedTools: baseAllowed }; } if (risk === "BLOCK") { return { allowWrite: false, allowedTools: new Set() }; } // NORMAL const tools = new Set([ ...baseAllowed, ...(isAdmin ? ["update_record", "issue_refund"] : []), ]); return { allowWrite: isAdmin, allowedTools: tools }; } function authorizeTool(req: ToolRequest, caps: Capabilities): void { if (!caps.allowedTools.has(req.tool)) throw new Error("ToolNotAllowed"); if (!caps.allowWrite && req.tool.startsWith("update_")) { throw new Error("WriteDenied"); } }

The model can ask. It cannot grant itself permission.

QSAF Alignment:

Plugin Abuse Monitoring (Domain 3):

QSAF-PL-001 (whitelist enforcement), QSAF-PL-003 (restrict sensitive plugins), QSAF-PL-006 (rate‑limiting) – implemented via capability derivation and gateway policies.

Behavioral Anomaly Detection (Domain 5):

QSAF-BA-006 (plugin execution pattern deviance) – detected by comparing actual calls against derived capabilities.

The Integrity Gate: Hash-chain the authority, not the output

Let me add the part that makes investigations clean.

Most teams treat integrity like an audit log problem. That is not enough. Logs explain. Integrity proves.

The hard truth is that agent authority is assembled out of pieces: the system instruction, the user prompt, retrieved chunks, risk annotations, and finally the tool intent. If you do not bind those pieces together cryptographically, an incident review becomes a story-telling session.

This is why QSAF has an entire domain for payload integrity and signing, including prompt hash signing, nonce or replay protection, and a hash chain lineage that tracks how a session evolved.

Here is how you can map that into the runtime verifies.

You build a canonical “authority envelope” for every privileged hop, compute a digest, and then:

link it to the previous hop (hash chain)
include a nonce (replay control)
sign the digest with Azure Key Vault (Key Vault signs digests, it does not hash your content for you)

import crypto from "crypto"; type AuthorityEnvelope = { sessionId: string; turnId: number; policyVersion: string; // provenance-preserved components systemHash: string; userHash: string; documentsHash: string; // hash of structured retrieved chunks (not just rendered text) shields: { detected: boolean; filtered: boolean; }; riskState: "NORMAL" | "SUSPECT" | "BLOCK"; // proposed action (if any) tool?: { name: string; argsHash: string; }; // anti-replay + lineage nonce: string; prevDigest?: string; ts: string; }; function sha256(bytes: string): string { return crypto.createHash("sha256").update(bytes).digest("hex"); } // Canonicalization matters. JSON.stringify is OK if you control key order. // For cross-language, use RFC 8785 (JCS) canonical JSON. function canonicalJson(x: unknown): string { return JSON.stringify(x); } function buildEnvelope( input: Omit<AuthorityEnvelope, "nonce" | "ts">, ): AuthorityEnvelope { return { ...input, nonce: crypto.randomUUID(), ts: new Date().toISOString(), }; } function digestEnvelope(env: AuthorityEnvelope): string { return sha256(canonicalJson(env)); }

Then you call Key Vault to sign that digest (REST sign), and optionally verify later (REST verify).

The rare failure mode this blocks is subtle: authority splicing.

Without a hash chain, it is possible for the runtime to correctly validate a tool call, but later be unable to prove which retrieved chunk, which Prompt Shields result, and which policy version were in force when that call was authorized. With the chain, every privileged hop becomes tamper-evident.

This is the point: Prompt Shields tells you “this looks dangerous.” Document delimiters preserve provenance.
The integrity gate makes the runtime able to say, later, with evidence: “This is exactly what I accepted as authority.”

QSAF Alignment:

Payload Integrity & Signing (Domain 6): QSAF-PY-001 (prompt hash signing), QSAF-PY-005 (nonce/replay control), QSAF-PY-006 (hash chain lineage) – directly implemented via the envelope and chaining.

Tools must sit behind a wall that can say “no”

Tool calls are where language becomes authority. If an agent can call APIs that mutate state, your security story is not about the response text. It is about whether the tool call is allowed under explicit policy.

This is exactly where Azure API Management belongs: as the tool gateway that enforces authentication and authorization before any tool request reaches your backend. The validate-jwt policy is the canonical enforcement mechanism for validating JWTs at the gateway.

The design goal is simple:

The model can request a tool call. The gateway decides if it is permitted.

A capability token approach keeps it clean:

<validate-jwt header-name="Authorization" failed-validation-httpcode="401"> <required-claims> <claim name="scp"> <value>tools.read</value> </claim> </required-claims> </validate-jwt>

The claim name (scp, roles, or custom claims) depends on your token issuer; the point is enforcing authorization at the gateway, not inside model text.

Now you can enforce “read-only mode” by issuing tokens that simply do not carry write scopes. The model can try to call a write tool. It still gets denied by policy.

Evidence is not logs. Evidence is a signed chain.

Logs help you debug. Evidence helps you prove.

So you hash the session envelope and the tool intent, then sign the digest using Azure Key Vault Keys.

Key Vault sign creates a signature from a digest, and verify verifies a signature against a digest. Key Vault does not hash your content for you. Hash locally, then sign the digest.), and Key Vault documentation is explicit that signing is sign-hash, not “sign arbitrary content.” You hash locally, then ask Key Vault to sign the hash.

import crypto from "crypto"; const sha256 = (x: unknown): string => crypto.createHash("sha256").update(JSON.stringify(x)).digest("hex"); type IntentEnvelope = { sessionId: string; userId: string; promptHash: string; documentsHash: string; tool: string; argsHash: string; nonce: string; ts: string; policyVersion: string; }; function buildIntent( sessionId: string, userId: string, prompt: string, docs: unknown, tool: string, args: unknown, policyVersion: string, ): IntentEnvelope { return { sessionId, userId, promptHash: sha256(prompt), documentsHash: sha256(docs), tool, argsHash: sha256(args), nonce: crypto.randomUUID(), ts: new Date().toISOString(), policyVersion, }; }

Once you do this, your system stops “explaining.” It starts proving.

Govern what the agent can see, not only what it can say

RAG without governance eventually becomes a data exposure feature.

This is why I treat retrieval as a governed operation. Microsoft Purview sensitivity labels give you a practical way to classify content and build retrieval rules on top of that classification. Microsoft documents creating and configuring sensitivity labels in Purview.

The pattern is simple:

Label the corpus.
Filter retrieval by label and identity policy.
Log label distribution per completion.
Alert when a low-privilege identity retrieves high-sensitivity labels.

This is how you keep sovereignty real. Not in a slide deck. In the retrieval path.

Operate it like a security system: posture and detection

Inline gates reduce risk. They do not eliminate it. Systems drift. People add tools. Policies get loosened. Attacks evolve.

Microsoft Defender for Cloud’s Defender CSPM plan includes AI security posture management for generative AI apps and AI agents (Preview), including discovery/inventory of AI agents deployed with Azure AI Foundry.

Then you use Microsoft Sentinel to turn your telemetry into incidents, with scheduled analytics rules.

Your detections should match the gates you built:

Repeated Prompt Shields detections from the same identity or session.
Tool-call spikes after a suspicious document signal.
APIM denials for write endpoints from sessions in read-only mode.
High-sensitivity label retrieval by identities that should never touch that tier.

QSAF Alignment:

Behavioral Anomaly Detection (Domain 5):

QSAF-BA-001 (session entropy), QSAF-BA-004 (repeated intent mutation), QSAF-BA-007 (unified risk score) – detected via Sentinel rules.

Cross‑Environment Defense (Domain 9): QSAF-CE-006 (coordinated alert response) – using Sentinel incidents and playbooks.

Where the reference checklist fits, quietly

Behind the scenes, we use a control checklist lens to ensure we cover prompt/context attacks, tool misuse, integrity, governance, and operational monitoring. The point is not to rename Microsoft features into framework terms. The point is to make the system enforceable and auditable using Azure-native gates.

Closing

Zero trust for agents is not a slogan. It is a build.

Prompt Shields gives you a front gate for both user prompt attacks and document attacks, with clear annotations like detected and filtered.
API Management gives you a tool boundary that can say “no” regardless of what the model tries, using validate-jwt.
Signed intent gives you evidence, using Key Vault’s sign-hash semantics.
Purview labels give you governed retrieval. Sentinel and Defender give you an operating model, not wishful thinking.

If you want the conceptual spine and the architectural principles that frame this pipeline, start with my earlier Tech Community pieces, then come back here and implement the gates.

Thanks for reading

— Hazem Ali

Prompt Engineering for Spec-Driven Development with SpecKit

charykn — Mon, 20 Apr 2026 09:14:37 GMT

Introduction

Charlotte Yeo, UCL MEng Computer Science https://www.linkedin.com/in/charlotte-yeo-627476294/

Supervisors: Janaina Mourao-Miranda (UCL) and Lee Stott (Microsoft).

For my final-year MEng project at UCL, I investigated how to get the best results out of SpecKit, a spec-driven AI development framework, by systematically testing different prompt strategies.

Here's what I found.

Project Overview

LLMs are powerful coding assistants, but they struggle to maintain context over long development sessions, leading to hallucinations and inconsistent outputs. SpecKit addresses this by using persistent, structured specification documents as memory throughout the development process. The developer writes a natural language spec; SpecKit builds the software from it.

The problem is that no one has established best practices for writing those specs. This project aimed to fill that gap.

Experiments

I ran 10 experiments, each using SpecKit to build the same target system, a multi-agent AI code verification tool, from a different prompt formulation. The variables I tested included prompt authority, format, level of detail, and output format. By keeping the target software constant, the effect of each prompt change on SpecKit's performance is isolated.

The target system itself used Microsoft Agent Framework, Azure Cosmos DB for RAG, and Microsoft Foundry to access GPT-5.2, all orchestrated via a Python codebase. This covered a wide range of real-world engineering challenges: multi-agent coordination, cloud service integration, and working with a library new enough that the model hadn't been trained on it.

Technical Details

SpecKit runs as a series of commands inside GitHub Copilot in VS Code, powered here by Claude Sonnet 4.5. The workflow moves through seven stages: /constitution → /specify → /clarify → /plan → /tasks → /analyze → /implement. At each stage, SpecKit writes and updates Markdown files that serve as persistent memory, so the session can be paused and resumed without losing context.

Key tools used:

Microsoft Agent Framework — agent orchestration
Microsoft Foundry — access to LLMs (GPT-5.2, Text Embedding 3)
Azure Cosmos DB — code example database for RAG
Claude Sonnet 4.5 — model powering SpecKit via GitHub Copilot

Results

These were the key findings:

Natural language outperforms machine-readable formats. The JSON prompt (Case 1) took 40% longer and generated significantly more issues than the natural language control.
Authority is necessary. Removing the authoritative framing from the prompt (Case 3) caused SpecKit to treat specifications as optional, resulting in the multi-agent system not being built at all until manually corrected. Total time: 4h 53m vs. 2h 24m for the control.
Omit what the model already knows. Removing the scoring rubrics (Case 8) saved 34 minutes with no loss in output quality as the model inferred the rubric from context. However, omitting the Cosmos DB schema or agent architecture descriptions caused major implementation errors.
The model must be able to read its own outputs. Changing the output to PDF (Case 9), which Claude Sonnet 4.5 cannot read in Copilot, caused the implementation stage to increase significantly to 7h 38m, with 33 required interventions, because the model couldn't verify whether its code was working.

Best Practices Found

The biggest insight is that prompt design has as much impact on SpecKit's performance as prompt content. A complete specification written non-authoritatively or in JSON will produce worse results than a slightly shorter specification written in clear, authoritative natural language.

There is also a trade-off between token count and manual intervention. Shorter prompts are faster, but only when the omitted information is something the model can reliably infer. Leaving out details about unique libraries or architectures will result in higher debugging times later.

Future Development

These are directions for future work in this area:

Running each experiment multiple times to account for model non-determinism
Repeating experiments with newer or different LLMs to test generalisability
Testing with different target systems beyond code verification
Supplying SpecKit with tools (e.g. Playwright MCP) to read outputs it currently cannot access, like live webpages or PDFs

Conclusion

Spec-driven development with SpecKit is a useful approach for building complex software with LLMs, but the quality of your prompt determines the quality of your outcome. For the most effective results, write in natural language, keep the whole prompt authoritative, include detail on novel or library-specific components, design your system's outputs to be readable by the model building them, and leave out only what the model can confidently infer.

If you want to explore the tools used in this project, here are some useful starting points:

Minecraft Education Lesson Plans in Teach: AI-powered lesson planning meets the world of Minecraft

MaxFritz — Thu, 23 Apr 2026 17:57:37 GMT

As educators, you've told us that some of your most time-consuming work is adapting lessons for engagement, aligning them to standards, and finding ways to bring immersive experiences into your curriculum. At the same time, Minecraft Education is already one of the most effective learning tools for engaging learners in classrooms around the world, with students lighting up the moment they hear the word "Minecraft."

Today, we're bringing those two things together. Minecraft Education lesson plans are now generally available in Teach.

Describe your topic, pick a grade level and subject, and Teach generates a complete, standards-aligned lesson plan built around Minecraft Education activities, including the specific blocks, materials, and preparation steps you need to run it confidently, even if you've never opened Minecraft Education before. (Minecraft Education is included in most Microsoft 365 software subscriptions for schools, so you also likely have full access.)

What you get

Every generated Minecraft Education lesson plan includes:

Standards-aligned Minecraft Education activities - Build activities and challenges that reflect your selected standards across subjects like ELA, math, science, social studies, computer science, and more
Minecraft-specific materials guidance - Recommendations for the exact blocks, items, and in-game tools your students will need, so you don't have to figure it out yourself
Preparation instructions - Step-by-step setup guidance for educators new to Minecraft Education, so you can walk into the classroom ready to go
Differentiation and collaboration - Tiered challenge options, collaborative build tasks, and formative checks embedded within gameplay
A student link - A shareable link to send directly to students so they can join the activity

See it in action

Once your lesson is generated, you can edit any section directly or use Enhance with AI to refine it further: add collaborative build tasks, adjust the length and tone, include accessibility supports, or regenerate with new instructions. When it's ready, save to OneDrive and open it in Word to share with colleagues, or launch the Minecraft Education app directly to set up the lesson experience.

For a full walkthrough of every step, see the support article.

Why this matters

We know many of you already love using Minecraft Education in your classrooms, while others are curious how Minecraft can enhance your teaching to deepen student learning and engagement. Minecraft Education lesson plans in Teach make it easier to create experiences by generating a complete, customized lesson from your topic and standards, with the Minecraft-specific materials, activities, and preparation guidance built in.

Whether you're looking for a fresh lesson idea in a subject you haven't tried with Minecraft Education before, or you want to quickly adapt a concept for a different grade level, this tool gives you a starting point you can make your own. You bring the teaching expertise and your knowledge of your students.

Get started

Try it now: Minecraft Education lesson plan
Available to Faculty/Staff with a Microsoft 365 for Education license and Copilot Chat enabled
Does not require a paid Microsoft 365 Copilot license
Minecraft Education may already be included with your Microsoft 365 license or can be purchased separately. Check your licensing options.

Helpful Links

Teach module training on Microsoft Learn, now including training on Minecraft Education lesson generation
Training courses for Minecraft educators

Have questions or ideas? Drop them in the comments below - We'd love to hear how you plan to use Minecraft Education lesson plans in your classroom!

Share your feedback with us by joining our EDU Insider Program (aka.ms/joinEIP).

Until next time,

Max Fritz · Microsoft Education

Introducing the 2026 Imagine Cup Top Launch Startup

StudentDeveloperTeam — Fri, 10 Apr 2026 17:24:08 GMT

Early momentum. Clear direction.

The Launch path highlights student founders who are at an earlier stage but already showing strong signals in how they are approaching what they are building.

L-Guard Ltd. stood out for how clearly the problem was defined, how intentionally the solution is taking shape, and the direction it is heading next.

As the Top Launch Startup, L-Guard Ltd. receives $50,000 USD along with continued visibility and support from Microsoft as they move their solution forward.

Meet the startup

L-Guard Ltd.: AI-powered road safety, built for real-time response

Rwanda

L-Guard Ltd. is addressing a critical gap in road safety across Africa, where many accident victims lose their lives not from the crash itself, but from delayed emergency response.

The startup has built an AI- and IoT-powered system that monitors vehicle activity, detects crashes in real time, and automatically alerts nearby hospitals and emergency responders. By combining sensor data with machine learning models on Azure, L-Guard transforms real-time vehicle signals into actionable emergency intelligence.

This shifts road safety from reactive response to proactive intervention, issuing risky driving warnings, detecting incidents as they happen, and ensuring that help is activated as quickly as possible, even in low-connectivity environments.

As the startup continues to move from pilot validation toward broader deployment, the focus is on strengthening reliability, expanding partnerships, and scaling across high-risk transport markets. By making timely rescue the standard, L-Guard is working to reduce preventable fatalities and bring more accountability to emergency response systems.

Helen Ugoeze Okereke – Growing up in Ebonyi State, Nigeria, Helen set out to become what she called a “computer wizard,” focused on building real solutions with technology. Today, she leads L-Guard’s vision and strategy, driven by a mission to use technology to save lives.

Ramadhani Wanjenja – With a background in embedded systems and intelligent hardware, Ramadhani leads the technical architecture of L-Guard. His personal experience surviving a motorcycle accident shaped the direction of the solution and its focus on immediate response.

Terry Manzi – Raised in Kigali, Terry brings a systems and operations mindset, leading software-hardware integration, deployment, and partnerships to ensure L-Guard works effectively in real environments.

Erioluwa Olowoyo – With a focus on product design and user experience, Erioluwa ensures L-Guard remains intuitive and accessible. His path into technology was self-driven, shaped by a commitment to building solutions that work for real users in real contexts.

What this represents

The Top Launch startup reflects what it means to build with intention from the start.

This is not about having everything finished. It is about identifying a real problem, building toward a solution, and continuing to move forward with clarity and purpose.

As L-Guard Ltd. continues to develop, their work highlights the impact student founders can have when they combine technical skill with lived experience and a clear mission.

Partner tools behind the build

Alongside mentorship and community, Imagine Cup startups gain access to tools that support how their solutions continue to take shape.

Through GitHub Education, teams use the Student Developer Pack, collaborate with AI-assisted coding through Copilot, and build on a platform used by developers around the world.

With Replit, teams build, test, and deploy using natural language in an AI-powered environment designed for rapid iteration.

Together, these tools give startups the flexibility and support to keep moving forward as they scale their solutions.

Introducing the 2026 Imagine Cup World Finalists

StudentDeveloperTeam — Fri, 10 Apr 2026 15:36:18 GMT

Three startups advancing. One global stage ahead.

The defining difference this year was how these startups built their solutions—not just what they built

Across the semifinals, founders demonstrated a clear understanding of the problems they are solving, how Microsoft AI strengthened their solutions, and where they can go next. This was not an early exploration. This was focused execution.

The level of clarity, depth, and progress across all semifinalists set a new standard.

From that group, three startups now move forward to the Imagine Cup World Championship.

These finalists reflect where the Imagine Cup is headed. Student founders are building with real users in mind, thinking beyond prototypes, and developing solutions designed to scale.

Meet the startups

Listed in alphabetical order:

CopyFlag: AI-powered creator protection at scale

United Kingdom

CopyFlag is addressing a growing challenge in the generative AI era, where original work can be copied, modified, and redistributed at scale, often without creators knowing it is happening.

The startup has built an Azure AI-powered platform that detects both direct copies and AI-modified versions of designs across the internet and automatically initiates takedowns. This transforms what has traditionally been a manual and expensive process into something creators can actually use, giving them a way to protect their work without requiring significant legal or technical resources.

Early results show clear demand, with thousands of creators already testing the platform and tens of thousands of infringements identified across marketplaces. By making intellectual property protection more accessible, CopyFlag is helping level the playing field so creators can continue building and growing with confidence.

Patrick Brown – A final-year Biochemistry student, Patrick combines a background in computer vision with hands-on experience building online businesses. After experiencing copyright infringement firsthand, he set out to build CopyFlag, focused on giving creators and small businesses the tools to detect and protect their work at scale.

Revora Health: AI-powered recovery, built for real life

United States

Revora Health is addressing a critical gap in how patients access and experience recovery, where long wait times and limited support often leave individuals navigating rehabilitation on their own.

The startup has built a recovery marketplace paired with an Azure-powered AI agent that provides 24/7 triage and real-time movement feedback. Using computer vision and multimodal models, Revora enables patients to perform rehabilitation exercises correctly while receiving personalized, contextual guidance throughout their recovery journey. This shifts what has traditionally been a slow, reactive process into something continuous and accessible, giving patients more agency while enabling providers to extend care beyond scheduled sessions.

Early results show strong engagement, with active users already participating in a growing private beta as the startup works to scale its marketplace model. By combining accessible care with intelligent, real-time support, Revora Health is helping patients recover with greater confidence while creating a more scalable and effective model for physical therapy.

Surya Kukkapalli – An MBA student, Surya brings together a background in software engineering and firsthand experience as a personal trainer. After seeing the challenges patients face navigating recovery, he set out to build Revora Health to make specialized care more accessible and to give patients the tools and support they need to recover with confidence.

SpoilSafe: AI-powered freshness intelligence for the cold chain

United States

SpoilSafe is addressing a critical gap in the cold chain, where limited visibility into food freshness leads to waste, rejected inventory, and lost revenue.

The startup has built a food freshness intelligence platform that uses low-cost sensors to detect gases emitted as food begins to spoil, such as ethylene and ammonia, and combines that data with machine learning models to generate real-time freshness scores and time-to-spoilage predictions. This shifts cold chain management from reactive monitoring to proactive decision-making, giving operators clear insight into what inventory is at risk and what actions to take.

By moving beyond traditional temperature and humidity tracking, SpoilSafe enables earlier intervention, helping reduce waste while improving operational efficiency across warehouses, distributors, and retailers.

As the startup continues to develop its MVP and expand pilot programs, the focus is on validating performance across product categories and building a scalable deployment model. By making food spoilage predictable instead of inevitable, SpoilSafe is helping create a more efficient and resilient food supply chain.

Advika Vuppala – A hands-on builder with experience across robotics and independent research, Advika brings a practical, problem-solving mindset to the startup. She is also committed to expanding access to engineering, leading workshops and initiatives that have engaged thousands of women in tech.

Rohan Ganesh – A self-taught builder, Rohan developed his skills by experimenting with new technologies and learning by doing. He brings adaptability and speed to product development, iterating quickly while keeping the larger system in focus.

Troy McBride – With a strong foundation in math and analytical thinking, Troy approaches challenges with structure and precision. He focuses on breaking down complex systems into clear, solvable components, ensuring the startup’s work is both ambitious and technically sound.

Vivaan Sawant – Driven by curiosity and discipline, Vivaan focuses on building systems that balance performance with real-world impact. He brings a mindset of continuous improvement, helping shape solutions that are designed to scale and hold up in real environments.

What’s Next

From here, finalists continue building. Refining their product. Strengthening how they communicate the value of what they have created.

Through ongoing mentorship, they will work closely with experienced founders, engineers, and industry leaders to sharpen both their technology and their positioning ahead of the global stage.

At the World Championship, one team will be named the 2026 Imagine Cup World Champion, receiving $100,000 USD, a mentorship session with Microsoft Chairman and CEO Satya Nadella, and opportunities for deeper partnership with Microsoft for Startups to continue building and scaling what comes next.

The World Championship winners will be announced at Microsoft Build on June 2nd. Join us on LinkedIn,  Instagram, X or Facebook, as we follow their journey to the championship.

Partner tools behind the build

Alongside mentorship and community, Imagine Cup startups gain access to tools that support how their solutions continue to take shape.

Through GitHub Education, teams use the Student Developer Pack, collaborate with AI-assisted coding through Copilot, and build on a platform used by developers around the world.

With Replit, teams build, test, and deploy using natural language in an AI-powered environment designed for rapid iteration.

Together, these tools give startups the flexibility and support to keep moving forward as they scale their solutions.

Classic LTI App Retirements, Preview of OneDrive LTI Migration Tool for Canvas

tjvering — Wed, 08 Apr 2026 13:30:00 GMT

Classic Microsoft LTI® Apps Retiring in 2026: What You Need to Know and How to Prepare

Microsoft is continuing its investment in a unified, modern Microsoft 365 LTI experience. As part of this evolution, several classic Microsoft LTI apps will be retired in September 2026.

This post outlines:

Which classic LTI apps are retiring and when
What happens to existing course links and content created in classic LTIs retiring
What actions you should take now to prepare, and start transitioning to Microsoft 365 LTI
New migration tooling available to support transition

Classic Microsoft LTI® Apps Retiring September 17, 2026

As we shared last September in our Microsoft 365 LTI GA release Blog, the following classic Microsoft LTI apps will be retired on September 17, 2026:

Microsoft OneDrive LTI (1.3)
OneNote Class Notebook LTI (1.1)
Microsoft Reflect LTI (1.3)
Microsoft Teams Assignments LTI (1.3)

After September 17, 2026, any links or placements of these classic apps in courses will stop working. However, the files, notebooks, assignments, and check-ins created by these classic apps will continue to be available to copy and reuse.

Replacements for these classic experiences are now available through the unified Microsoft 365 LTI built on the LTI® 1.3 Advantage standard. This delivers modern security, simplified identity mapping with Microsoft Entra, LMS enrollment and grade syncing, and a single deployment model for LMS administrators. We’ll continue to update our migration guides as additional tools and guidance become available.

NEW: Preview the OneDrive LTI Migration Tool for Canvas

Canvas LMS Customers: We are excited to announce that the Microsoft OneDrive LTI Migration Tool for Canvas is now available in Preview!
This tool helps institutions using Canvas LMS migrate OneDrive content links from the classic Microsoft OneDrive LTI app to the new Microsoft 365 LTI app — preserving existing file links in courses so educators and students experience a seamless transition.

For new preview deployments: detailed deployment instructions are available in the Canvas migration guide, which has been updated with configuration steps and guidance for using the migration tool.

If you participated in the private preview: If you have already deployed the OneDrive LTI Migration Tool in Canvas during the private preview, no action is required. Your existing deployment will continue to work as part of the Public Preview, and in GA. If you deployed the private preview in a testing environment, we suggest that you follow the new Canvas migration guide in your production environment.

Below is guidance to assist with transition from the other classic LTI apps and on additional LMS platforms. We will continue to communicate updates to this guidance as it evolves.

If you use the classic Microsoft OneDrive LTI 1.3 with an LMS other than Canvas

Deploy Microsoft 365 LTI with the OneDrive app enabled and guide educators to use the new Microsoft 365 LTI (Microsoft Education menus) to create file links or embeds in course content.
Disable/hide/remove placements of the classic Microsoft OneDrive LTI app in your LMS but do not uninstall or disable the app.
Files linked or embedded with the classic Microsoft OneDrive LTI will stop working when the app is retired, so those links and embeds must be replaced using the new Microsoft 365 LTI (Microsoft Education) app ahead of the retirement date.

OneNote Class Notebook LTI 1.1 (All LMS platforms)

The new OneNote Class Notebook LTI 1.3 integration is now available in the Microsoft 365 LTI app, with automatic roster sync and streamlined setup.

Deploy Microsoft 365 LTI with the OneNote Class Notebook app enabled, and guide educators to use the new app.
Disable/hide/remove placements of the classic OneNote integration, but do not uninstall the app to avoid migration issues during transition.
While there is no direct migration path from OneNote Class Notebook LTI 1.1 notebooks to Microsoft 365 LTI Class Notebooks, educators can copy sections/pages from one notebook to another using the right-click menu on Sections and Pages (and selecting “Move/Copy”) in OneNote on Windows, OneNote Web, and OneNote for Mac. Instructions are also available for content transfer using OneNote on Mac, iOS, or Android.

Microsoft Teams Assignments LTI 1.3 (All LMS platforms)

Deploy Microsoft 365 LTI with the Assignments app enabled, and guide educators to create assignments using the new app.
Disable/hide/remove placements of the legacy Teams Assignments LTI app as soon as you install the new Microsoft 365 LTI and enable the Assignments app, and guide you users to copy their existing assignments using the new app.
Teams Assignments created by the classic LTI 1.3 app can be reused as in the new Microsoft 365 LTI Assignments experience (which does not require a Team)
Assignments created in the LMS or via the Assignments app in Microsoft Teams can be copied and reused using the Create from Existing functionality in the Microsoft 365 LTI (Microsoft Education) Assignment instructor flow.

Microsoft Reflect LTI 1.3 (All LMS platforms)

Deploy Microsoft 365 LTI with the Reflect app enabled, and guide educators to create new Reflects in the new Microsoft 365 LTI experience.
There is no migration path for reflects created in the classic Reflect LTI 1.3 app to the Reflect experience in the new Microsoft 365 LTI Reflect app.
We recommend transitioning to the new Reflect experience in Microsoft 365 LTI as soon as possible, and remove the classic app ahead of the September 17, 2026 retirement.

Stay Connected

We love hearing from you! There are a few ways to stay engaged with Microsoft and your peers on the LMS integrations.

Follow this blog! Click Register at the top right to create an account and profile for the Microsoft Tech Community and Follow the Education Blog so you don’t miss any of our updates.

Join the free Education Insiders Program to preview updates, get support from other community members, meet the team, and influence the roadmap.
Join us for Microsoft 365 LTI office hours to connect with your peers and share feedback directly with Microsoft experts.

When: 1st and 3rd Thursday of each month @ 11AM EST
Where: https://aka.ms/LTIOfficeHours

Getting help and giving feedback

LMS and Microsoft 365 admins can contact Microsoft Education Support to help resolve configuration and deployment issues, for themselves or on behalf of users.
Educators and Learners can contact support or give feedback directly from the app through the help and feedback menu.

TJ Vering
Principal Product Manager
Microsoft Education
https://linkedin.com/in/tvering

Learning Tools Interoperability® (LTI®) is a trademark of the 1EdTech Consortium, Inc. (https://1edtech.org/)

New information literacy features in Search Progress now generally available

EmmaGray — Mon, 06 Apr 2026 22:07:19 GMT

Hello all!

Last September, we shared a preview of new information literacy features coming to Search Progress — designed to help students pause, think critically, and show their reasoning as they research online. Today, we’re excited to share that these features are generally available for all educators using Search Progress through Assignments in Teams for Education and the Microsoft 365 LTI®.

A special thank you to the educators who participated in the preview and shared feedback along the way; your insights helped shape these features into what they are today.

See it in action

Want a walkthrough before reading the details? Watch our Elevate Signature Series session, “Show Me Your Thinking,” where Dr. Geri Gillespy and I discuss future ready skills along with Search Progress setup, the full educator-to-student workflow, and how these skills connect to global assessment frameworks like PISA 2029.

Why process matters more than ever

Information literacy skills like verifying sources, understanding context, and thinking critically are foundational for responsible and effective navigation of online information. These skills become even more critical as AI becomes an integral part of learning and daily life, where students don’t just need access to information, they need to know how to evaluate it.

To ensure these features were developed in alignment with the latest in online reasoning research, we consulted with experts from the Digital Inquiry Group — a team with decades of experience as curriculum designers, classroom educators, researchers, and teacher educators — recognized with awards from UNESCO, the American Educational Research Association, and the School Library Association, to name a few.

What’s now available

The enhanced Search Progress features introduce structured activities and checkpoints — cognitive forcing functions that encourage students to pause, consider, and articulate their reasoning as they navigate the complex world of online information. Here’s what you can now enable for your assignments:

Evaluating source reputability: Instead of relying solely on what a source says about itself, students investigate the individuals or organizations behind the information by looking into what other sources say about them, like how employers use references in a job interview.
Cross-checking and lateral reading: “Using the internet to check the internet”, students compare information and perspectives across multiple sources to reveal patterns, differences, and possible inaccuracies.
Impact awareness: Students consider what could be at risk if the information is inaccurate or fabricated with the new "factual importance" checkpoint. For instance, health advice carries different consequences than an AI-generated image of a cat dancing at the disco.
Identifying source purpose: Information is created for a reason. Students consider who created a source, and whether it’s trying to inform, persuade, sell, or entertain.
Metacognitive reflection: Students reflect on the research process itself including why certain sources stood out, which strategies worked best, and how to apply those learnings in the future.

Not just for research projects

These features aren’t only for formal research assignments. They’re designed for class activities that involve online research, whether students are exploring a new topic, gathering sources for a presentation, or verifying information for a discussion. The goal is to build habits that transfer throughout the digital information ecosystem, from navigating social media to evaluating AI-generated content. For example:

A science educator assigns a pre-lab research task on chemical reactions. By enabling Source Reputation and Factual Importance, students learn to prioritize safety data sheets and academic sources over unverified blogs and to think about why accuracy matters when the stakes are high.
A social studies educator uses Cross-check for an assignment focusing on current events. Students discover that a viral statistic has been reported differently across sources, and they practice tracing claims back to their origin — building lateral reading habits they’ll carry into their media consumption outside of school.

What educators are saying

Teacher librarians, in particular, have told us that the “process over product” approach gives them something they’ve been missing — visibility into the process of student inquiry, not just what they turn in. These features give them a window into the journey, not just the destination. With new scaffolds that support cross-checking and the investigation of source reputation, Search Progress now covers more of the skills they’ve been trying to teach.

We’ve heard from educators that the explanation prompts reveal a side of student thinking that traditional assignments don't often capture. During an early pilot, students pushed back on a text field that didn’t scroll to expand, not because they wanted less writing, but because they had more to say about why they chose their sources and wanted more space to explain their thinking. Students who described themselves as not being strong essay writers found a different way to show their thinking, and when they knew that their reasoning mattered as much as the final product, it changed how they engaged with the assignment.

Preparing students with future-ready skills for the age of AI

As educators worldwide work to build students’ information literacy skills, global frameworks are evolving to match. The OECD recently published a first draft of the PISA 2029 Media and Artificial Intelligence Literacy (MAIL) assessment framework — a new assessment that will measure 15-year-olds’ ability to critically evaluate digital and AI-generated content across all participating countries.

We were interested to see how closely the skills that Search Progress helps build align with the competences this framework describes. The MAIL assessment places significant emphasis on evaluating source credibility, assessing purpose and bias, and cross-checking information across multiple sources — all skills that Search Progress is designed to support through structured activities and checkpoints in the flow of research.

Educators have also shared that these features help address a tension many are navigating right now: how to maintain academic integrity when AI-generated work is increasingly difficult to distinguish from student work. Rather than relying on detection tools at the end of the pipeline, Search Progress makes the research process itself the artifact, which gives educators evidence of student thinking throughout. Of course, information literacy is broader than any single tool. The MAIL framework also includes competences around content creation and collaborative digital participation that go beyond what Search Progress currently addresses. But for the core skill of analysing and evaluating online information — which the framework highlights as one of its most heavily weighted competences — Search Progress can help you give your students meaningful practice right now.

By integrating these research habits into everyday assignments, you’re helping students build skills that will serve them well beyond any single assessment — from navigating social media to evaluating AI-generated content in their daily lives.

Getting started

Open Assignments in Teams for Education (or your LMS via the Microsoft 365 LTI).
Create a new assignment and select Search Progress as a Learning Accelerator.
Choose which information literacy features to enable for this assignment; you can mix and match based on the lesson.
Customize the checkpoint card prompts to fit your subject area and grade level.
Assign it to your class and watch the research process unfold.

Requirements

Available to all Microsoft 365 Education customers
Classes set up in Teams for Education or the Microsoft 365 LTI

Helpful links

📘 Take the MS Learn course — Intro course for educators
📘 Microsoft 365 LTI app overview — Bring Search Progress into your LMS
💬 Join the Education Insiders Program — Share feedback directly with our product team

We’re committed to helping you foster information and AI literacy, and your feedback continues to shape how these tools evolve. Join the Search Progress channel in the Education Insiders Program to connect with other educators, attend community calls, and share your experience directly with the product team. If you’re not yet an EIP member, sign up here: aka.ms/JoinEIP.

Have questions or ideas? Drop them in the comments below. I’d love to hear how you’re using these features in your classroom!

Until next time,

Emma Gray
Product Manager II
Microsoft Education

Learning Tools Interoperability® (LTI®) is a trademark of the 1EdTech Consortium, Inc. (1edtech.org)

Build and Deploy a Microsoft Foundry Hosted Agent: A Hands-On Workshop

Lee_Stott — Fri, 03 Apr 2026 11:25:45 GMT

Agents are easy to demo, hard to ship.

Most teams can put together a convincing prototype quickly. The harder part starts afterwards: shaping deterministic tools, validating behaviour with tests, building a CI path, packaging for deployment, and proving the experience through a user-facing interface. That is where many promising projects slow down.

This workshop helps you close that gap without unnecessary friction. You get a guided path from local run to deployment handoff, then complete the journey with a working chat UI that calls your deployed hosted agent through the project endpoint.

What You Will Build

This is a hands-on, end-to-end learning experience for building and deploying AI agents with Microsoft Foundry.

The lab provides a guided and practical journey through hosted-agent development, including deterministic tool design, prompt-guided workflows, CI validation, deployment preparation, and UI integration.

It’s designed to reduce setup friction with a ready-to-run experience.

It is a prompt-based development lab using Copilot guidance and MCP-assisted workflow options during deployment.

It’s a .NET 10 workshop that includes local development, Copilot-assisted coding, CI, secure deployment to Azure, and a working chat UI.

A local hosted agent that responds on the responses contract
Deterministic tool improvements in core logic with xUnit coverage
A GitHub Actions CI workflow for restore, build, test, and container validation
An Azure-ready deployment path using azd, ACR image publishing, and Foundry manifest apply
A Blazor chat UI that calls openai/v1/responses with agent_reference
A repeatable implementation shape that teams can adapt to real projects

Who This Lab Is For

AI developers and software engineers who prefer learning by building
Motivated beginners who want a guided, step-by-step path
Experienced developers who want a practical hosted-agent reference implementation
Architects evaluating deployment shape, validation strategy, and operational readiness
Technical decision-makers who need to see how demos become deployable systems

Why Hosted Agents

Hosted agents run your code in a managed environment. That matters because it reduces the amount of infrastructure plumbing you need to manage directly, while giving you a clearer path to secure, observable, team-friendly deployments.

Prompt-only demos are still useful. They are quick, excellent for ideation, and often the right place to start. Hosted agents complement that approach when you need custom code, tool-backed logic, and a deployment process that can be repeated by a team.

Think of this lab as the bridge: you keep the speed of prompt-based iteration, then layer in the real-world patterns needed to run reliably.

What You Will Learn

1) Orchestration

You will practise workflow-oriented reasoning through implementation-shape recommendations and multi-step readiness scenarios. The lab introduces orchestration concepts at a practical level, rather than as a dedicated orchestration framework deep dive.

2) Tool Integration

You will connect deterministic tools and understand how tool calls fit into predictable execution paths. This is a core focus of the workshop and is backed by tests in the solution.

3) Retrieval Patterns (What This Lab Covers Today)

This workshop does not include a full RAG implementation with embeddings and vector search. Instead, it focuses on deterministic local tools and hosted-agent response flow, giving you a strong foundation before adding retrieval infrastructure in a follow-on phase.

4) Observability

You will see light observability foundations through OpenTelemetry usage in the host and practical verification during local and deployed checks. This is introductory coverage intended to support debugging and confidence building.

5) Responsible AI

You will apply production-minded safety basics, including secure secret handling and review hygiene. A full Responsible AI policy and evaluation framework is not the primary goal of this workshop, but the workflow does encourage safe habits from the start.

6) Secure Deployment Path

You will move from local implementation to Azure deployment with a secure, practical workflow: azd provisioning, ACR publishing, manifest deployment, hosted-agent start, status checks, and endpoint validation.

The Learning Journey

The overall flow is simple and memorable: clone, open, run, iterate, deploy, observe.

clone -> open -> run -> iterate -> deploy -> observe

You are not expected to memorize every command. The lab is structured to help you learn through small, meaningful wins that build confidence.

Your First 15 Minutes: Quick Wins

Open the repo and understand the lab structure in a few minutes
Set project endpoint and model deployment environment variables
Run the host locally and validate the responses endpoint
Inspect the deterministic tools in WorkshopLab.Core
Run tests and see how behaviour changes are verified
Review the deployment path so local work maps to Azure steps
Understand how the UI validates end-to-end behaviour after deployment
Leave the first session with a working baseline and a clear next step

That first checkpoint is important. Once you see a working loop on your own machine, the rest of the workshop becomes much easier to finish.

Using Copilot and MCP in the Workflow

This lab emphasises prompt-based development patterns that help you move faster while still learning the underlying architecture. You are not only writing code, you are learning to describe intent clearly, inspect generated output, and iterate with discipline.

Copilot supports implementation and review in the coding labs. MCP appears as a practical deployment option for hosted-agent lifecycle actions, provided your tools are authenticated to the correct tenant and project context.

Together, this creates a development rhythm that is especially useful for learning:

Define intent with clear prompts
Generate or adjust implementation details
Validate behaviour through tests and UI checks
Deploy and observe outcomes in Azure
Refine based on evidence, not guesswork

That same rhythm transfers well to real projects. Even if your production environment differs, the patterns from this workshop are adaptable.

Production-Minded Tips

As you complete the lab, keep a production mindset from day one:

Reliability: keep deterministic logic small, testable, and explicit
Security: Treat secrets, identity, and access boundaries as first-class concerns
Observability: use telemetry and status checks to speed up debugging
Governance: keep deployment steps explicit so teams can review and repeat them

You do not need to solve everything in one pass. The goal is to build habits that make your agent projects safer and easier to evolve.

Start Today:

If you have been waiting for the right time to move from “interesting demo” to “practical implementation”, this is the moment. The workshop is structured for self-study, and the steps are designed to keep your momentum high.

Start here: https://github.com/microsoft/Hosted_Agents_Workshop_Lab

Want deeper documentation while you go? These official guides are great companions:

When you finish, share what you built. Post a screenshot or short write-up in a GitHub issue/discussion, on social, or in comments with one lesson learned. Your example can help the next developer get unstuck faster.

Copy/Paste Progress Checklist

[ ] Clone the workshop repo
[ ] Complete local setup and run the agent
[ ] Make one prompt-based behaviour change
[ ] Validate with tests and chat UI
[ ] Run CI checks
[ ] Provision and deploy via Azure and Foundry workflow
[ ] Review observability signals and refine
[ ] Share what I built + one takeaway

Common Questions

How long does it take?

Most developers can complete a meaningful pass in a few focused sessions of 60-75 mins. You can get the first local success quickly, then continue through deployment and refinement at your own pace.

Do I need an Azure subscription?

Yes, for provisioning and deployment steps. You can still begin local development and testing before completing all Azure activities.

Is it beginner-friendly?

Yes. The labs are written for beginners, run in sequence, and include expected outcomes for each stage.

Can I adapt it beyond .NET?

Yes. The implementation in this workshop is .NET 10, but the architecture and development patterns can be adapted to other stacks.

What if I am evaluating for a team?

This lab is a strong team evaluation asset because it demonstrates end-to-end flow: local dev, integration patterns, CI, secure deployment, and operational visibility.

Closing

This workshop gives you more than theory. It gives you a practical path from first local run to deployed hosted agent, backed by tests, CI, and a user-facing UI validation loop. If you want a build-first route into Microsoft Foundry hosted-agent development, this is an excellent place to start.

Begin now: https://github.com/microsoft/Hosted_Agents_Workshop_Lab

Looking for official role-based AI learning paths and Microsoft AI ecosystem diagram

smatsusaki — Thu, 02 Apr 2026 04:10:42 GMT

Hello everyone,

I am responsible for AI up-skilling at my company, and we are currently building role-based learning paths for roles such as AI Engineer, Data Analyst, Data Engineer, and Data Scientist.

I would really appreciate any advice or pointers to official Microsoft resources on the following topics.

Q1. Role-based learning paths

I am aware of the Microsoft Learn career paths:

However, I am looking for the most up-to-date official learning paths or curated guidance that also cover newer services such as:

Copilot

GitHub Copilot

Microsoft Fabric

Azure AI Foundry

Are there any Microsoft resources that organize recommended learning content by role for these newer areas?

Q2. Official Microsoft AI ecosystem diagram

I am also looking for an official Microsoft diagram, map, or architecture overview that shows the overall AI ecosystem, including services such as Copilot, GitHub Copilot, Microsoft Fabric, and Azure AI Foundry.

As a reference, I am aware of unofficial resource, although it appears to be somewhat outdated:

If anyone knows of an official and more recent resource, I would be very grateful.

(If direct links are not allowed in replies, page titles or document names would also be very helpful.)

Thank you.

Imagine Cup 2026 Semifinalist: Builder Series Judges

StudentDeveloperTeam — Thu, 09 Apr 2026 19:03:20 GMT

With submissions closing on January 9, selected startups advance into the semifinals and step into this experience. From meeting their mentors to participating in build labs and pitch clinics, founders sharpen their product, their story, and their readiness for the global stage.

In the semifinals, startups present live and step into the next level of the competition.

They pitch in front of a panel of AI experts, startup founders, and investors, each bringing real-world experience in building, scaling, and backing technology. Through live Q&A and direct feedback, startups gain insight that challenges their thinking, strengthens their approach, and helps move their solution forward.

Meet the semifinals judges (listed in alphabetical order):

Mike Abbott

Mike Abbott is a Partner at Antler, co-leading its Australian operations and backing founders from day zero through scale. With a background in equity capital markets and M&A, he was an early Uber leader in Asia and later Head of Operations for Australia and New Zealand, helping scale the business from a small team to a multi-billion-dollar operation. As cofounder of Kaddy, a B2B marketplace acquired within three years, he brings deep experience in building, scaling, and investing in startups.

Todd Anglin

Todd Anglin is a Partner Developer Relations Lead at Microsoft with proven experience building and scaling high-performing teams. With a background spanning web and mobile development, cloud native platforms, and low code tools, he has led product, developer relations, and go-to-market efforts across growing technology companies. Known for his strength in communication, he brings the ability to translate complex technical concepts for any audience while helping teams move quickly and build with impact.

Rania Awad

Rania Awad is Chief Strategy Officer at Helfie.AI and a strategic leader at the intersection of AI, healthcare, and digital transformation. With experience across SaaS, health tech, and global digital businesses, she has led high-impact initiatives that turn bold ideas into scalable outcomes. Known for her cross-functional leadership and strong commercial lens, she brings a focus on connecting strategy to execution to drive meaningful impact.

Rick Clause

Rick Claus is a Cloud Advocate Team Lead at Microsoft with over 25 years of experience in the IT industry. As part of the Developer Relations Cloud Advocacy team, he focuses on enabling cross-team collaboration and engaging global technical communities around Azure and hybrid cloud solutions. With a background in enterprise architecture, virtualization, and technical training, he brings deep expertise in connecting product, engineering, and technical audiences to improve the overall customer experience.

Sonia Cuff

Sonia Cuff leads the Cloud Native & Linux team inside Microsoft's Developer Relations division, connecting with technical communities worldwide. She has over 30 years’ experience in tech, from large enterprises and government to small businesses and partners. Sonia is passionate about the connection between technology and business.

Mal Filipowska

Małgorzata (Mal) Filipowska is a venture capitalist with nearly a decade of experience investing in early-stage companies across emerging markets. As part of Seedstars International Ventures, a fund backed by global institutions including the World Bank, Rockefeller Foundation, and Visa Foundation, she manages a portfolio of over 130 companies across 40 countries, supporting founders across diverse and high-growth markets. She brings deep insight into scaling startups in these regions and a strong perspective on early-stage growth.

Alexandra Miele

Alexandra Miele leads Platform at HOF Capital, where she drives portfolio engagement and builds strategic partnerships across the firm’s global network. With experience spanning venture, private capital, and institutional investing, she previously served as a Partner at a family office managing a $1B+ portfolio and held leadership roles at Rockefeller Capital Management and Goldman Sachs. She brings deep insight into alternative investments, growth strategy, and supporting companies from early stage through scale.

Nigel Parker

Nigel Parker is a technology leader with over 30 years of experience across cloud, data platforms, machine learning, and AI. Having led global engineering and architecture teams, including serving as Chief Engineer for Microsoft Asia Commercial Software Engineering, he co-founded Vivara, an AI-driven wellbeing platform and works as a Data & AI consultant at Arinco (The Artificial Intelligence Company). He brings deep expertise in building scalable systems, integrating AI, and designing technology with a strong focus on human outcomes.

Sarah Thiam

Sarah Thiam is the founder and CEO of Germina Labs, a AI x Web3 studio focused on developer-facing products, programs and tooling. With a product and developer relations background at Microsoft, Protocol Labs and the Singapore government, she brings a well-rounded perspective to scaling technical ecosystems.

Zhen Li

Zhen Li created Replit Agent and leads the AI team at Replit, building AI agents that turn ideas into real products. With experience building startups and AI agent, He brings deep expertise in developing intelligent tools that accelerate how software is built and shipped.

Up next

The top three startups will advance to the World Championship, where they will compete on the global stage for the title and a $100,000 USD prize, along with a mentorship session with Satya Nadella, Chairman and CEO of Microsoft.

This is where everything comes together, as startups step forward to showcase what they have built and how they are ready to scale.

Follow along on Instagram, LinkedIn, X and Facebook for the latest updates, startups announcements, and moments leading up to the World Championship.

Getting Started with Foundry Local: A Student Guide to the Microsoft Foundry Local Lab

Lee_Stott — Mon, 30 Mar 2026 07:00:00 GMT

If you want to start building AI applications on your own machine, the Microsoft Foundry Local Lab is one of the most useful places to begin. It is a practical workshop that takes you from first-time setup through to agents, retrieval, evaluation, speech transcription, tool calling, and a browser-based interface. The material is hands-on, cross-language, and designed to show how modern AI apps can run locally rather than depending on a cloud service for every step.

This blog post is aimed at students, self-taught developers, and anyone learning how AI applications are put together in practice. Instead of treating large language models as a black box, the lab shows you how to install and manage local models, connect to them with code, structure tasks into workflows, and test whether the results are actually good enough. If you have been looking for a learning path that feels more like building real software and less like copying isolated snippets, this workshop is a strong starting point.

What Is Foundry Local?

Foundry Local is a local runtime for downloading, managing, and serving AI models on your own hardware. It exposes an OpenAI-compatible interface, which means you can work with familiar SDK patterns while keeping execution on your device. For learners, that matters for three reasons. First, it lowers the barrier to experimentation because you can run projects without setting up a cloud account for every test. Second, it helps you understand the moving parts behind AI applications, including model lifecycle, local inference, and application architecture. Third, it encourages privacy-aware development because the examples are designed to keep data on the machine wherever possible.

The Foundry Local Lab uses that local-first approach to teach the full journey from simple prompts to multi-agent systems. It includes examples in Python, JavaScript, and C#, so you can follow the language that fits your course, your existing skills, or the platform you want to build on.

Why This Lab Works Well for Learners

A lot of AI tutorials stop at the moment a model replies to a prompt. That is useful for a first demo, but it does not teach you how to build a proper application. The Foundry Local Lab goes further. It is organised as a sequence of parts, each one adding a new idea and giving you working code to explore. You do not just ask a model to respond. You learn how to manage the service, choose a language SDK, construct retrieval pipelines, build agents, evaluate outputs, and expose the result through a usable interface.

That sequence is especially helpful for students because the parts build on each other. Early labs focus on confidence and setup. Middle labs focus on architecture and patterns. Later labs move into more advanced ideas that are common in real projects, such as tool calling, evaluation, and custom model packaging. By the end, you have seen not just what a local AI app looks like, but how its different layers fit together.

Before You Start

The workshop expects a reasonably modern machine and at least one programming language environment. The core prerequisites are straightforward: install Foundry Local, clone the repository, and choose whether you want to work in Python, JavaScript, or C#. You do not need to master all three. In fact, most learners will get more value by picking one language first, completing the full path in that language, and only then comparing how the same patterns look elsewhere.

If you are new to AI development, do not be put off by the number of parts. The early sections are accessible, and the later ones become much easier once you have completed the foundations. Think of the lab as a structured course rather than a single tutorial.

What You Learn in Each Lab https://github.com/microsoft-foundry/foundry-local-lab

Part 1: Getting Started with Foundry Local

The first part introduces the basics of Foundry Local and gets you up and running. You learn how to install the CLI, inspect the model catalogue, download a model, and run it locally. This part also introduces practical details such as model aliases and dynamic service ports, which are small but important pieces of real development work.

For students, the value of this part is confidence. You prove that local inference works on your machine, you see how the service behaves, and you learn the operational basics before writing any application code. By the end of Part 1, you should understand what Foundry Local does, how to start it, and how local model serving fits into an application workflow.

Part 2: Foundry Local SDK Deep Dive

Once the CLI makes sense, the workshop moves into the SDK. This part explains why application developers often use the SDK instead of relying only on terminal commands. You learn how to manage the service programmatically, browse available models, control model download and loading, and understand model metadata such as aliases and hardware-aware selection.

This is where learners start to move from using a tool to building with a platform. You begin to see the difference between running a model manually and integrating it into software. By the end of this section, you should understand the API surface you will use in your own projects and know how to bootstrap the SDK in Python, JavaScript, or C#.

Part 3: SDKs and APIs

Part 3 turns the SDK concepts into a working chat application. You connect code to the local inference server and use the OpenAI-compatible API for streaming chat completions. The lab includes examples in all three supported languages, which makes it especially useful if you are comparing ecosystems or learning how the same idea is expressed through different syntax and libraries.

The key learning outcome here is not just that you can get a response from a model. It is that you understand the boundary between your application and the local model service. You learn how messages are structured, how streaming works, and how to write the sort of integration code that becomes the foundation for every later lab.

Part 4: Retrieval-Augmented Generation

This is where the workshop starts to feel like modern AI engineering rather than basic prompting. In the retrieval-augmented generation lab, you build a simple RAG pipeline that grounds answers in supplied data. You work with an in-memory knowledge base, apply retrieval logic, score matches, and compose prompts that include grounded context.

For learners, this part is important because it demonstrates a core truth of AI app development: a model on its own is often not enough. Useful applications usually need access to documents, notes, or structured information. By the end of Part 4, you understand why retrieval matters, how to pass retrieved context into a prompt, and how a pipeline can make answers more relevant and reliable.

Part 5: Building AI Agents

Part 5 introduces the concept of an agent. Instead of a one-off prompt and response, you begin to define behaviour through system instructions, roles, and conversation state. The lab uses the ChatAgent pattern and the Microsoft Agent Framework to show how an agent can maintain a purpose, respond with a persona, and return structured output such as JSON.

This part helps learners understand the difference between a raw model call and a reusable application component. You learn how to design instructions that shape behaviour, how multi-turn interaction differs from single prompts, and why structured output matters when an AI component has to work inside a broader system.

Part 6: Multi-Agent Workflows

Once a single agent makes sense, the workshop expands the idea into a multi-agent workflow. The example pipeline uses roles such as researcher, writer, and editor, with outputs passed from one stage to the next. You explore sequential orchestration, shared configuration, and feedback loops between specialised components.

For students, this lab is a very clear introduction to decomposition. Instead of asking one model to do everything at once, you break a task into smaller responsibilities. That pattern is useful well beyond AI. By the end of Part 6, you should understand why teams build multi-agent systems, how hand-offs are structured, and what trade-offs appear when more components are added to a workflow.

Part 7: Zava Creative Writer Capstone Application

The Zava Creative Writer is the capstone project that brings the earlier ideas together into a more production-style application. It uses multiple specialised agents, structured JSON hand-offs, product catalogue search, streaming output, and evaluation-style feedback loops. Rather than showing an isolated feature, this part shows how separate patterns combine into a complete system.

This is one of the most valuable parts of the workshop for learner developers because it narrows the gap between tutorial code and real application design. You can see how orchestration, agent roles, and practical interfaces fit together. By the end of Part 7, you should be able to recognise the architecture of a serious local AI app and understand how the earlier labs support it.

Part 8: Evaluation-Led Development

Many beginner AI projects stop once the output looks good once or twice. This lab teaches a much stronger habit: evaluation-led development. You work with golden datasets, rule-based checks, and LLM-as-judge scoring to compare prompt or agent variants systematically. The goal is to move from anecdotal testing to repeatable assessment.

This matters enormously for students because evaluation is one of the clearest differences between a classroom demo and dependable software. By the end of Part 8, you should understand how to define success criteria, compare outputs at scale, and use evidence rather than intuition when improving an AI component.

Part 9: Voice Transcription with Whisper

Part 9 broadens the workshop beyond text generation by introducing speech-to-text with Whisper running locally. You use the Foundry Local SDK to download and load the model, then transcribe local audio files through the compatible API surface. The emphasis is on privacy-first processing, with audio kept on-device.

This section is a useful reminder that local AI development is not limited to chatbots. Learners see how a different modality fits into the same ecosystem and how local execution supports sensitive workloads. By the end of this lab, you should understand the transcription flow, the relevant client methods, and how speech features can be integrated into broader applications.

Part 10: Using Custom or Hugging Face Models

After learning the standard path, the workshop shows how to work with custom or Hugging Face models. This includes compiling models into optimised ONNX format with ONNX Runtime GenAI, choosing hardware-specific options, applying quantisation strategies, creating configuration files, and adding compiled models to the Foundry Local cache.

For learner developers, this part opens the door to model engineering rather than simple model consumption. You begin to understand that model choice, optimisation, and packaging affect performance and usability. By the end of Part 10, you should have a clearer picture of how models move from an external source into a runnable local setup and why deployment format matters.

Part 11: Tool Calling with Local Models

Tool calling is one of the most practical patterns in current AI development, and this lab covers it directly. You define tool schemas, allow the model to request function calls, handle the multi-turn interaction loop, execute the tools locally, and return results back to the model. The examples include practical scenarios such as weather and population tools.

This lab teaches learners how to move beyond generation into action. A model is no longer limited to producing text. It can decide when external data or a function is needed and incorporate that result into a useful answer. By the end of Part 11, you should understand the tool-calling flow and how AI systems connect reasoning with deterministic software behaviour.

Part 12: Building a Web UI for the Zava Creative Writer

Part 12 adds a browser-based front end to the capstone application. You learn how to serve a shared interface from Python, JavaScript, or C#, stream updates to the browser, consume NDJSON with the Fetch API and ReadableStream, and show live agent status as content is produced in real time.

This part is especially good for students who want to build portfolio projects. It turns backend orchestration into something visible and interactive. By the end of Part 12, you should understand how to connect a local AI backend to a web interface and how streaming changes the user experience compared with waiting for one final response.

Part 13: Workshop Complete

The final part is a summary and extension point. It reviews what you have built across the previous sections and suggests ways to continue. Although it is not a new technical lab in the same way as the earlier parts, it plays an important role in learning. It helps you consolidate the architecture, the terminology, and the development patterns you have encountered.

For learners, reflection matters. By the end of Part 13, you should be able to describe the full stack of a local AI application, from model management to user interface, and identify which area you want to deepen next.

What Students Gain from the Full Workshop

Taken together, these labs do more than teach Foundry Local itself. They teach how AI applications are built. You learn operational basics such as model setup and service management. You learn application integration through SDKs and APIs. You learn system design through RAG, agents, multi-agent orchestration, and web interfaces. You learn engineering discipline through evaluation. You also see how text, speech, custom models, and tool calling all fit into one local-first development workflow.

That breadth makes the workshop useful in several settings. A student can use it as a self-study path. A lecturer can use it as source material for practical sessions. A learner developer can use it to build portfolio pieces and to understand which AI patterns are worth learning next. Because the repository includes Python, JavaScript, and C#, it also works well for comparing how architectural ideas transfer across languages.

How to Approach the Lab as a Beginner

If you are starting from scratch, the best route is simple. Complete Parts 1 to 3 in your preferred language first. That gives you the essential setup and integration skills. Then move into Parts 4 to 6 to understand how AI application patterns are composed. After that, use Parts 7 and 8 to learn how larger systems and evaluation fit together. Finally, explore Parts 9 to 12 based on your interests, whether that is speech, tooling, model customisation, or front-end work.

It is also worth keeping notes as you go. Record what each part adds to your understanding, what code files matter, and what assumptions each example makes. That habit will help you move from following the labs to adapting the patterns in your own projects.

Final Thoughts

The Microsoft Foundry Local Lab is a strong introduction to local AI development because it treats learners like developers rather than spectators. You install, run, connect, orchestrate, evaluate, and present working systems. That makes it far more valuable than a short demo that only proves a model can answer a question.

If you are a student or learner developer who wants to understand how AI applications are really built, this lab gives you a clear path. Start with the basics, pick one language, and work through the parts in order. By the time you finish, you will not just have used Foundry Local. You will have a practical foundation for building local AI applications with far more confidence and much better judgement.

Microsoft Mesh Education Licensing

abrooks1 — Thu, 26 Mar 2026 14:26:32 GMT

Microsoft Education and Product Teams,

I am writing to advocate for the inclusion of Microsoft Mesh (Immersive Spaces and Events) within the Microsoft 365 Education SKU family (A1, A3, and A5).

Currently, Mesh is available across nearly every commercial license family, from Teams Essentials to E5 Enterprise, but is explicitly excluded from Education tenants. As documented in several Learn Q&A threads and service plan manifests, the MESH_IMMERSIVE_FOR_TEAMS service plan is simply not provisioned for EDU customers.

The current state is one of silent exclusion, creating several critical hurdles:

Pedagogical: Immersive technology is one of the most requested features for remote and hybrid learning to combat "Zoom fatigue" and increase student engagement. Education is a high-value use case for 3D immersion.
Parity: Universities and K-12 institutions on A5 licenses pay for "top-tier" features but are denied the innovative tools available to a "Business Basic" user. If you are a small business on a basic plan, you have Mesh. If you are a world-class University on A5, you are blocked. This isn't a "procurable" add-on; it is a licensing eligibility wall.
Implementation: Current Microsoft guidance suggests schools move to Business or Enterprise licensing to access Mesh. This is not a viable solution for institutions with thousands of users, complex compliance requirements, and student-data privacy frameworks built specifically around EDU SKUs.

We aren’t asking for a discount; we are asking for eligibility. We urge the product team to:

Add the Mesh Immersive service plan to the A3 and A5 EDU license entitlements.
Provide a clear roadmap for when Education tenants can expect feature parity with Commercial tenants.

Education should be the vanguard of immersive collaboration, not an afterthought. We would appreciate a formal update on when this licensing barrier will be removed.

Unable to Setup Billing for new tenant account: Error code - 43881

ealawode — Thu, 26 Mar 2026 09:29:54 GMT

I set up a Microsoft 365 Education tenant for a school in Uganda but received error code 43881 during billing verification. The tenant was created but A1 trial licenses were not attached. I have no chat or email support options available in the admin center. Error code: 43881

Langchain Multi-Agent Systems with Microsoft Agent Framework and Hosted Agents

Lee_Stott — Thu, 26 Mar 2026 07:00:00 GMT

If you have been building AI agents with LangChain, you already know how powerful its tool and chain abstractions are. But when it comes to deploying those agents to production — with real infrastructure, managed identity, live web search, and container orchestration — you need something more.

This post walks through how to combine LangChain with the Microsoft Agent Framework (azure-ai-agents) and deploy the result as a Microsoft Foundry Hosted Agent. We will build a multi-agent incident triage copilot that uses LangChain locally and seamlessly upgrades to cloud-hosted capabilities on Microsoft Foundry.

Why combine LangChain with Microsoft Agent Framework?

As a LangChain developer, you get excellent abstractions for building agents: the @tool decorator, RunnableLambda chains, and composable pipelines. But production deployment raises questions that LangChain alone does not answer:

Where do your agents run? Containers, serverless, or managed infrastructure?
How do you add live web search or code execution? Bing Grounding and Code Interpreter are not LangChain built-ins.
How do you handle authentication? Managed identity, API keys, or tokens?
How do you observe agents in production? Distributed tracing across multiple agents?

The Microsoft Agent Framework fills these gaps. It provides AgentsClient for creating and managing agents on Microsoft Foundry, built-in tools like BingGroundingTool and CodeInterpreterTool, and a thread-based conversation model. Combined with Hosted Agents, you get a fully managed container runtime with health probes, auto-scaling, and the OpenAI Responses API protocol.

The key insight: LangChain handles local logic and chain composition; the Microsoft Agent Framework handles cloud-hosted orchestration and tooling.

Architecture overview

The incident triage copilot uses a coordinator pattern with three specialist agents:

User Query
    |
    v
Coordinator Agent
    |
    +--> LangChain Triage Chain    (routing decision)
    +--> LangChain Synthesis Chain  (combine results)
    |
    +---+---+---+
    |   |       |
    v   v       v
Research  Diagnostics  Remediation
 Agent      Agent        Agent

Each specialist agent has two execution modes:

Mode	LangChain Role	Microsoft Agent Framework Role
Local	`@tool` functions provide heuristic analysis	Not used
Foundry	Chains handle routing and synthesis	`AgentsClient` with `BingGroundingTool`, `CodeInterpreterTool`

This dual-mode design means you can develop and test locally with zero cloud dependencies, then deploy to Foundry for production capabilities.

Step 1: Define your LangChain tools

Start with what you know. Define typed, documented tools using LangChain’s @tool decorator:

from langchain_core.tools import tool @tool def classify_incident_severity(query: str) -> str: """Classify the severity and priority of an incident based on keywords. Args: query: The incident description text. Returns: Severity classification with priority level. """ query_lower = query.lower() critical_keywords = [ "production down", "all users", "outage", "breach", ] high_keywords = [ "503", "500", "timeout", "latency", "slow", ] if any(kw in query_lower for kw in critical_keywords): return "severity=critical, priority=P1" if any(kw in query_lower for kw in high_keywords): return "severity=high, priority=P2" return "severity=low, priority=P4"

These tools work identically in local mode and serve as fallbacks when Foundry is unavailable.

Step 2: Build routing with LangChain chains

Use RunnableLambda to create a routing chain that classifies the incident and selects which specialists to invoke:

from langchain_core.runnables import RunnableLambda from enum import Enum class AgentRole(str, Enum): RESEARCH = "research" DIAGNOSTICS = "diagnostics" REMEDIATION = "remediation" DIAGNOSTICS_KEYWORDS = { "log", "error", "exception", "timeout", "500", "503", "crash", "oom", "root cause", } REMEDIATION_KEYWORDS = { "fix", "remediate", "runbook", "rollback", "hotfix", "patch", "resolve", "action plan", } def _route(inputs: dict) -> dict: query = inputs["query"].lower() specialists = [AgentRole.RESEARCH] # always included if any(kw in query for kw in DIAGNOSTICS_KEYWORDS): specialists.append(AgentRole.DIAGNOSTICS) if any(kw in query for kw in REMEDIATION_KEYWORDS): specialists.append(AgentRole.REMEDIATION) return {**inputs, "specialists": specialists} triage_routing_chain = RunnableLambda(_route)

This is pure LangChain — no cloud dependency. The chain analyses the query and returns which specialists should handle it.

Step 3: Create specialist agents with dual-mode execution

Each specialist agent extends a base class. In local mode, it uses LangChain tools. In Foundry mode, it delegates to the Microsoft Agent Framework:

from abc import ABC, abstractmethod from pathlib import Path class BaseSpecialistAgent(ABC): role: AgentRole prompt_file: str def __init__(self): prompt_path = Path(__file__).parent.parent / "prompts" / self.prompt_file self.system_prompt = prompt_path.read_text(encoding="utf-8") async def run(self, query, shared_context, correlation_id, client=None): if client is not None: return await self._run_on_foundry(query, shared_context, correlation_id, client) return await self._run_locally(query, shared_context, correlation_id) async def _run_on_foundry(self, query, shared_context, correlation_id, client): """Use Microsoft Agent Framework for cloud-hosted execution.""" from azure.ai.agents.models import BingGroundingTool agent = await client.agents.create_agent( model=shared_context.get("model_deployment", "gpt-4o"), name=f"{self.role.value}-{correlation_id}", instructions=self.system_prompt, tools=self._get_foundry_tools(shared_context), ) thread = await client.agents.threads.create() await client.agents.messages.create( thread_id=thread.id, role="user", content=self._build_prompt(query, shared_context), ) run = await client.agents.runs.create_and_process( thread_id=thread.id, agent_id=agent.id, ) # Extract and return the agent’s response... async def _run_locally(self, query, shared_context, correlation_id): """Use LangChain tools for local heuristic analysis.""" # Each subclass implements this with its specific tools ...

The key pattern here: same interface, different backends. Your coordinator does not care whether a specialist ran locally or on Foundry.

Step 4: Wire it up with FastAPI

Expose the multi-agent pipeline through a FastAPI endpoint. The /triage endpoint accepts incident descriptions and returns structured reports:

from fastapi import FastAPI from agents.coordinator import Coordinator from models import TriageRequest app = FastAPI(title="Incident Triage Copilot") coordinator = Coordinator() @app.post("/triage") async def triage(request: TriageRequest): return await coordinator.triage( request=request, client=app.state.foundry_client, max_turns=10, )

The application also implements the /responses endpoint, which follows the OpenAI Responses API protocol. This is what Microsoft Foundry Hosted Agents expects when routing traffic to your container.

Step 5: Deploy as a Hosted Agent

This is where Microsoft Foundry Hosted Agents shines. Your multi-agent system becomes a managed, auto-scaling service with a single command:

# Install the azd AI agent extension azd extension install azure.ai.agents # Provision infrastructure and deploy azd up

The Azure Developer CLI (azd) provisions everything:

Azure Container Registry for your Docker image
Container App with health probes and auto-scaling
User-Assigned Managed Identity for secure authentication
Microsoft Foundry Hub and Project with model deployments
Application Insights for distributed tracing

Your agent.yaml defines what tools the hosted agent has access to:

name: incident-triage-copilot-langchain kind: hosted model: deployment: gpt-4o identity: type: managed tools: - type: bing_grounding enabled: true - type: code_interpreter enabled: true

What you gain over pure LangChain

Capability	LangChain Only	LangChain + Microsoft Agent Framework
Local development	Yes	Yes (identical experience)
Live web search	Requires custom integration	Built-in `BingGroundingTool`
Code execution	Requires sandboxing	Built-in `CodeInterpreterTool`
Managed hosting	DIY containers	Foundry Hosted Agents
Authentication	DIY	Managed Identity (zero secrets)
Observability	DIY	OpenTelemetry + Application Insights
One-command deploy	No	`azd up`

Testing locally

The dual-mode architecture means you can test the full pipeline without any cloud resources:

# Create virtual environment and install dependencies python -m venv .venv source .venv/bin/activate pip install -r requirements.txt # Run locally (agents use LangChain tools) python -m src

Then open http://localhost:8080 in your browser to use the built-in web UI, or call the API directly:

curl -X POST http://localhost:8080/triage \ -H "Content-Type: application/json" \ -d '{"message": "Getting 503 errors on /api/orders since 2pm"}'

The response includes a coordinator summary, specialist results with confidence scores, and the tools each agent used.

Running the tests

The project includes a comprehensive test suite covering routing logic, tool behaviour, agent execution, and HTTP endpoints:

curl -X POST http://localhost:8080/triage \ -H "Content-Type: application/json" \ -d '{"message": "Getting 503 errors on /api/orders since 2pm"}'

Tests run entirely in local mode, so no cloud credentials are needed.

Key takeaways for LangChain developers

Keep your LangChain abstractions. The @tool decorator, RunnableLambda chains, and composable pipelines all work exactly as you expect.
Add cloud capabilities incrementally. Start local, then enable Bing Grounding, Code Interpreter, and managed hosting when you are ready.
Use the dual-mode pattern. Every agent should work locally with LangChain tools and on Foundry with the Microsoft Agent Framework. This makes development fast and deployment seamless.
Let azd handle infrastructure. One command provisions everything: containers, identity, monitoring, and model deployments.
Security comes free. Managed Identity means no API keys in your code. Non-root containers, RBAC, and disabled ACR admin are all configured by default.

Get started

Clone the sample repository and try it yourself:

git clone https://github.com/leestott/hosted-agents-langchain-samples cd hosted-agents-langchain-samples python -m venv .venv && source .venv/bin/activate pip install -r requirements.txt python -m src

Open http://localhost:8080 to interact with the copilot through the web UI. When you are ready for production, run azd up and your multi-agent system is live on Microsoft Foundry.

Resources

Build an Offline Hybrid RAG Stack with ONNX and Foundry Local

Lee_Stott — Thu, 26 Mar 2026 07:00:00 GMT

If you are building local AI applications, basic retrieval augmented generation is often only the starting point. This sample shows a more practical pattern: combine lexical retrieval, ONNX based semantic embeddings, and a Foundry Local chat model so the assistant stays grounded, remains offline, and degrades cleanly when the semantic path is unavailable.

Why this sample is worth studying

Many local RAG samples rely on a single retrieval strategy. That is usually enough for a proof of concept, but it breaks down quickly in production. Exact keywords, acronyms, and document codes behave differently from natural language questions and paraphrased requests.

This repository keeps the original lexical retrieval path, adds local ONNX embeddings for semantic search, and fuses both signals in a hybrid ranking mode. The generation step runs through Foundry Local, so the entire assistant can remain on device.

Lexical mode handles exact terms and structured vocabulary.
Semantic mode handles paraphrases and more natural language phrasing.
Hybrid mode combines both and is usually the best default.
Lexical fallback protects the user experience if the embedding pipeline cannot start.

Architectural overview

The sample has two main flows: an offline ingestion pipeline and a local query pipeline.

The architecture splits cleanly into offline ingestion at the top and runtime query handling at the bottom.

Offline ingestion pipeline

Read Markdown files from docs/.
Parse front matter and split each document into overlapping chunks.
Generate dense embeddings when the ONNX model is available.
Store chunks in SQLite with both sparse lexical features and optional dense vectors.

Local query pipeline

The browser posts a question to the Express API.
ChatEngine resolves the requested retrieval mode.
VectorStore retrieves lexical, semantic, or hybrid results.
The prompt is assembled with the retrieved context and sent to a Foundry Local chat model.
The answer is returned with source references and retrieval metadata.

The sequence diagram shows the difference between lexical retrieval and hybrid retrieval. In hybrid mode, the query is embedded first, then lexical and semantic scores are fused before prompt assembly.

Repository structure and core components

The implementation is compact and readable. The main files to understand are listed below.

src/config.js: retrieval defaults, paths, and model settings.
src/embeddingEngine.js: local ONNX embedding generation through Transformers.js.
src/vectorStore.js: SQLite storage plus lexical, semantic, and hybrid ranking.
src/chatEngine.js: retrieval mode resolution, prompt assembly, and Foundry Local model execution.
src/ingest.js: document ingestion and embedding generation during indexing.
src/server.js: REST endpoints, streaming endpoints, upload support, and health reporting.

Getting started

To run the sample, you need Node.js 20 or newer, Foundry Local, and a local ONNX embedding model. The default model path is models/embeddings/bge-small-en-v1.5.

cd c:\Users\leestott\local-hybrid-retrival-onnx npm install huggingface-cli download BAAI/bge-small-en-v1.5 --local-dir models/embeddings/bge-small-en-v1.5 npm run ingest npm start

Ingestion writes the local SQLite database to data/rag.db. If the embedding model is available, each chunk gets a dense vector as well as lexical features. If the embedding model is missing, ingestion still succeeds and the application remains usable in lexical mode.

Best practice: local AI applications should treat model files, SQLite data, and native runtime compatibility as part of the deployable system, not as optional developer conveniences.

Code walkthrough

1. Retrieval configuration

The sample makes its retrieval behaviour explicit in configuration. That is useful for testing and for operator visibility.

export const config = { model: "phi-3.5-mini", docsDir: path.join(ROOT, "docs"), dbPath: path.join(ROOT, "data", "rag.db"), chunkSize: 200, chunkOverlap: 25, topK: 3, retrievalMode: process.env.RETRIEVAL_MODE || "hybrid", retrievalModes: ["lexical", "semantic", "hybrid"], fallbackRetrievalMode: "lexical", retrievalWeights: { lexical: 0.45, semantic: 0.55, }, };

Those defaults tell you a lot about the intended operating profile. Chunks are small, the number of returned chunks is low, and the fallback path is explicit.

2. Local ONNX embeddings

The embedding engine disables remote model loading and only uses local files. That matters for privacy, repeatability, and air gapped operation.

env.allowLocalModels = true; env.allowRemoteModels = false; this.extractor = await pipeline("feature-extraction", resolvedPath, { local_files_only: true, }); const output = await this.extractor(text, { pooling: "mean", normalize: true, });

The mean pooling and normalisation step make the vectors suitable for cosine similarity based ranking.

3. Hybrid storage and ranking in SQLite

Instead of adding a separate vector database, the sample stores lexical and semantic representations in the same SQLite table. That keeps the local footprint low and the implementation easy to debug.

searchHybrid(query, queryEmbedding, topK = 5, weights = { lexical: 0.45, semantic: 0.55 }) { const lexicalResults = this.searchLexical(query, topK * 3); const semanticResults = this.searchSemantic(queryEmbedding, topK * 3); if (semanticResults.length === 0) { return lexicalResults.slice(0, topK).map((row) => ({ ...row, retrievalMode: "lexical", })); } const fused = [...combined.values()].map((row) => ({ ...row, score: (row.lexicalScore * lexicalWeight) + (row.semanticScore * semanticWeight), })); fused.sort((a, b) => b.score - a.score); return fused.slice(0, topK); }

The important point is not just the weighted fusion. It is the fallback behaviour. If semantic retrieval cannot provide results, the user still gets lexical grounding instead of an empty context window.

4. Retrieval mode resolution in ChatEngine

ChatEngine keeps the runtime behaviour predictable. It validates the requested mode and falls back to lexical search when semantic retrieval is unavailable.

resolveRetrievalMode(requestedMode) { const desiredMode = config.retrievalModes.includes(requestedMode) ? requestedMode : config.retrievalMode; if ((desiredMode === "semantic" || desiredMode === "hybrid") && !this.semanticAvailable) { return config.fallbackRetrievalMode; } return desiredMode; }

This is a sensible production design because local runtime failures are common. Missing model files or native dependency mismatches should reduce quality, not crash the entire assistant.

5. Foundry Local model management

The sample uses FoundryLocalManager to discover, download, cache, and load the configured chat model.

This gives the app a better local startup experience. The server can expose a status stream while the model initialises in the background.

User experience and screenshots

The client is intentionally simple, which makes it useful during evaluation. You can switch retrieval mode, test questions quickly, and inspect the retrieved sources.

The landing page exposes retrieval mode directly in the UI. That makes it easy to compare lexical, semantic, and hybrid behaviour during testing.

The sources panel shows grounding evidence and retrieval scores, which is useful when validating whether better answers are coming from better retrieval or just model phrasing.

Best practices for ONNX RAG and Foundry Local

Keep lexical fallback alive. Exact identifiers and runtime failures both make this necessary.
Persist sparse and dense features together where possible. It simplifies debugging and operational reasoning.
Use small chunks and conservative topK values for local context budgets.
Expose health and status endpoints so users can see when the model is still loading or embeddings are unavailable.
Test retrieval quality separately from generation quality.
Pin and validate native runtime dependencies, especially ONNX Runtime, before tuning prompts.

Practical warning: this repository already shows why runtime validation matters. A local app can ingest documents successfully and still fail at model initialisation if the native runtime stack is misaligned.

How this compares with RAG and CAG

The strongest value in this sample comes from where it sits between a basic local RAG baseline and a curated CAG design.

Dimension	Classic local RAG	This hybrid ONNX RAG sample	CAG
Context assembly	Retrieve chunks at query time, often lexically, then inject them into the prompt.	Retrieve chunks at query time with lexical, semantic, or fused scoring, then inject the strongest results into the prompt.	Use a prepared or cached context pack instead of fresh retrieval for every request.
Main strength	Easy to implement and easy to explain.	Better recall for paraphrases without giving up exact match behaviour or offline execution.	Predictable prompts and low query time overhead.
Main weakness	Misses synonyms and natural language reformulations.	More moving parts, larger local asset footprint, and native runtime compatibility to manage.	Coverage depends on curation quality and goes stale more easily.
Failure behaviour	Weak retrieval leads to weak grounding.	Semantic failure can degrade to lexical retrieval if designed properly, which this sample does.	Prepared context can be too narrow for new or unexpected questions.
Best fit	Simple local assistants and proof of concept systems.	Offline copilots and technical assistants that need stronger recall across varied phrasing.	Stable workflows with tightly bounded, curated knowledge.

Samples

Related samples:

- Foundry Local RAG - https://github.com/leestott/local-rag

- Foundry Local CAG - https://github.com/leestott/local-cag

- Foundry Local hybrid-retrival-onnx https://github.com/leestott/local-hybrid-retrival-onnx

Specific benefits of this hybrid approach over classic RAG

It captures paraphrased questions that lexical search would often miss.
It still preserves exact match performance for codes, terms, and product names.
It gives operators a controlled degradation path when the semantic stack is unavailable.
It stays local and inspectable without introducing a separate hosted vector service.

Specific differences from CAG

CAG shifts effort into context curation before the request. This sample retrieves evidence dynamically at runtime.
CAG can be faster for fixed workflows, but it is usually less flexible when the document set changes.
This hybrid RAG design is better suited to open ended knowledge search and growing document collections.

What to validate before shipping

Measure retrieval quality in each mode using exact term, acronym, and paraphrase queries.
Check that sources shown in the UI reflect genuinely distinct evidence, not repeated chunks.
Confirm the application remains usable when semantic retrieval is unavailable.
Verify ONNX Runtime compatibility on the real target machines, not only on the development laptop.
Test model download, cache, and startup behaviour with a clean environment.

Final take

For developers getting started with ONNX RAG and Foundry Local, this sample is a good technical reference because it demonstrates a realistic local architecture rather than a minimal demo. It shows how to build a grounded assistant that remains offline, supports multiple retrieval modes, and fails gracefully.

Compared with classic local RAG, the hybrid design provides better recall and better resilience. Compared with CAG, it remains more flexible for changing document sets and less dependent on pre curated context packs. If you want a practical starting point for offline grounded AI on developer workstations or edge devices, this is the most balanced pattern in the repository set.

Advice for Startups: Build Without Waiting with Replit

StudentDeveloperTeam — Wed, 25 Mar 2026 15:45:33 GMT

Before a single feature is tested, founders often find themselves setting up environments, choosing frameworks, and trying to learn just enough to get started. What should be a quick step forward turns into unnecessary delay.

For teams in the Microsoft Imagine Cup, that time matters. They are building in real time, refining their solutions while balancing everything else that comes with being a student founder.

During the Builder Series, Horacio Lopez from Replit walked through how founders can move from idea to product faster, sharing a build approach centered on starting immediately and learning along the way.

Start Before You Feel Ready

Early-stage founders often wait longer than they need to. They look for the right tools, the right language, or the right structure before taking action.

That hesitation slows progress.

Today, that model is shifting. Founders can begin with an idea and start building right away, using their tools to explore, test, and refine in real time. Instead of preparing to build, they are building to learn.

That shift creates momentum early, when it matters most.

Turn Building into Your Learning Process

Software development has traditionally followed a sequence: learn first, then build.

That sequence is changing.

With AI-supported environments, founders can understand how things work while they are actively creating them. They can ask questions within the build process, adjust in real time, and recognize patterns as they go.

This shortens the distance between idea and execution. It also allows founders to expand their capabilities without needing to master everything upfront.

Reduce Friction Across the Process

Startups move quickly, and founders often shift between roles throughout the day. Product, development, and deployment are no longer separate phases. They are part of a continuous flow.

When those steps are spread across disconnected tools, progress slows.

By bringing the build process into one environment, Replit helps reduce that friction. Founders can stay focused, move faster, and spend more time refining their product instead of managing setup and transitions.

From Idea to Product

Building a product will always require effort. That part does not change.

What is changing is how quickly founders can move from concept to something real.

When the barrier to building is lowered, more ideas can be tested and improved. Founders are no longer limited by how much they know before they begin, but by how quickly they are willing to start.

For teams in the Imagine Cup and beyond, that shift is meaningful. Because progress is no longer defined by preparation.

It is defined by action.