developer
273 TopicsFrom Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs
As AI engineers, we’ve spent years optimizing models for the cloud, scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint it’s now a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance. Why Edge AI, Why Now? Edge AI isn’t just about running models locally, it’s about rethinking the entire lifecycle: - Latency: Decisions in milliseconds, not round-trips to the cloud. - Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance. - Resilience: Offline-first apps that don’t break when the network does. - Cost: Reduced cloud compute and bandwidth overhead. With Windows AI PCs powered by Intel and Qualcomm NPUs and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency. What You’ll Learn in Edge AI for Beginners The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment. Multi-Language Support This content is available in over 48 languages, so you can read and study in your native language. What You'll Master This course takes you from fundamental concepts to production-ready implementations, covering: Small Language Models (SLMs) optimized for edge deployment Hardware-aware optimization across diverse platforms Real-time inference with privacy-preserving capabilities Production deployment strategies for enterprise applications Why EdgeAI Matters Edge AI represents a paradigm shift that addresses critical modern challenges: Privacy & Security: Process sensitive data locally without cloud exposure Real-time Performance: Eliminate network latency for time-critical applications Cost Efficiency: Reduce bandwidth and cloud computing expenses Resilient Operations: Maintain functionality during network outages Regulatory Compliance: Meet data sovereignty requirements Edge AI Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making. Core Principles: On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs) Offline capability: Functions without persistent internet connectivity Low latency: Immediate responses suited for real-time systems Data sovereignty: Keeps sensitive data local, improving security and compliance Small Language Models (SLMs) SLMs like Phi-4, Mistral-7B, Qwen and Gemma are optimized versions of larger LLMs, trained or distilled for: Reduced memory footprint: Efficient use of limited edge device memory Lower compute demand: Optimized for CPU and edge GPU performance Faster startup times: Quick initialization for responsive applications They unlock powerful NLP capabilities while meeting the constraints of: Embedded systems: IoT devices and industrial controllers Mobile devices: Smartphones and tablets with offline capabilities IoT Devices: Sensors and smart devices with limited resources Edge servers: Local processing units with limited GPU resources Personal Computers: Desktop and laptop deployment scenarios Course Modules & Navigation Course duration. 10 hours of content Module Topic Focus Area Key Content Level Duration 📖 00 Introduction to EdgeAI Foundation & Context EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives Beginner 1-2 hrs 📚 01 EdgeAI Fundamentals Cloud vs Edge AI comparison EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment Beginner 3-4 hrs 🧠 02 SLM Model Foundations Model families & architecture Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica Beginner 4-5 hrs 🚀 03 SLM Deployment Practice Local & cloud deployment Advanced Learning • Local Environment • Cloud Deployment Intermediate 4-5 hrs ⚙️ 04 Model Optimization Toolkit Cross-platform optimization Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis Intermediate 5-6 hrs 🔧 05 SLMOps Production Production operations SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment Advanced 5-6 hrs 🤖 06 AI Agents & Function Calling Agent frameworks & MCP Agent Introduction • Function Calling • Model Context Protocol Advanced 4-5 hrs 💻 07 Platform Implementation Cross-platform samples AI Toolkit • Foundry Local • Windows Development Advanced 3-4 hrs 🏭 08 Foundry Local Toolkit Production-ready samples Sample applications (see details below) Expert 8-10 hrs Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing. Developer Highlights - 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration. - 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU. - 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps. - 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference. Local AI: Beyond the Edge Local AI isn’t just about inference, it’s about autonomy. Imagine agents that: - Learn from local context - Adapt to user behavior - Respect privacy by design With tools like Agent Framework, Azure AI Foundry and Windows Copilot Studio, and Foundry Local developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency. Try It Yourself Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or IoT devices Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.Leverage AI for faster, more productive coding with GitHub Copilot
GitHub Copilot serves as an invaluable learning tool for developers, especially those who are still learning a particular programming language or framework. By providing context-aware suggestions, it helps learners understand the syntax, structure, and logic behind different code snippets. While fundamental knowledge of programming concepts and syntax services should remain a prerequisite, a developer exploring a new framework would be able to describe the functionality they want to implement, and GitHub Copilot will generate code that demonstrates how to achieve it. This accelerates the learning process and empowers developers to gain proficiency in new technologies more efficiently.AI Upskilling Framework Level 3 Building
The Global AI Community is excited to bring you the latest updates on AI Upskilling Framework Level 3 Building, straight from Microsoft Ignite! This session dives deep into advanced concepts for building agentic workflows and showcases new announcements that will help developers accelerate their Agentic AI journey.On‑Device AI with Windows AI Foundry and Foundry Local
From “waiting” to “instant”- without sending data away AI is everywhere, but speed, privacy, and reliability are critical. Users expect instant answers without compromise. On-device AI makes that possible: fast, private and available, even when the network isn’t - empowering apps to deliver seamless experiences. Imagine an intelligent assistant that works in seconds, without sending a text to the cloud. This approach brings speed and data control to the places that need it most; while still letting you tap into cloud power when it makes sense. Windows AI Foundry: A Local Home for Models Windows AI Foundry is a developer toolkit that makes it simple to run AI models directly on Windows devices. It uses ONNX Runtime under the hood and can leverage CPU, GPU (via DirectML), or NPU acceleration, without requiring you to manage those details. The principle is straightforward: Keep the model and the data on the same device. Inference becomes faster, and data stays local by default unless you explicitly choose to use the cloud. Foundry Local Foundry Local is the engine that powers this experience. Think of it as local AI runtime - fast, private, and easy to integrate into an app. Why Adopt On‑Device AI? Faster, more responsive apps: Local inference often reduces perceived latency and improves user experience. Privacy‑first by design: Keep sensitive data on the device; avoid cloud round trips unless the user opts in. Offline capability: An app can provide AI features even without a network connection. Cost control: Reduce cloud compute and data costs for common, high‑volume tasks. This approach is especially useful in regulated industries, field‑work tools, and any app where users expect quick, on‑device responses. Hybrid Pattern for Real Apps On-device AI doesn’t replace the cloud, it complements it. Here’s how: Standalone On‑Device: Quick, private actions like document summarization, local search, and offline assistants. Cloud‑Enhanced (Optional): Large-context models, up-to-date knowledge, or heavy multimodal workloads. Design an app to keep data local by default and surface cloud options transparently with user consent and clear disclosures. Windows AI Foundry supports hybrid workflows: Use Foundry Local for real-time inference. Sync with Azure AI services for model updates, telemetry, and advanced analytics. Implement fallback strategies for resource-intensive scenarios. Application Workflow Code Example using Foundry Local: 1. Only On-Device: Tries Foundry Local first, falls back to ONNX if foundry_runtime.check_foundry_available(): # Use on-device Foundry Local models try: answer = foundry_runtime.run_inference(question, context) return answer, source="Foundry Local (On-Device)" except Exception as e: logger.warning(f"Foundry failed: {e}, trying ONNX...") if onnx_model.is_loaded(): # Fallback to local BERT ONNX model try: answer = bert_model.get_answer(question, context) return answer, source="BERT ONNX (On-Device)" except Exception as e: logger.warning(f"ONNX failed: {e}") return "Error: No local AI available" 2. Hybrid approach: On-device first, cloud as last resort def get_answer(question, context): """ Priority order: 1. Foundry Local (best: advanced + private) 2. ONNX Runtime (good: fast + private) 3. Cloud API (fallback: requires internet, less private) # in case of Hybrid approach, based on real-time scenario """ if foundry_runtime.check_foundry_available(): # Use on-device Foundry Local models try: answer = foundry_runtime.run_inference(question, context) return answer, source="Foundry Local (On-Device)" except Exception as e: logger.warning(f"Foundry failed: {e}, trying ONNX...") if onnx_model.is_loaded(): # Fallback to local BERT ONNX model try: answer = bert_model.get_answer(question, context) return answer, source="BERT ONNX (On-Device)" except Exception as e: logger.warning(f"ONNX failed: {e}, trying cloud...") # Last resort: Cloud API (requires internet) if network_available(): try: import requests response = requests.post( '{BASE_URL_AI_CHAT_COMPLETION}', headers={'Authorization': f'Bearer {API_KEY}'}, json={ 'model': '{MODEL_NAME}', 'messages': [{ 'role': 'user', 'content': f'Context: {context}\n\nQuestion: {question}' }] }, timeout=10 ) answer = response.json()['choices'][0]['message']['content'] return answer, source="Cloud API (Online)" except Exception as e: return "Error: No AI runtime available", source="Failed" else: return "Error: No internet and no local AI available", source="Offline" Demo Project Output: Foundry Local answering context-based questions offline : The Foundry Local engine ran the Phi-4-mini model offline and retrieved context-based data. : The Foundry Local engine ran the Phi-4-mini model offline and mentioned that there is no answer. Practical Use Cases Privacy-First Reading Assistant: Summarize documents locally without sending text to the cloud. Healthcare Apps: Analyze medical data on-device for compliance. Financial Tools: Risk scoring without exposing sensitive financial data. IoT & Edge Devices: Real-time anomaly detection without network dependency. Conclusion On-device AI isn’t just a trend - it’s a shift toward smarter, faster, and more secure applications. With Windows AI Foundry and Foundry Local, developers can deliver experiences that respect user specific data, reduce latency, and work even when connectivity fails. By combining local inference with optional cloud enhancements, you get the best of both worlds: instant performance and scalable intelligence. Whether you’re creating document summarizers, offline assistants, or compliance-ready solutions, this approach ensures your apps stay responsive, reliable, and user-centric. References Get started with Foundry Local - Foundry Local | Microsoft Learn What is Windows AI Foundry? | Microsoft Learn https://devblogs.microsoft.com/foundry/unlock-instant-on-device-ai-with-foundry-local/Background tasks in .NET
What is a Background Task? A background task (or background service) is work that runs behind the scenes in an application without blocking the main user flow and often without direct user interaction. Think of it as a worker or helper that performs tasks independently while the main app continues doing other things. Problem Statement - What do you do when your downstream API is flaky or sometimes down for hours or even days , yet your UI and main API must stay responsive? Solution - This is a very common architecture problem in enterprise systems, and .NET gives us excellent tools to solve it cleanly: BackgroundService and exponential backoff retry logic. In this article, I’ll walk you through: A real production-like use case The architecture needed to make it reliable Why exponential backoff matters How to build a robust BackgroundService A full working code example The Use Case You have two APIs: API 1 : called frequently by the UI (hundreds or thousands of times). API 2 : a downstream system you must call, but it is known to be unstable, slow, or completely offline for long periods. If API 1 directly calls API 2: * Users experience lag * API 1 becomes slow or unusable * You overload API 2 with retries * Calls fail when API 2 is offline * You lose data What do we do then? Here goes the solution The Architecture Instead of calling API 2 synchronously, API 1 simply stores the intended call, and returns immediately. A BackgroundService will later: Poll for pending jobs Call API 2 Retry with exponential backoff if API 2 is still unavailable Mark jobs as completed when successful This creates a resilient, smooth, non-blocking system. Why Exponential Backoff? When a downstream API is completely offline, retrying every 1–5 seconds is disastrous: It wastes CPU and bandwidth It floods logs It overloads API 2 when it comes back online It burns resources Exponential backoff solves this. Examples retry delays: Retry 1 → 2 sec Retry 2 → 4 sec Retry 3 → 8 sec Retry 4 → 16 sec Retry 5 → 32 sec Retry 6 → 64 sec (Max delay capped at 5 minutes) This gives the system room to breathe. Complete Working Example (Using In-Memory Store) 1. The Model public class PendingJob { public Guid Id { get; set; } = Guid.NewGuid(); public string Payload { get; set; } = string.Empty; public int RetryCount { get; set; } = 0; public DateTime NextRetryTime { get; set; } = DateTime.UtcNow; public bool Completed { get; set; } = false; } 2. The In-Memory Store public interface IPendingJobStore { Task AddJobAsync(string payload); Task<List<PendingJob>> GetExecutableJobsAsync(); Task MarkJobAsCompletedAsync(Guid jobId); Task UpdateJobAsync(PendingJob job); } public class InMemoryPendingJobStore : IPendingJobStore { private readonly List<PendingJob> _jobs = new(); private readonly object _lock = new(); public Task AddJobAsync(string payload) { lock (_lock) { _jobs.Add(new PendingJob { Payload = payload, RetryCount = 0, NextRetryTime = DateTime.UtcNow }); } return Task.CompletedTask; } public Task<List<PendingJob>> GetExecutableJobsAsync() { lock (_lock) { return Task.FromResult(_jobs.Where(j => !j.Completed && j.NextRetryTime <= DateTime.UtcNow).ToList()); } } public Task MarkJobAsCompletedAsync(Guid jobId) { lock (_lock) { var job = _jobs.FirstOrDefault(j => j.Id == jobId); if (job != null) job.Completed = true; } return Task.CompletedTask; } public Task UpdateJobAsync(PendingJob job) => Task.CompletedTask; } 3. The BackgroundService with Exponential Backoff using System.Text; public class Api2RetryService : BackgroundService { private readonly IHttpClientFactory _clientFactory; private readonly IPendingJobStore _store; private readonly ILogger<Api2RetryService> _logger; public Api2RetryService(IHttpClientFactory clientFactory, IPendingJobStore store, ILogger<Api2RetryService> logger) { _clientFactory = clientFactory; _store = store; _logger = logger; } protected override async Task ExecuteAsync(CancellationToken stoppingToken) { _logger.LogInformation("Background retry service started."); while (!stoppingToken.IsCancellationRequested) { var jobs = await _store.GetExecutableJobsAsync(); foreach (var job in jobs) { var client = _clientFactory.CreateClient("api2"); try { var response = await client.PostAsync("/simulate", new StringContent(job.Payload, Encoding.UTF8, "application/json"), stoppingToken); if (response.IsSuccessStatusCode) { _logger.LogInformation("Job {JobId} processed successfully.", job.Id); await _store.MarkJobAsCompletedAsync(job.Id); } else { await HandleFailure(job); } } catch (Exception ex) { _logger.LogError(ex, "Error calling API 2."); await HandleFailure(job); } } await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken); } } private async Task HandleFailure(PendingJob job) { job.RetryCount++; var delay = CalculateBackoff(job.RetryCount); job.NextRetryTime = DateTime.UtcNow.Add(delay); await _store.UpdateJobAsync(job); _logger.LogWarning("Retrying job {JobId} in {Delay}. RetryCount={RetryCount}", job.Id, delay, job.RetryCount); } private TimeSpan CalculateBackoff(int retryCount) { var seconds = Math.Pow(2, retryCount); var maxSeconds = TimeSpan.FromMinutes(5).TotalSeconds; return TimeSpan.FromSeconds(Math.Min(seconds, maxSeconds)); } } 4. The API 1 — Public Endpoint using System.Runtime.InteropServices; using System.Text.Json; [ApiController] [Route("api1")] public class Api1Controller : ControllerBase { private readonly IPendingJobStore _store; private readonly ILogger<Api1Controller> _logger; public Api1Controller(IPendingJobStore store, ILogger<Api1Controller> logger) { _store = store; _logger = logger; } [HttpPost("process")] public async Task<IActionResult> Process([FromBody] object data) { var payload = JsonSerializer.Serialize(data); await _store.AddJobAsync(payload); _logger.LogInformation("Stored job for background processing."); return Ok("Request received. Will process when API 2 recovers."); } } 5. The API 2 (Simulating Downtime) using System.Runtime.InteropServices; [ApiController][Route("api2")] public class Api2Controller: ControllerBase { private static bool shouldFail = true; [HttpPost("simulate")] public IActionResult Simulate([FromBody] object payload) { if (shouldFail) return StatusCode(503, "API 2 is down"); return Ok("API 2 processed payload"); } [HttpPost("toggle")] public IActionResult Toggle() { shouldFail = !shouldFail; return Ok($"API 2 failure mode = {shouldFail}"); } } 6. The Program.cs var builder = WebApplication.CreateBuilder(args); builder.Services.AddControllers(); builder.Services.AddSingleton<IPendingJobStore, InMemoryPendingJobStore>(); builder.Services.AddHttpClient("api2", c => { c.BaseAddress = new Uri("http://localhost:5000/api2"); }); builder.Services.AddHostedService<Api2RetryService>(); var app = builder.Build(); app.MapControllers(); app.Run(); Testing the Whole Flow #1 API 2 starts in failure mode All attempts will fail and exponential backoff kicks in. #2 Send a request to API 1 POST /api1/process { "name": "hello" } Job is stored. #3 Watch logs You’ll see: Retrying job in 2 seconds... Retrying job in 4 seconds... Retrying job in 8 seconds... ... #4 Bring API 2 back online: POST /api2/toggle Next retry will succeed: Job {id} processed successfully. Conclusion This pattern is extremely powerful for: Payment gateways ERP integrations Long-running partner APIs Unstable third-party services Internal microservices that spike or go offline References Background tasks with hosted services in ASP.NET CoreAzure Skilling at Microsoft Ignite 2025
The energy at Microsoft Ignite was unmistakable. Developers, architects, and technical decision-makers converged in San Francisco to explore the latest innovations in cloud technology, AI applications, and data platforms. Beyond the keynotes and product announcements was something even more valuable: an integrated skilling ecosystem designed to transform how you build with Azure. This year Azure Skilling at Microsoft Ignite 2025 brought together distinct learning experiences, over 150+ hands-on labs, and multiple pathways to industry-recognized credentials—all designed to help you master skills that matter most in today's AI-driven cloud landscape. Just Launched at Ignite Microsoft Ignite 2025 offered an exceptional array of learning opportunities, each designed to meet developers anywhere on the skilling journey. Whether you joined us in-person or on-demand in the virtual experience, multiple touchpoints are available to deepen your Azure expertise. Ignite 2025 is in the books, but you can still engage with the latest Microsoft skilling opportunities, including: The Azure Skills Challenge provides a gamified learning experience that lets you compete while completing task-based achievements across Azure's most critical technologies. These challenges aren't just about badges and bragging rights—they're carefully designed to help you advance technical skills and prepare for Microsoft role-based certifications. The competitive element adds urgency and motivation, turning learning into an engaging race against the clock and your peers. For those seeking structured guidance, Plans on Learn offer curated sets of content designed to help you achieve specific learning outcomes. These carefully assembled learning journeys include built-in milestones, progress tracking, and optional email reminders to keep you on track. Each plan represents 12-15 hours of focused learning, taking you from concept to capability in areas like AI application development, data platform modernization, or infrastructure optimization. The Microsoft Reactor Azure Skilling Series, running December 3-11, brings skilling to life through engaging video content, mixing regular programming with special Ignite-specific episodes. This series will deliver technical readiness and programming guidance in a livestream presentation that's more digestible than traditional documentation. Whether you're catching episodes live with interactive Q&A or watching on-demand later, you’ll get world-class instruction that makes complex topics approachable. Beyond Ignite: Your Continuous Learning Journey Here's the critical insight that separates Ignite attendees who transform their careers from those who simply collect swag: the real learning begins after the event ends. Microsoft Ignite is your launchpad, not your destination. Every module you start, every lab you complete, and every challenge you tackle connects to a comprehensive learning ecosystem on Microsoft Learn that's available 24/7, 365 days a year. Think of Ignite as your intensive immersion experience—the moment when you gain context, build momentum, and identify the skills that will have the biggest impact on your work. What you do in the weeks and months following determines whether that momentum compounds into career-defining expertise or dissipates into business as usual. For those targeting career advancement through formal credentials, Microsoft Certifications, Applied Skills and AI Skills Navigator, provide globally recognized validation of your expertise. Applied Skills focus on scenario-based competencies, demonstrating that you can build and deploy solutions, not simply answer theoretical questions. Certifications cover role-based scenarios for developers, data engineers, AI engineers, and solution architects. The assessment experiences include performance-based testing in dedicated Azure tenants where you complete real configuration and development tasks. And finally, the NEW AI Skills Navigator is an agentic learning space, bringing together AI-powered skilling experiences and credentials in a single, unified experience with Microsoft, LinkedIn Learning and GitHub – all in one spot Why This Matters: The Competitive Context The cloud skills race is intensifying. While our competitors offer robust training and content, Microsoft's differentiation comes not from having more content—though our 1.4 million module completions last fiscal year and 35,000+ certifications awarded speak to scale—but from integration of services to orchestrate workflows. Only Microsoft offers a truly unified ecosystem where GitHub Copilot accelerates your development, Azure AI services power your applications, and Azure platform services deploy and scale your solutions—all backed by integrated skilling content that teaches you to maximize this connected experience. When you continue your learning journey after Ignite, you're not just accumulating technical knowledge. You're developing fluency in an integrated development environment that no competitor can replicate. You're learning to leverage AI-powered development tools, cloud-native architectures, and enterprise-grade security in ways that compound each other's value. This unified expertise is what transforms individual developers into force-multipliers for their organizations. Start Now, Build Momentum, Never Stop Microsoft Ignite 2025 offered the chance to compress months of learning into days of intensive, hands-on experience, but you can still take part through the on-demand videos, the Global Ignite Skills Challenge, visiting the GitHub repos for the /Ignite25 labs, the Reactor Azure Skilling Series, and the curated Plans on Learn provide multiple entry points regardless of your current skill level or preferred learning style. But remember: the developers who extract the most value from Ignite are those who treat the event as the beginning, not the culmination, of their learning journey. They join hackathons, contribute to GitHub repositories, and engage with the Azure community on Discord and technical forums. The question isn't whether you'll learn something valuable from Microsoft Ignite 2025-that's guaranteed. The question is whether you'll convert that learning into sustained momentum that compounds over months and years into career-defining expertise. The ecosystem is here. The content is ready. Your skilling journey doesn't end when Ignite does—it accelerates.3.2KViews0likes0CommentsUnderstanding Small Language Modes
In Part 1, we discussed the differences between Large Language Models (LLMs) and Small Language Models (SLMs), highlighting how SLMs are transforming edge computing. Unlike LLMs, which rely on massive cloud infrastructure, SLMs can run directly on local devices like smartphones and IoT systems, offering faster performance, better privacy, and lower energy use. In this second part, we dive into how SLMs work, exploring their technical foundations and design principles that make them ideal for edge environments.🚀 Mission Agent Possible: Your Chance to Build, Solve, and Win at Microsoft Ignite 2025!
🔍 What’s Mission Agent Possible? It’s a contest designed for developers who love building intelligent solutions. Your mission: Create an AI Agent Solve a simulated crisis Showcase your skills to the world And yes—there are prizes! Top prizes like an Xbox and hundreds of dollars in Microsoft Store credit are reserved for in-person attendees. Global participants can still win recognition and a chance to be featured on the Model Mondays podcast. 👉 Contest details: https://aka.ms/ignite25/mission-agent 🧠 How Do You Choose the Right AI Model? Model selection is critical for building an effective agent. To help you succeed, check out our Model Selection Adventure blog: Learn how to identify the right problem Explore model strengths and trade-offs Test outputs using GitHub Models Playground This guide will give your agent the competitive edge it needs. 📖 Read more: https://aka.ms/models-blog ✅ Why Join? Showcase your skills to a global audience Learn hands-on techniques for AI agent development Win prizes and earn recognition 🔗 Ready to Accept the Mission? Don’t wait—start preparing now! The contest officially kicks off on November 18 and closes on November 20 (PST): 👉 https://aka.ms/ignite25/mission-agent 👉 https://aka.ms/models-blog Follow the conversation: https://aka.ms/ignite25/agent-contest/discord Share your progress on social with #MissionAgentPossible Please read through the eligibility guidance.Simplifying Microservice Reliability with Dapr
What is Dapr? Dapr is an open-source runtime developed by Microsoft that is used in building resilient, event-driven, and portable applications. It works using the sidecar pattern, meaning every microservice gets a small companion container — the Dapr sidecar — which handles communication, retries, secrets, state, and more. What is Sidecar ? A sidecar is a helper process that runs beside your app, handling system tasks so your code can focus on business logic. Lets see some offerings from Dapr along with examples. #1 . Bindings Connects your app to external systems (like queues, email, or storage) with zero SDK or protocol handling. Without Dapr ❌ var httpClient = new HttpClient(); await httpClient.PostAsJsonAsync("https://api.sendgrid.com/send", email); * Manage HTTP endpoints & credentials * Change provider → rewrite logic With Dapr ✅ await daprClient.InvokeBindingAsync("send-email", "create", email); * One call, no SDK * Replace SendGrid → SMTP → Twilio just by editing config * No code change, no redeploy How to enable binding in for a Azure Container App Open Azure Portal → go to your Container App Environment. From the left pane, click on Container Apps, and choose your desired app (e.g., orders-api). In the Settings section, select Dapr. Enable Dapr toggle → switch it ON. Provide the basic Dapr settings: App ID: A unique name (e.g., orders-app). App Port: The internal port your API listens on (e.g., 8080). App Protocol: Choose HTTP or gRPC (usually HTTP). Click Save to apply. Now, under the same Container App Environment, go to Dapr Components. Click Create → select Binding → choose the type of binding (e.g., azure.storagequeues). #2 . Configuration Centralizes app settings, allowing live configuration updates without redeploying services. Without Dapr ❌ var featureFlag = Configuration["FeatureX"]; * Requires redeploys for every config change * No centralized versioning or dynamic update With Dapr ✅ var config = await daprClient.GetConfiguration("appconfigstore", new[] { "FeatureX" }); * Use Azure App Config, Consul, or any provider * Centralized updates — no redeploys * Consistent access via Dapr SDK How to enable configuration in for a Azure Container App Open Azure Portal → go to your Container App Environment. From the left pane, click on Container Apps, and choose your desired app (e.g., orders-api). In the Settings section, select Dapr. Enable Dapr toggle → switch it ON. Provide the basic Dapr settings: App ID: A unique name (e.g., orders-app). App Port: The internal port your API listens on (e.g., 8080). App Protocol: Choose HTTP or gRPC (usually HTTP). Click Save to apply. Now, under the same Container App Environment, go to Dapr Components. Click Create → select Configuration→ choose the type of configuration (e.g., configuration.azure.appconfig). #3 . Pub/Sub Enables event-driven communication between microservices without needing to know each other's endpoints. Without Dapr ❌ var client = new ServiceBusClient("<connection-string>"); var sender = client.CreateSender("order-topic"); await sender.SendMessageAsync(new ServiceBusMessage(orderJson)); * Tied to Azure Service Bus * Must manage SDKs, connections, retries * Hard to switch to another broker (Kafka, RabbitMQ) With Dapr ✅ await daprClient.PublishEventAsync("pubsub", "order-created", order); * pubsub component defined in YAML (can be Kafka, Redis Streams, etc.) * No SDK, no broker dependency * Just publish the event — Dapr handles transport & retries How to enable pub/sub in for a Azure Container App Open Azure Portal → go to your Container App Environment. From the left pane, click on Container Apps, and choose your desired app (e.g., orders-api). In the Settings section, select Dapr. Enable Dapr toggle → switch it ON. Provide the basic Dapr settings: App ID: A unique name (e.g., orders-app). App Port: The internal port your API listens on (e.g., 8080). App Protocol: Choose HTTP or gRPC (usually HTTP). Click Save to apply. Now, under the same Container App Environment, go to Dapr Components. Click Create → select Pub/Sub→ choose the type of configuration (e.g., pubsub.azure.servicebus.topics). #4 . Secret Stores Securely retrieves credentials and secrets from vaults, keeping them out of configs and code. Without Dapr ❌ var connString = Configuration["ConnectionStrings:DB"]; * Secrets stored in configs or env vars * Risk of leaks and manual rotation With Dapr ✅ var secret = await daprClient.GetSecretAsync("vault", "dbConnection"); * Fetch directly from Azure Key Vault, AWS Secrets, etc. * No secrets in configs * Secure by default, consistent across services How to enable Secret Stores in for a Azure Container App Open Azure Portal → go to your Container App Environment. From the left pane, click on Container Apps, and choose your desired app (e.g., orders-api). In the Settings section, select Dapr. Enable Dapr toggle → switch it ON. Provide the basic Dapr settings: App ID: A unique name (e.g., orders-app). App Port: The internal port your API listens on (e.g., 8080). App Protocol: Choose HTTP or gRPC (usually HTTP). Click Save to apply. Now, under the same Container App Environment, go to Dapr Components. Click Create → select Secret stores→ choose the type of configuration (e.g., secretstores.azure.keyvault). #5 . State Provides a consistent way to store and retrieve application data across services using a simple API. Without Dapr ❌ var cosmosClient = new CosmosClient(connStr); var container = cosmosClient.GetContainer("db", "state"); await container.UpsertItemAsync(order); * Direct dependency on Cosmos DB * Manual retry logic * Tight coupling to storage type With Dapr ✅ await daprClient.SaveStateAsync("statestore", "order-101", order); var data = await daprClient.GetStateAsync<Order>("statestore", "order-101"); * Plug any backend (Redis, Cosmos, PostgreSQL) * Dapr handles retries and consistency * Same code, different backend — total flexibility How to enable State in for a Azure Container App Open Azure Portal → go to your Container App Environment. From the left pane, click on Container Apps, and choose your desired app (e.g., orders-api). In the Settings section, select Dapr. Enable Dapr toggle → switch it ON. Provide the basic Dapr settings: App ID: A unique name (e.g., orders-app). App Port: The internal port your API listens on (e.g., 8080). App Protocol: Choose HTTP or gRPC (usually HTTP). Click Save to apply. Now, under the same Container App Environment, go to Dapr Components. Click Create → select State → choose the type of configuration (e.g., state.azure.cosmosdb). 🧩 Summary Think of Dapr as your invisible co-pilot for building distributed apps. It abstracts away all the repetitive plumbing — state management, pub/sub messaging, secret handling, and external bindings — letting you focus on writing features that matter. With Dapr, you don’t just write code that runs locally; you write code that just works across clouds, containers, and environments, without having to worry about wiring up retries, event delivery, or service-to-service communication manually. 🧰 Demo Source Code I've prepared complete sample on .Net core that touches all major Dapr features: * State Store * Pub/Sub * Bindings * Configuration * Secret Store You can explore it from Github-Dapr-Api Clone, run locally, and experiment — the project uses in-memory storage to keep things lightweight for testing and learning. 📚 References for Deep Dive Official Dapr Docs Dapr for .NET Developers — Microsoft Learn Dapr .NET SDK GitHubUnderstanding Small Language Modes
Small Language Models (SLMs) bring AI from the cloud to your device. Unlike Large Language Models that require massive compute and energy, SLMs run locally, offering speed, privacy, and efficiency. They’re ideal for edge applications like mobile, robotics, and IoT.