On‑Device AI with Windows AI Foundry and Foundry Local
From "waiting" to "instant", without sending data away.

AI is everywhere, but speed, privacy, and reliability are critical. Users expect instant answers without compromise. On-device AI makes that possible: fast, private, and available even when the network isn't, empowering apps to deliver seamless experiences. Imagine an intelligent assistant that responds in seconds without sending text to the cloud. This approach brings speed and data control to the places that need it most, while still letting you tap into cloud power when it makes sense.

Windows AI Foundry: A Local Home for Models

Windows AI Foundry is a developer toolkit that makes it simple to run AI models directly on Windows devices. It uses ONNX Runtime under the hood and can leverage CPU, GPU (via DirectML), or NPU acceleration, without requiring you to manage those details. The principle is straightforward: keep the model and the data on the same device. Inference becomes faster, and data stays local by default unless you explicitly choose to use the cloud.

Foundry Local

Foundry Local is the engine that powers this experience. Think of it as a local AI runtime: fast, private, and easy to integrate into an app. The short sketch below shows what that integration can look like.
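For a feel of the developer experience, here is a minimal sketch of calling a model through Foundry Local's OpenAI-compatible endpoint. It assumes the Foundry Local Python SDK (`foundry-local-sdk`) and the `openai` package; the class and property names (`FoundryLocalManager`, `endpoint`, `api_key`) reflect the documentation linked in the references and may evolve, so treat this as illustrative rather than definitive.

```python
# pip install foundry-local-sdk openai
# Sketch only: names follow the "Get started with Foundry Local" docs
# referenced below and may change between releases.
import openai
from foundry_local import FoundryLocalManager

alias = "phi-4-mini"
manager = FoundryLocalManager(alias)   # starts the local service and loads the model

client = openai.OpenAI(
    base_url=manager.endpoint,         # local, OpenAI-compatible endpoint
    api_key=manager.api_key,           # placeholder key; nothing leaves the device
)

response = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[{"role": "user", "content": "Summarize this document in two sentences."}],
)
print(response.choices[0].message.content)
```

Because the endpoint speaks the OpenAI wire format, the same client code can later be pointed at a cloud deployment, which is exactly what makes the hybrid pattern below practical.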
Why Adopt On-Device AI?

- Faster, more responsive apps: Local inference often reduces perceived latency and improves user experience.
- Privacy-first by design: Keep sensitive data on the device; avoid cloud round trips unless the user opts in.
- Offline capability: An app can provide AI features even without a network connection.
- Cost control: Reduce cloud compute and data costs for common, high-volume tasks.

This approach is especially useful in regulated industries, field-work tools, and any app where users expect quick, on-device responses.

Hybrid Pattern for Real Apps

On-device AI doesn't replace the cloud; it complements it. Here's how:

- Standalone on-device: Quick, private actions like document summarization, local search, and offline assistants.
- Cloud-enhanced (optional): Large-context models, up-to-date knowledge, or heavy multimodal workloads.

Design an app to keep data local by default and surface cloud options transparently, with user consent and clear disclosures. Windows AI Foundry supports hybrid workflows: use Foundry Local for real-time inference, sync with Azure AI services for model updates, telemetry, and advanced analytics, and implement fallback strategies for resource-intensive scenarios.

Application Workflow Code Example using Foundry Local

1. On-device only: try Foundry Local first, fall back to a local ONNX model.

```python
def get_local_answer(question, context):
    if foundry_runtime.check_foundry_available():
        # Use on-device Foundry Local models
        try:
            answer = foundry_runtime.run_inference(question, context)
            return answer, "Foundry Local (On-Device)"
        except Exception as e:
            logger.warning(f"Foundry failed: {e}, trying ONNX...")

    if onnx_model.is_loaded():
        # Fall back to a local BERT ONNX model
        try:
            answer = bert_model.get_answer(question, context)
            return answer, "BERT ONNX (On-Device)"
        except Exception as e:
            logger.warning(f"ONNX failed: {e}")

    return "Error: No local AI available", "Failed"
```

2. Hybrid approach: on-device first, cloud as a last resort, based on the real-time scenario.

```python
def get_answer(question, context):
    """
    Priority order:
      1. Foundry Local (best: advanced + private)
      2. ONNX Runtime  (good: fast + private)
      3. Cloud API     (fallback: requires internet, less private)
    """
    if foundry_runtime.check_foundry_available():
        # Use on-device Foundry Local models
        try:
            answer = foundry_runtime.run_inference(question, context)
            return answer, "Foundry Local (On-Device)"
        except Exception as e:
            logger.warning(f"Foundry failed: {e}, trying ONNX...")

    if onnx_model.is_loaded():
        # Fall back to a local BERT ONNX model
        try:
            answer = bert_model.get_answer(question, context)
            return answer, "BERT ONNX (On-Device)"
        except Exception as e:
            logger.warning(f"ONNX failed: {e}, trying cloud...")

    # Last resort: cloud API (requires internet)
    if network_available():
        try:
            import requests
            response = requests.post(
                '{BASE_URL_AI_CHAT_COMPLETION}',
                headers={'Authorization': f'Bearer {API_KEY}'},
                json={
                    'model': '{MODEL_NAME}',
                    'messages': [{
                        'role': 'user',
                        'content': f'Context: {context}\n\nQuestion: {question}'
                    }]
                },
                timeout=10
            )
            answer = response.json()['choices'][0]['message']['content']
            return answer, "Cloud API (Online)"
        except Exception:
            return "Error: No AI runtime available", "Failed"
    else:
        return "Error: No internet and no local AI available", "Offline"
```

Demo Project Output

Foundry Local answering context-based questions offline:
- The Foundry Local engine ran the Phi-4-mini model offline and retrieved context-based data.
- The Foundry Local engine ran the Phi-4-mini model offline and indicated that there was no answer when the context did not contain one.

Practical Use Cases

- Privacy-first reading assistant: Summarize documents locally without sending text to the cloud.
- Healthcare apps: Analyze medical data on-device for compliance.
- Financial tools: Risk scoring without exposing sensitive financial data.
- IoT and edge devices: Real-time anomaly detection without network dependency.

Conclusion

On-device AI isn't just a trend; it's a shift toward smarter, faster, and more secure applications. With Windows AI Foundry and Foundry Local, developers can deliver experiences that respect user data, reduce latency, and work even when connectivity fails. By combining local inference with optional cloud enhancements, you get the best of both worlds: instant performance and scalable intelligence. Whether you're creating document summarizers, offline assistants, or compliance-ready solutions, this approach keeps your apps responsive, reliable, and user-centric.

References

- Get started with Foundry Local - Foundry Local | Microsoft Learn
- What is Windows AI Foundry? | Microsoft Learn
- https://devblogs.microsoft.com/foundry/unlock-instant-on-device-ai-with-foundry-local/
How to Move Azure DevOps Organization to New Organization

Dear Team, we are using our existing Azure DevOps organization (abc.net) and now want to move to a new organization (abc.com) without losing history, work items, etc. Are there any options that don't require third-party tools? Kindly advise. Thanks & Regards, Shabin
Moving the Logic Apps Designer Forward

Today, we're excited to announce a major redesign of the Azure Logic Apps designer experience, now entering Public Preview for Standard workflows. While these improvements are currently Standard-only, our vision is to quickly extend them across all Logic Apps surfaces and SKUs.

⚠️ Important: As this is a Public Preview release, we recommend using these features for development and testing workflows rather than production workloads. We're actively stabilizing the experience based on your feedback.

This Is Just the Beginning

This is not us declaring victory and moving on. This is Phase I of a multi-phase journey, and I'm committed to sharing our progress through regular blog posts as we continue iterating. More importantly, we want to hear from you. Your feedback drives these improvements, and it will continue to shape what comes next. This redesign comes from listening to you—our customers—watching how you actually work, and adapting the designer to better fit your workflows. We've seen the pain points, heard the frustrations, and we're addressing them systematically.

Our Roadmap: Three Phases

- Phase I: Perfecting the Development Loop (what we're releasing today). We're focused on making it cleaner and faster to edit your workflow, test it, and see the results. The development loop should feel effortless, not cumbersome.
- Phase II: Reimagining the Canvas. Next, we'll rethink how the canvas works—introducing new shortcuts and workflows that make modifications easier and more intuitive.
- Phase III: Unified Experiences Across All Surfaces. We'll ensure VS Code, Consumption, and Standard all deliver similarly powerful flows, regardless of where you're working.

Beyond these phases, we have several standalone improvements planned: a better search experience, streamlined connection creation and management, and removing unnecessary overhead when creating new workflows. We're also tackling fundamental questions that shouldn't be barriers: What do stateful and stateless mean? Why can't you switch between them? Why do you have to decide upfront if something is an agent? You shouldn't. We're working toward making these decisions dynamic—something you can change directly in the designer as you build, not rigid choices you're locked into at creation time. We want to make it easier to add agentic capabilities to any workflow, whenever you need them.

What's New in Phase I

Let me walk you through what we're shipping at Ignite.

Faster Onboarding: Get to Building Sooner
We're removing friction from the very beginning. When you create a new workflow, you'll get to the designer before having to choose stateful, stateless, or agentic. Eventually, we want to eliminate that upfront choice entirely—making it a decision you can defer until after your workflow is created. This one still needs a bit more work, but it's coming soon.

One View to Rule Them All
We've removed the side panel. Workflows now exist in a single, unified view with all the tooling you need. No more context switching. You can easily hop between run history, code view, or the visual editor, and change your settings inline—all without leaving your workflow.

Draft Mode: Auto-Save Without the Risk
Here's one of our biggest changes: draft mode with auto-save. We know the best practice is to edit locally in VS Code, store workflows in GitHub, and deploy properly to keep editing separate from production. But we also know that's not always possible or practical for everyone. It sucks to get your workflow into the perfect state, then lose everything if something goes wrong before you hit save. Now your workflow auto-saves every 10 seconds in draft mode. If you refresh the window, you're right back where you were—but your changes aren't live in production. There's now a separate Publish action that promotes your draft to production. This means you can work, test your workflow against the draft using the designer tools, verify everything works, and then publish to production—even when editing directly on the resource. Another benefit: draft saves won't restart your app. Your app keeps running. Restarts only happen when you publish.

Smarter, Faster Search
We've reorganized how browsing works—no more getting dropped into an endless list of connectors. You now get proper guidance as you explore and can search directly for what you need. Even better, we're moving search to the backend in the coming weeks, which will eliminate the need to download information about thousands of connectors upfront and deliver instant results. Our goal: no search should ever feel slow.

Document Your Workflows with Notes
You can now add sticky notes anywhere in your workflow. Drop a post-it note, add markdown (yes, even YouTube videos), and document your logic right on the canvas. We have plans to improve this with node anchoring and better stability features, but for now, you can visualize and explain your workflows more clearly than ever.

Unified Monitoring and Run History
Making the development loop smoother means keeping everything in one place. Your run history now lives on the same page as your designer. Switch between runs without waiting for full blade reloads. We've also added the ability to view both draft and published runs—a powerful feature that lets you test and validate your changes before they go live. We know there's a balance between developer and operator personas. Developers need quick iteration and testing capabilities, while operators need reliable monitoring and production visibility. This unified view serves both: developers can test draft runs and iterate quickly, while the clear separation between draft and published runs ensures operators maintain full visibility into what's actually running in production.

New Timeline View for Better Debugging
We experimented with a timeline concept in Agentic Apps to explain handoff—Logic Apps' first foray into cyclic graphs. But it was confusing and didn't work well with other Logic App types. We've refined it. On the left-hand side, you'll now see a hierarchical view of every action your Logic App ran, in execution order. This makes navigation and debugging dramatically easier when you're trying to understand exactly what happened during a run.

What's Next

This is Phase I. We're shipping these improvements, but we're not stopping here. As we move into Phase II and beyond, I'll continue sharing updates through blog posts like this one.

How to Share Your Feedback

We're actively listening and want to hear from you:
- Use the feedback button in the Azure Portal designer
- Join the discussion in the GitHub/Community Forum: https://github.com/Azure/LogicAppsUX
- Comment below with your thoughts and suggestions

Your input directly shapes our roadmap and priorities. Keep the feedback coming. It's what drives these changes, and it's what will shape the future of Azure Logic Apps. Let's build something great together.
Cloud Native Identity with Azure Files: Entra-only Secure Access for the Modern Enterprise

Azure Files introduces Microsoft Entra-only identity authentication for SMB shares, enabling cloud-only identity management without reliance on on-premises Active Directory. This advancement supports secure, seamless access to file shares from anywhere, streamlines cloud migration and modernization, and reduces operational complexity and costs.
Unlocking Your First AI Solution on Azure: Practical Paths for Developers of All Backgrounds

Over the past several months, I've spent hundreds of hours working directly with teams—from small startups to mid-market innovators—who share the same aspiration: "We want to use AI, but where do we start?" This question comes up everywhere. It crosses industries, geographies, skill levels, and team sizes. And as developers, we often feel the pressure to "solve AI" end-to-end—model selection, prompt engineering, security, deployment pipelines, integration… the list is long, and the learning curve can feel even longer. But here's what we've learned through our work in the SMB space and what we recently shared at Microsoft Ignite (Session OD1210): the first mile of AI doesn't have to be complex. You don't need an army of engineers, and you don't need to start from scratch. You just need the right path. In our Ignite on-demand session with UnifyCloud, we demonstrated two fast, developer-friendly ways to get your first AI workload running on Azure—both grounded in real-world patterns we see every day.

Path 1: Build Quickly with Microsoft Foundry Templates

Microsoft Foundry gives developers pre-built, customizable templates that dramatically reduce setup time. In the session, I walked through how to deploy a fully functioning AI chatbot using:
- Azure AI Foundry
- GitHub (via the Azure Samples "Get Started with AI Chat" repo)
- Azure Cloud Shell for deployment
- And zero specialized infra prep

With five lines of code and a few clicks, you can spin up a secure internal chatbot tailored for your business. Want responses scoped to your internal content? Want control over the model, costs, or safety filters? Want to plug in your own data sources like SharePoint, Blob Storage, or uploaded docs? You can do all of that—easily and on your terms. This "build fast" path is ideal for:
- Developers who want control and extensibility
- Teams validating AI use cases
- Scenarios where data governance matters
- Lightweight experimentation without heavy architecture upfront

And most importantly, you can scale it later.

Path 2: Buy a Production-Ready Solution from a Trusted Partner

Not every team wants to build. Not every team has the time, the resources, or the desire to compose their own AI stack. That's why we showcased the "buy" path with UnifyCloud's AI Factory, a Marketplace-listed solution that lets customers deploy mature AI capabilities directly into their Azure environment, complete with optional support, management, and best practices. In the demo, UnifyCloud's founder Vivek Bhatnagar walked through:
- How to navigate Microsoft Marketplace
- How to evaluate solution listings
- How to review pricing plans and support tiers
- How to deploy a partner-built AI app with just a few clicks
- How customers can accelerate their time to value without implementation overhead

This path is perfect when you want:
- A production-ready AI solution
- A supported, maintained experience
- Minimal engineering lift
- Faster time to outcome

Why Azure? Why Now?

During the session, we also outlined three reasons developers are choosing Azure for their first AI workloads:
1. Secure, governed, safe by design. Azure mitigates risk with always-on guardrails and built-in commitments to security, privacy, and policy-based control.
2. Built for production with a complete AI platform. From models to agents to tools and data integrations, Azure provides an enterprise-grade environment developers can trust.
3. Developer-first innovation with agentic DevOps. Azure puts developers at the center, integrating AI across the software development lifecycle to help teams build faster and smarter.

The Session: Build or Buy—Two Paths, One Goal

Whether you build using Azure AI Foundry or buy through Marketplace, the goal is the same: help teams get to their first AI solution quickly, confidently, and securely. You don't need a massive budget. You don't need deep ML experience. You don't need a full-time AI team. What you need is a path that matches your skills, your constraints, and your timeline.

Watch the Full Ignite Session

You can watch the full session on-demand now, also on YouTube: OD1201 — "Unlock Your First AI Solution on Azure". It includes:
- The full build and buy demos
- Partner perspectives
- Deployment walkthroughs
- Guidance you can take back to your teams today

If you want to explore the same build path we showed at Ignite:
➡️ Azure Samples – Get Started with AI Chat: https://github.com/Azure-Samples/get-started-with-ai-chat
Deploy it, customize it, attach your data sources, and extend it. It's a great starting point.

If you're curious about the Marketplace path:
➡️ Search for "UnifyCloud AI Factory" on Microsoft Marketplace. You'll see support offerings, solution details, and deployment instructions.

Closing Thought

The gap between wanting to adopt AI and actually running AI in production is shrinking fast. Azure makes it possible for teams, especially those without deep AI experience, to take meaningful steps today. No perfect architecture required. No million-dollar budget. No wait for a future-state roadmap. Just two practical paths: build quickly. Buy confidently. Start now. If you have questions, ideas, or want to share what you're building, feel free to reach out here in the Developer Community. I'd love to hear what you're creating.

— Joshua Huang, Microsoft Azure
Background tasks in .NET

What is a Background Task?

A background task (or background service) is work that runs behind the scenes in an application without blocking the main user flow, and often without direct user interaction. Think of it as a worker or helper that performs tasks independently while the main app continues doing other things.

Problem statement: What do you do when your downstream API is flaky, or sometimes down for hours or even days, yet your UI and main API must stay responsive?

Solution: This is a very common architecture problem in enterprise systems, and .NET gives us excellent tools to solve it cleanly: BackgroundService and exponential backoff retry logic.

In this article, I'll walk you through:
- A real production-like use case
- The architecture needed to make it reliable
- Why exponential backoff matters
- How to build a robust BackgroundService
- A full working code example

The Use Case

You have two APIs:
- API 1: called frequently by the UI (hundreds or thousands of times).
- API 2: a downstream system you must call, but it is known to be unstable, slow, or completely offline for long periods.

If API 1 directly calls API 2:
- Users experience lag
- API 1 becomes slow or unusable
- You overload API 2 with retries
- Calls fail when API 2 is offline
- You lose data

What do we do then? Here goes the solution.

The Architecture

Instead of calling API 2 synchronously, API 1 simply stores the intended call and returns immediately. A BackgroundService will later:
- Poll for pending jobs
- Call API 2
- Retry with exponential backoff if API 2 is still unavailable
- Mark jobs as completed when successful

This creates a resilient, smooth, non-blocking system.

Why Exponential Backoff?

When a downstream API is completely offline, retrying every 1–5 seconds is disastrous:
- It wastes CPU and bandwidth
- It floods logs
- It overloads API 2 when it comes back online
- It burns resources

Exponential backoff solves this. Example retry delays:
- Retry 1 → 2 sec
- Retry 2 → 4 sec
- Retry 3 → 8 sec
- Retry 4 → 16 sec
- Retry 5 → 32 sec
- Retry 6 → 64 sec
- (Max delay capped at 5 minutes)

This gives the system room to breathe.
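Formally, the schedule above is capped exponential growth. With the five-minute cap $T_{\max} = 300$ seconds used in `CalculateBackoff` below, the delay after the $n$-th failure is

$$d(n) = \min\!\left(2^{\,n},\; T_{\max}\right) \text{ seconds}, \qquad n = 1, 2, 3, \ldots$$

A common refinement, not shown in this article's code, is to add random jitter, for example scaling $d(n)$ by a factor drawn uniformly from $[0.5, 1.5]$, so that a backlog of failed jobs doesn't retry in lockstep.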
Complete Working Example (Using an In-Memory Store)

1. The Model

```csharp
public class PendingJob
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public string Payload { get; set; } = string.Empty;
    public int RetryCount { get; set; } = 0;
    public DateTime NextRetryTime { get; set; } = DateTime.UtcNow;
    public bool Completed { get; set; } = false;
}
```

2. The In-Memory Store

```csharp
public interface IPendingJobStore
{
    Task AddJobAsync(string payload);
    Task<List<PendingJob>> GetExecutableJobsAsync();
    Task MarkJobAsCompletedAsync(Guid jobId);
    Task UpdateJobAsync(PendingJob job);
}

public class InMemoryPendingJobStore : IPendingJobStore
{
    private readonly List<PendingJob> _jobs = new();
    private readonly object _lock = new();

    public Task AddJobAsync(string payload)
    {
        lock (_lock)
        {
            _jobs.Add(new PendingJob
            {
                Payload = payload,
                RetryCount = 0,
                NextRetryTime = DateTime.UtcNow
            });
        }
        return Task.CompletedTask;
    }

    // Only jobs whose backoff window has elapsed are eligible to run.
    public Task<List<PendingJob>> GetExecutableJobsAsync()
    {
        lock (_lock)
        {
            return Task.FromResult(_jobs
                .Where(j => !j.Completed && j.NextRetryTime <= DateTime.UtcNow)
                .ToList());
        }
    }

    public Task MarkJobAsCompletedAsync(Guid jobId)
    {
        lock (_lock)
        {
            var job = _jobs.FirstOrDefault(j => j.Id == jobId);
            if (job != null) job.Completed = true;
        }
        return Task.CompletedTask;
    }

    // The in-memory list holds references, so changes are already visible;
    // a database-backed store would persist the job here.
    public Task UpdateJobAsync(PendingJob job) => Task.CompletedTask;
}
```

3. The BackgroundService with Exponential Backoff

```csharp
using System.Text;

public class Api2RetryService : BackgroundService
{
    private readonly IHttpClientFactory _clientFactory;
    private readonly IPendingJobStore _store;
    private readonly ILogger<Api2RetryService> _logger;

    public Api2RetryService(IHttpClientFactory clientFactory, IPendingJobStore store,
        ILogger<Api2RetryService> logger)
    {
        _clientFactory = clientFactory;
        _store = store;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        _logger.LogInformation("Background retry service started.");
        while (!stoppingToken.IsCancellationRequested)
        {
            var jobs = await _store.GetExecutableJobsAsync();
            foreach (var job in jobs)
            {
                var client = _clientFactory.CreateClient("api2");
                try
                {
                    // Relative to the client's BaseAddress (see Program.cs).
                    var response = await client.PostAsync("api2/simulate",
                        new StringContent(job.Payload, Encoding.UTF8, "application/json"),
                        stoppingToken);

                    if (response.IsSuccessStatusCode)
                    {
                        _logger.LogInformation("Job {JobId} processed successfully.", job.Id);
                        await _store.MarkJobAsCompletedAsync(job.Id);
                    }
                    else
                    {
                        await HandleFailure(job);
                    }
                }
                catch (Exception ex)
                {
                    _logger.LogError(ex, "Error calling API 2.");
                    await HandleFailure(job);
                }
            }
            await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken);
        }
    }

    private async Task HandleFailure(PendingJob job)
    {
        job.RetryCount++;
        var delay = CalculateBackoff(job.RetryCount);
        job.NextRetryTime = DateTime.UtcNow.Add(delay);
        await _store.UpdateJobAsync(job);
        _logger.LogWarning("Retrying job {JobId} in {Delay}. RetryCount={RetryCount}",
            job.Id, delay, job.RetryCount);
    }

    private TimeSpan CalculateBackoff(int retryCount)
    {
        // 2^retryCount seconds, capped at 5 minutes.
        var seconds = Math.Pow(2, retryCount);
        var maxSeconds = TimeSpan.FromMinutes(5).TotalSeconds;
        return TimeSpan.FromSeconds(Math.Min(seconds, maxSeconds));
    }
}
```

4. The API 1 — Public Endpoint

```csharp
using System.Text.Json;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api1")]
public class Api1Controller : ControllerBase
{
    private readonly IPendingJobStore _store;
    private readonly ILogger<Api1Controller> _logger;

    public Api1Controller(IPendingJobStore store, ILogger<Api1Controller> logger)
    {
        _store = store;
        _logger = logger;
    }

    [HttpPost("process")]
    public async Task<IActionResult> Process([FromBody] object data)
    {
        // Store the intended API 2 call and return immediately;
        // the background service delivers it later.
        var payload = JsonSerializer.Serialize(data);
        await _store.AddJobAsync(payload);
        _logger.LogInformation("Stored job for background processing.");
        return Ok("Request received. Will process when API 2 recovers.");
    }
}
```

5. The API 2 (Simulating Downtime)

```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api2")]
public class Api2Controller : ControllerBase
{
    private static bool shouldFail = true;

    [HttpPost("simulate")]
    public IActionResult Simulate([FromBody] object payload)
    {
        if (shouldFail)
            return StatusCode(503, "API 2 is down");
        return Ok("API 2 processed payload");
    }

    // Flip failure mode on/off to simulate downtime in tests.
    [HttpPost("toggle")]
    public IActionResult Toggle()
    {
        shouldFail = !shouldFail;
        return Ok($"API 2 failure mode = {shouldFail}");
    }
}
```

6. The Program.cs

```csharp
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();
builder.Services.AddSingleton<IPendingJobStore, InMemoryPendingJobStore>();
builder.Services.AddHttpClient("api2", c =>
{
    // Base address must end with '/' so relative paths like "api2/simulate"
    // resolve correctly; a leading '/' in the request path would drop the
    // "api2" segment.
    c.BaseAddress = new Uri("http://localhost:5000/");
});
builder.Services.AddHostedService<Api2RetryService>();

var app = builder.Build();
app.MapControllers();
app.Run();
```

Testing the Whole Flow

#1 API 2 starts in failure mode. All attempts will fail, and exponential backoff kicks in.
#2 Send a request to API 1:

```
POST /api1/process
{ "name": "hello" }
```

The job is stored.

#3 Watch the logs. You'll see:

```
Retrying job in 2 seconds...
Retrying job in 4 seconds...
Retrying job in 8 seconds...
...
```

#4 Bring API 2 back online:

```
POST /api2/toggle
```

The next retry will succeed:

```
Job {id} processed successfully.
```

Conclusion

This pattern is extremely powerful for:
- Payment gateways
- ERP integrations
- Long-running partner APIs
- Unstable third-party services
- Internal microservices that spike or go offline

References

Background tasks with hosted services in ASP.NET Core
Guide for Architecting Azure-Databricks: Design to Deployment

Authors: Chris Walk (cwalk), Dan Johnson (danjohn1234), Eduardo dos Santos (eduardomdossantos), Ted Kim (tekim), Eric Kwashie (ekwashie), Chris Hayes (Chris_Haynes), Tayo Akigbogun (takigbogun), and Rafia Aqil (Rafia_Aqil). Peer reviewed by Mohamed Sharaf (mohamedsharaf).

Note: This article does not cover the Serverless Workspace option, which is currently in Private Preview. We plan to update this article once Serverless Workspaces are Generally Available. Also, while Terraform is the recommended method for production deployments due to its automation and repeatability, for simplicity this article demonstrates deployment through the Azure portal.

DESIGN: Architecting a Secure Azure Databricks Environment

Step 1: Plan Workspace and Subscription Organization, Analytics Architecture, and Compute

Planning your Azure Databricks environment can follow various arrangements depending on your organization's structure, governance model, and workload requirements. The following guidance outlines key considerations to help you design a well-architected foundation.

1.1 Align Workspaces with Business Units

A recommended best practice is to align each Azure Databricks workspace with a specific business unit. This approach—often referred to as the "Business Unit Subscription" design pattern—offers several operational and governance advantages.

- Streamlined access control: Each unit manages its own workspace, simplifying permissions and reducing cross-team access risks. For example, Sales can securely access only their data and notebooks.
- Cost transparency: Mapping workspaces to business units enables accurate cost attribution and supports internal chargeback models. Each workspace can be tagged to a cost center for visibility and accountability. Even within the same workspace, costs can be controlled using system tables that provide detailed usage metrics and resource consumption insights.
- Challenges to keep in mind: While per-BU workspaces have high impact, be mindful of workspace sprawl. If every small team spins up its own workspace, you might end up with dozens or hundreds of workspaces, which introduces management overhead. Databricks recommends a reasonable upper limit (on Azure, roughly 20–50 workspaces per account/subscription) because managing "collaboration, access, and security across hundreds of workspaces can become extremely difficult, even with good automation" [1]. Each workspace will need governance (user provisioning, monitoring, compliance checks), so there is a balance to strike.

1.2 Workspace Alignment and Shared Metastore Strategy

As you align workspaces with business units, it's essential to understand how Unity Catalog and the metastore fit into your architecture. Unity Catalog is Databricks' unified governance layer that centralizes access control, auditing, and data lineage across workspaces. Each Unity Catalog is backed by a metastore, which acts as the central metadata repository for tables, views, volumes, and other data assets. In Azure Databricks, you can have one metastore per region, and all workspaces within that region share it. This enables consistent governance and simplifies data sharing across teams. If your organization spans multiple regions, you'll need to plan for cross-region sharing, which Unity Catalog supports through Delta Sharing. By aligning workspaces with business units and connecting them to a shared metastore, you ensure that governance policies are enforced uniformly, while still allowing each team to manage its own data assets securely and independently. A short example of what this looks like in practice follows below.
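As a small, hypothetical illustration (the catalog, schema, group, and table names are invented for this example), every workspace attached to the regional metastore addresses the same assets through Unity Catalog's three-level namespace:

```python
# Hypothetical names; run from any workspace attached to the metastore.
spark.sql("CREATE CATALOG IF NOT EXISTS sales")
spark.sql("CREATE SCHEMA IF NOT EXISTS sales.emea")

# Governance is applied once, centrally, e.g. granting a group read access:
spark.sql("GRANT SELECT ON SCHEMA sales.emea TO `emea-analysts`")

# Three-level addressing: <catalog>.<schema>.<table>
df = spark.table("sales.emea.orders")
```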
1.3 Distribute Workspaces Across Subscriptions

When scaling Azure Databricks, consider not just the number of workspaces, but also how to distribute them across Azure subscriptions. Using multiple Azure subscriptions can serve both organizational needs and technical requirements:

- Environment segmentation (dev/test/prod): A common pattern is to put production workspaces in a separate Azure subscription from development or test workspaces. This provides an extra layer of isolation. Microsoft highly recommends separating workspaces into prod and dev, in separate subscriptions. This way, you can apply stricter Azure policies or network rules to the prod subscription and keep the dev subscription a bit more open for experimentation without risking prod resources.
- Honor Azure resource limits: Azure subscriptions come with certain capacity limits, and Azure Databricks workspaces have their own limits (since it's a multi-tenant PaaS). If you put all workspaces in one subscription, or all teams in one workspace, you might hit those limits. Most enterprises naturally end up with multiple subscriptions as they grow—planning this early avoids later migration headaches. If you currently have everything in one subscription, evaluate usage and consider splitting off heavy workloads or prod workloads into a new one to adhere to best practices.

1.4 Consider Completing an Azure Landing Zone Assessment

When evaluating and planning your next deployment, it's essential to ensure that your current landing zone aligns with Microsoft best practices. This helps establish a robust Databricks architecture and minimizes the risk of avoidable issues. Additionally, customers who are early in their cloud journey can benefit from cloud assessments—such as an Azure Landing Zone Review and a review of the "Prepare for Cloud Adoption" documentation—to build a strong foundation.

1.5 Planning Your Azure Databricks Workspace Architecture

Your workspace architecture should reflect the operational model of your organization and support the workloads you intend to run, from exploratory notebooks to production-grade ETL pipelines. To support your planning, Microsoft provides several reference architectures that illustrate well-architected patterns for Databricks deployments. These solution ideas can serve as starting points for designing maintainable environments: Simplified Architecture (Modern Data Platform Architecture), ETL-Intensive Workload Reference Architecture (Building ETL Intensive Architecture), and End-to-End Analytics Architecture (Create a Modern Analytics Architecture).

1.6 Planning for the "Right" Compute

Choosing the right compute setup in Azure Databricks is crucial for optimizing performance and controlling costs, as billing is based on Databricks Units (DBUs) using a per-second pricing model.

- Classic compute: You can fine-tune your own compute by enabling auto-termination and autoscaling, using Photon acceleration, leveraging spot instances, selecting the right VM type and node count for your workload, and choosing SSDs for performance or HDDs for archival storage. This option is preferred by mature internal teams and developers who need advanced control over clusters—such as custom VM selection, tuning, and specialized configurations. A sketch of these knobs in code follows this list.
- Serverless compute: Alternatively, managed services can simplify operations with built-in optimizations. Serverless removes infrastructure management and offers instant scaling without cluster warm-up, making it ideal for agility and simplicity.
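To make the classic-compute knobs above concrete, here is a sketch using the Databricks Python SDK (`databricks-sdk`). Field names follow the SDK's compute API at the time of writing, and the node type and cluster sizes are illustrative assumptions, not recommendations; verify against the SDK documentation before use.

```python
# pip install databricks-sdk
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute

w = WorkspaceClient()  # reads host/token from the environment or .databrickscfg

cluster = w.clusters.create_and_wait(
    cluster_name="etl-dev",
    spark_version=w.clusters.select_spark_version(latest=True, long_term_support=True),
    node_type_id="Standard_D4ds_v5",              # assumption: size per workload
    autoscale=compute.AutoScale(min_workers=2, max_workers=8),
    autotermination_minutes=30,                   # stop paying for idle clusters
    runtime_engine=compute.RuntimeEngine.PHOTON,  # Photon acceleration
    azure_attributes=compute.AzureAttributes(
        first_on_demand=1,                        # keep the driver on-demand
        availability=compute.AzureAvailability.SPOT_WITH_FALLBACK_AZURE,
    ),
)
print(cluster.cluster_id)
```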
Step 2: Plan the "Right" CIDR Range (Classic Compute)

Note: You can skip this step if you plan to use serverless compute for all your resources, as CIDR range planning is not required in serverless deployments.

When planning CIDR ranges for your Azure Databricks workspace, it's important to ensure your virtual network has enough IP address capacity to support cluster scaling. Why this matters: if you choose a small VNet address space and your analytics workloads grow, you might hit a ceiling where you simply cannot launch more clusters or scale out because there are no free IPs in the subnet. The subnet sizes—and by extension, the VNet CIDR—determine how many nodes you can run. Databricks recommends using a CIDR block between /16 and /24 for the VNet, and up to /26 for the two required subnets: the container subnet and the host subnet. Microsoft provides a sizing reference for this. If your current workspace's VNet lacks sufficient IP space for active cluster nodes, you can request a CIDR range update through your Azure Databricks account team, as noted in the Microsoft documentation.

2.1 Considerations for CIDR Range

- Workload type and concurrency: Consider what kinds of workloads will run (ETL pipelines, machine learning notebooks, BI dashboards, etc.) and how many jobs or clusters may need to run in parallel. High concurrency (e.g., multiple ETL jobs or many interactive clusters) means more nodes running at the same time, requiring a larger pool of IP addresses.
- Data volume (historical vs. incremental): Are you doing a one-time historical data load or only processing new incremental data? A large backfill of terabytes of data may require spinning up a very large cluster (hundreds of nodes) to process in a reasonable time. Ongoing smaller loads might get by with fewer nodes. Estimate how much data needs processing.
- Transformation complexity: The complexity of data transformations or machine learning workloads matters. Heavy transformations (joins, aggregations on big data) or complex model training can benefit from more workers. If your use cases include these, you may need larger clusters (more nodes) to meet performance SLAs, which in turn demands more IP addresses available in the subnet.
- Data sources and integration: Consider how your Databricks environment will connect to data. If you have multiple data sources or sinks (e.g., ingest from many event hubs, databases, or IoT streams), you might design multiple dedicated clusters or workflows, potentially all active at once. Also, if using separate job clusters per job (Databricks Jobs), multiple clusters might launch concurrently. All these scenarios increase concurrent node count. The sketch below shows how to sanity-check a proposed subnet size against these estimates.
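The arithmetic behind that sanity check is simple enough to script. The sketch below assumes Azure's five reserved addresses per subnet and one IP per cluster node in each of the host and container subnets, which is the documented model for VNet-injected workspaces; the /26 ranges are the example sizes from above.

```python
import ipaddress

AZURE_RESERVED_IPS = 5  # Azure reserves five addresses in every subnet

def max_cluster_nodes(host_subnet: str, container_subnet: str) -> int:
    """Each node consumes one IP in the host subnet and one in the
    container subnet, so the smaller subnet sets the ceiling."""
    usable = [
        ipaddress.ip_network(s).num_addresses - AZURE_RESERVED_IPS
        for s in (host_subnet, container_subnet)
    ]
    return min(usable)

# Two /26 subnets: 64 addresses each, minus 5 reserved -> 59 nodes maximum
print(max_cluster_nodes("10.28.0.0/26", "10.28.0.64/26"))
```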
2.2 Configuring a Dedicated Network (VNet) per Workspace with Egress Control

By default, Azure Databricks deploys its classic compute resources into a Microsoft-managed virtual network (VNet) within your Azure subscription. While this simplifies setup, it limits control over network configuration. For enhanced security and flexibility, it's recommended to use VNet injection, which allows you to deploy the compute plane into your own customer-managed VNet. This approach enables secure integration with other Azure services using service endpoints or private endpoints, supports user-defined routes for accessing on-premises data sources, allows traffic inspection via network virtual appliances or firewalls, and provides the ability to configure custom DNS and enforce egress restrictions through network security group (NSG) rules.

Within this VNet (which must reside in the same region and subscription as the Azure Databricks workspace), two subnets are required for Azure Databricks: a container subnet (referred to as the private subnet) and a host subnet (referred to as the public subnet). To implement front-end Private Link, back-end Private Link, or both, your workspace VNet needs a third subnet that will contain the private endpoint (the Private Link subnet). It is recommended to also deploy an Azure Firewall for egress control.

Step 3: Plan Network Architecture for Securing Azure Databricks

3.1 Secure Cluster Connectivity

Secure Cluster Connectivity, also known as No Public IP (NPIP), is a foundational security feature for Azure Databricks deployments. When enabled, it ensures that compute resources within the customer-managed virtual network (VNet) do not have public IP addresses, and no inbound ports are exposed. Instead, each cluster initiates a secure outbound connection to the Databricks control plane using port 443 (HTTPS), through a dedicated relay. This tunnel is used exclusively for administrative tasks, separate from the web application and REST API traffic, significantly reducing the attack surface. For the most secure deployment, Microsoft and Databricks strongly recommend enabling Secure Cluster Connectivity, especially in environments with strict compliance or regulatory requirements. When Secure Cluster Connectivity is enabled, both workspace subnets become private, as cluster nodes don't have public IP addresses.

3.2 Egress with VNet Injection (NVA)

For Databricks traffic, you'll need to assign a UDR to the Databricks-managed VNet with a next hop type of Network Virtual Appliance (NVA)—this could be an Azure Firewall, NAT Gateway, or another routing device. For control plane traffic, Databricks recommends using Azure service tags, which are logical groupings of IP addresses for Azure services and should be routed with the next hop type of Internet. This is important because Azure IP ranges can change frequently as new resources are provisioned, and manually maintaining IP lists is not practical. Using service tags ensures that your routing rules automatically stay up to date.

3.3 Front-End Connectivity with Azure Private Link (Standard Deployment)

To further enhance security, Azure Databricks supports Private Link for front-end connections. In a standard deployment, Private Link enables users to access the Databricks web application, REST API, and JDBC/ODBC endpoints over a private VNet interface, bypassing the public internet. For organizations with no public internet access from user networks, a browser authentication private endpoint is required. This endpoint supports SSO login callbacks from Microsoft Entra ID and is shared across all workspaces in a region using the same private DNS zone. It is typically hosted in a transit VNet that bridges on-premises networks and Azure. Note: There are two deployment types, standard and simplified. To compare these deployment types, see "Choose standard or simplified deployment."

3.4 Serverless Compute Networking

Azure Databricks offers serverless compute options that simplify infrastructure management and accelerate workload execution. These resources run in a Databricks-managed serverless compute plane, isolated from the public internet and connected to the control plane via the Microsoft backbone network.
To secure outbound traffic from serverless workloads, administrators can configure serverless egress control using network policies that restrict connections by location, FQDN, or Azure resource type. Additionally, Network Connectivity Configurations (NCCs) allow centralized management of private endpoints and firewall rules. NCCs can be attached to multiple workspaces and are essential for enabling secure access to Azure services like Data Lake Storage from serverless SQL warehouses.

DEPLOYMENT: Step-by-Step Implementation Using the Azure Portal

Step 1: Create an Azure Resource Group

For each new workspace, create a dedicated resource group (to contain the Databricks workspace resource and associated resources). Ensure that all resources (i.e., workspace, subnets, etc.) are deployed in the same region and resource group to optimize data movement performance and enhance security.

Step 2: Deploy a Workspace-Specific Virtual Network (VNet)

- From your resource group, create a virtual network.
- Under the Security section, enable Azure Firewall. Deploying an Azure Firewall is recommended for egress control, ensuring that outbound traffic from your Databricks environment is securely managed.
- Define address spaces for your virtual network (review Step 2 from Design). As documented, you could create a VNet with these values:
  - IP range: first remove the default IP range, and then add IP range 10.28.0.0/23.
  - Create subnet public-subnet with range 10.28.0.0/25.
  - Create subnet private-subnet with range 10.28.0.128/25.
  - Create subnet private-link with range 10.28.1.0/27.
- Please note: your IP values can differ depending on your IPAM and available scopes.
- Review + Create your virtual network.

Step 3: Deploy the Azure Databricks Workspace

Now that networking is in place, create the Databricks workspace. Below are the detailed steps your organization should review during workspace creation:

- In the Azure portal, search for Azure Databricks and click Create.
- Choose the Subscription, Resource Group, and Region, select the Premium tier, enter a Managed Resource Group name, and click Next. The managed resource group is created after your Databricks workspace is deployed and contains infrastructure resources for the workspace (i.e., VNets, DBFS).
- Required: Enable "Secure Cluster Connectivity" (No Public IP for clusters) to ensure that Databricks clusters are deployed without public IP addresses (review Section 3.1).
- Required: Enable the option to deploy into your own virtual network (VNet injection), also known as "Bring Your Own VNet" (review Section 3.2). Select the virtual network created in Step 2 and enter the private and public subnet names.
- Enable or disable "Deploying NAT Gateway" according to your workspace requirements.
- Disable "Allow Public Network Access".
- Select "No Azure Databricks Rules" for Required NSG Rules.
- Select "Click on add to create a private endpoint"; this opens a panel for private endpoint setup. Click "Add" to enter your Private Link details created in Step 2. Also, ensure that private DNS zone integration is set to "Yes" and that a new private DNS zone is created, indicated by (New)privatelink.azuredatabricks.net, unless an existing DNS zone for this purpose already exists.
- (Optional) Under the Encryption tab, enable infrastructure encryption if you have a FIPS 140-2 requirement. It comes at a cost and takes time to encrypt and decrypt; by default your data is already encrypted.
- (Optional) Compliance security profile, if you have a standard regulatory requirement (e.g., HIPAA).
- (Optional) Automatic cluster updates, first Sunday of every month.
- Review + Create the workspace and wait for it to deploy.

Step 4: Create a Private Endpoint to Support SSO for Web Browser Access

Note: This step is required when front-end Private Link is enabled and client networks cannot access the public internet. After creating your Azure Databricks workspace, if you try to launch it without the proper Private Link configuration, you will see an error. This happens because the workspace is configured to block public network access, and the necessary private endpoints (including the browser_authentication endpoint for SSO) are not yet in place.

Create the web-auth workspace:
- Deploy a "dummy" WEB_AUTH_DO_NOT_DELETE_<region> workspace in the same region as your production workspace. Its purpose is to host the browser_authentication private endpoint (one is required per region).
- Lock the workspace (Delete lock) to prevent accidental removal.
- Follow Step 2 to create a virtual network (VNet).
- Follow Step 3 to create a VNet-injected "dummy" workspace.

Create the browser authentication private endpoint:
- In the Azure portal, go to the (dummy) Databricks workspace > Networking > Private endpoint connections > + Private endpoint.
- Resource step: Target sub-resource: browser_authentication.
- Virtual Network step: VNet: the transit/hub VNet (the central network for Private Link). Subnet: the private endpoint subnet in that VNet (not the Databricks host subnets).
- DNS step: Integrate with private DNS zone: Yes. Zone: privatelink.azuredatabricks.net. Ensure the DNS zone is linked to the transit VNet.
- After creation: A-records for *.pl-auth.azuredatabricks.net are auto-created in the DNS zone.

Workspace connectivity testing: If you have VPN or ExpressRoute, Bastion is not required. If you don't have private connectivity and need to test from inside the VNet, Azure Bastion is a convenient option; for the purposes of this article, we test workspace connectivity through Bastion.

Step 5: Create a Storage Account

- From your resource group, click Create and select Storage account.
- On the configuration page: select the preferred storage type (Azure Blob Storage or Azure Data Lake Storage Gen2), choose performance and redundancy options based on your business requirements, and click Next.
- Under the Advanced tab: enable Hierarchical namespace under Data Lake Storage Gen2. This is critical for directory- and file-level operations and access control lists (ACLs).
- Under the Networking tab: set Public network access to Disabled.
- Complete the creation process and then create container(s) inside the storage account.

Step 6: Create Private Endpoints for the Workspace Storage Account

Prerequisite: You need to create two private endpoints from the VNet used for VNet injection to your workspace storage account, one for each of the target sub-resources dfs and blob.

- Navigate to your storage account.
- Go to Networking > Private endpoints and click + Create private endpoint.
- In the Create Private Endpoint wizard, on the Resource tab: select your storage account and set Target sub-resource to dfs for the first endpoint.
- On the Virtual Network tab: choose the VNet you used for VNet injection and select the appropriate subnet.
- Complete the creation process. The private endpoint will be auto-approved and visible under Private endpoints.
- Repeat the process for the second private endpoint, this time setting Target sub-resource to blob.
Step 7: Link Storage and Databricks Workspace: Create an Access Connector

- In your resource group, create an Access Connector for Azure Databricks. No additional configuration is required during creation.
- Assign a role to the access connector: navigate to your storage account > Access control (IAM) > Add role assignment. Select Role: Storage Blob Data Contributor and Assign access to: Managed identity. Under Members, click Select members, find and select your newly created Access Connector for Azure Databricks, and save the role assignment.
- Copy the Resource ID: go to the access connector's Overview page and copy the Resource ID for later use in the Databricks configuration.

Step 8: Link Storage and Databricks Workspace: Unity Catalog External Location

- Navigate to Unity Catalog: in your Databricks workspace, go to Unity Catalog > External Data and select the "Create external location" button.
- Configure the external location: select ADLS as the storage type and enter the ADLS storage URL in the following format, updating the <container_name> and <storage_account_name> parameters: abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/
- Provide the access connector: select "Create new storage credential" from the Storage credential field and paste the Resource ID of the Access Connector for Azure Databricks (copied in Step 7) into the Access Connector ID field.
- Validate the connection: click Submit. You should see a "Successful" message confirming the connection has succeeded. You can now create catalogs and link your secure storage.

Step 9: Configure Serverless Compute Networking

If your organization plans to use serverless SQL warehouses or serverless jobs compute, you must configure serverless networking.

- Add a Network Connectivity Configuration (NCC): go to the Databricks account console (https://accounts.azuredatabricks.net/), navigate to Cloud resources, click Add Network Connectivity Configuration, fill in the required fields, and create a new NCC.
- Associate the NCC with your workspace: in the account console, go to Workspaces, select your workspace, and click Update Workspace. From the Network Connectivity Configuration dropdown, select the NCC you just created.
- Add a private endpoint rule: in Cloud resources, select your NCC, select Private Endpoint Rules, and click Add Private Endpoint Rule. Provide the Resource ID of your storage account (found on the storage account's "JSON View", top right) and the Azure sub-resource types dfs and blob.
- Approve the pending connection: go to your storage account > Networking > Private endpoints. You will see a pending connection from Databricks. Approve it, and the connection status in your account console will show as ESTABLISHED.

Step 10: Test Your Workspace

Launch a small test cluster and verify the following:
- It can start (which means it can talk to the control plane).
- It can read/write from the storage; a notebook snippet like the sketch below can confirm read/write access. (If you are not using a Unity Catalog external location, set Spark properties to configure Azure credentials for storage access instead.)
- Check that the private DNS record has been created.
- (Optional) If on-premises data is needed, try connecting to an on-premises database (using the ExpressRoute path): Connect your Azure Databricks workspace to your on-premises network - Azure Databricks | Microsoft Learn.
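For the read/write check, a short notebook snippet is usually enough. This sketch assumes the Unity Catalog external location from Step 8 supplies credentials via the access connector; the container and account placeholders match the format used earlier and must be replaced with your values.

```python
# Run in a notebook on the test cluster. Placeholders as in Step 8.
path = "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/smoke-test"

df = spark.range(10)                        # tiny test DataFrame
df.write.mode("overwrite").parquet(path)    # verifies write access
print(spark.read.parquet(path).count())     # verifies read access; expect 10
```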
Step 11: Account Console, Workspace Access Controls, and Getting Started

Once your Azure Databricks workspace is deployed, it's essential to configure access controls and begin onboarding users with the right permissions. From your account console (https://accounts.azuredatabricks.net/), you can centrally manage your environment: add users and groups, enable preview features, and view or configure all your workspaces. Azure Databricks supports fine-grained access management through Unity Catalog, cluster policies, and workspace-level roles. Start by defining who needs access to what—whether it's notebooks, tables, jobs, or clusters—and apply least-privilege principles to minimize risk.

DBFS limitation: DBFS is automatically created upon Databricks workspace creation and can be found in your managed resource group. Databricks cannot secure DBFS. If there is a business need to avoid DBFS, you can disable DBFS access by following the instructions here: Disable access to DBFS root and mounts in your existing Azure Databricks workspace.

Use Unity Catalog to manage data access across catalogs, schemas, and tables, and consider implementing cluster policies to standardize compute configurations across teams. To help your teams get started, Microsoft provides a range of tutorials and best-practice guides: Best practice articles - Azure Databricks | Microsoft Learn.

Step 12: Plan Data Migration

As you prepare to move data into your Azure Databricks environment, it's important to assess your migration strategy early. This includes identifying source systems, estimating data volumes, and determining the appropriate ingestion methods—whether batch, streaming, or hybrid. For organizations with complex migration needs or legacy systems, Microsoft offers specialized support through its internal Azure Cloud Accelerated Factory program. Reach out to your Microsoft account team to explore nomination for the Azure Cloud Accelerated Factory, which provides hands-on guidance, tooling, and best practices to accelerate and streamline your data migration journey.

Summary

Regular maintenance and governance are as important as the initial design. Continuously review the environment and update configurations as needed to address evolving requirements and threats. For example, tag all resources (workspaces, VNets, clusters, etc.) with clear identifiers (workspace name, environment, department) to track costs and ownership effectively. Additionally, enforce least privilege across the platform: ensure that only necessary users are given admin privileges, and use cluster-level access control to restrict who can create or start clusters. By following the above steps, an organization will have an Azure Databricks architecture that is securely isolated, well-governed, and scalable.

References:
[1] 5 Best Practices for Databricks Workspaces (AzureDatabricksBestPractices/toc.md at master · Azure/AzureDatabricksBestPractices - GitHub)
Deploy a workspace using the Azure Portal

Additional Links:
- Quick Introduction to Databricks: what is databricks | introduction - databricks for dummies
- Connect Purview with Azure Databricks: Integrating Microsoft Purview with Azure Databricks
- Secure Databricks Delta Share between Workspaces: Secure Databricks Delta Share for Serverless Compute
- Azure-Databricks Cost Optimization Guide: Databricks Cost Optimization: A Practical Guide
- Integrate Azure Databricks with Microsoft Fabric: Integrating Azure Databricks with Microsoft Fabric
- Databricks Solution Accelerators for Data & AI
- Azure updates

Appendix

3.5 Understanding Data Transfer (ExpressRoute vs. Public Internet)

For data transfers, your organization must decide between ExpressRoute and internet egress.
There are several considerations that can help you determine your choice:

3.5.1 Connectivity Model
- ExpressRoute: Provides a private, dedicated connection between your on-premises infrastructure and Microsoft Azure. It bypasses the public internet entirely and connects through a network service provider.
- Internet egress: Refers to outbound data traffic from Azure to the public internet. This is the default path for most Azure services unless configured otherwise.

3.6 Planning for User-Defined Routes (UDRs)

When working with Databricks deployments—especially in VNet-injected workspaces—setting up user-defined routes (UDRs) is a smart move. It's a best practice that helps manage and secure network traffic more effectively. By using UDRs, teams can steer traffic between Databricks components and external services in a controlled way, which not only boosts security but also supports compliance efforts.

3.6.1 UDRs and Hub-and-Spoke Topology
If your Databricks workspace is deployed into your own virtual network (VNet), you'll need to configure standard user-defined routes (UDRs) to manage traffic flow. In a typical hub-and-spoke architecture, UDRs are used to route all traffic from the spoke VNets to the hub VNet.

3.6.2 Hub and Spoke with a vWAN Hub
If your Databricks workspace is deployed into your own virtual network (VNet) and is peered to a Virtual WAN (vWAN) hub as the primary connectivity hub into Azure, a user-defined route (UDR) is not required—provided that a private traffic routing policy or internet traffic routing policy is configured in the vWAN hub.

3.6.3 Use of NVAs and Service Tags
For Databricks traffic, you'll need to assign a UDR to the Databricks-managed VNet with a next hop type of Network Virtual Appliance (NVA)—this could be an Azure Firewall, NAT Gateway, or another routing device. For control plane traffic, Databricks recommends using Azure service tags, which are logical groupings of IP addresses for Azure services and should be routed with the next hop type of Internet. This is important because Azure IP ranges can change frequently as new resources are provisioned, and manually maintaining IP lists is not practical. Using service tags ensures that your routing rules automatically stay up to date.

3.6.4 Default Outbound Access Retirement (Non-Serverless Compute)
Microsoft is retiring default outbound internet access for new deployments starting September 30, 2025. Going forward, outbound connectivity will require explicit configuration using an NVA, NAT Gateway, Load Balancer, or public IP address. Also, note that using a public IP address in the deployment is discouraged for security purposes; it is recommended to deploy the workspace in a Secure Cluster Connectivity configuration.
Typical Storage access issues troubleshooting
We handle a large number of cases where the Storage Account connection fails, and customers are often unaware of the troubleshooting steps they can take to accelerate the resolution of the issue. As such, we've compiled some common scenarios and the usual troubleshooting steps we ask you to take.
Always remember: if you have made changes to your infrastructure, consider rolling them back to rule them out as the root cause. Even a small change that apparently has no effect may cause downtime in your application.
Common messages
The errors shown in the portal when the Storage Account connectivity is down are very similar, and they may not correctly indicate the cause.
Error messages that surface in the Portal for Logic Apps Standard:
System.Private.CoreLib: Access to the path 'C:\home\site\wwwroot\host.json' is denied
Cannot reach host runtime. Error details, Code: 'BadRequest', Message: 'Encountered an error (InternalServerError) from host runtime.'
System.Private.CoreLib: The format of the specified network name is invalid. : 'C:\\home\\site\\wwwroot\\host.json'.'
System.Private.CoreLib: The user name or password is incorrect. : 'C:\home\site\wwwroot\host.json'.
Microsoft.Windows.Azure.ResourceStack: The SSL connection could not be established, see inner exception.
System.Net.Http: The SSL connection could not be established, see inner exception.
System.Net.Security: Authentication failed because the remote party has closed the transport stream
Unexpected error occurred while loading workflow content and artifacts
These errors don't really indicate the root cause, but a broken connection with the Storage Account is very often behind them.
What to verify?
There are 4 major components to verify in these cases:
Logic App environment variables and network settings
Storage Account networking settings
Network settings
DNS settings
Logic App environment variables and Network
From an App Settings point of view, there is not much to verify, but these are important steps that are sometimes overlooked. At this time, all or nearly all Logic Apps have been migrated to the dotnet Functions_Worker_Runtime (under the Environment Variables tab), but it's good to confirm this. It's also good to confirm that your Platform setting is set to 64 Bit (under Configuration tab / General Settings). We've seen that some deployments use old templates that set this to 32 Bit, which doesn't make full use of the available resources.
Check that the Logic App has the following environment variables with a value:
WEBSITE_CONTENTOVERVNET - set to 1
OR WEBSITE_VNET_ROUTE_ALL - set to 1
OR vnetRouteAllEnabled - set to 1
Configure virtual network integration with application and configuration routing. - Azure App Service | Microsoft Learn
These settings can also be replaced by the UI setting in the Virtual Network tab, when you select "Content Storage" in the Configuration routing. For better understanding: vnetContentShareEnabled takes precedence. In other words, if it is set (true/false), WEBSITE_CONTENTOVERVNET is ignored. Only if vnetContentShareEnabled is null is WEBSITE_CONTENTOVERVNET taken into account.
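As a quick illustration of that precedence rule, here is a tiny sketch (the helper function is hypothetical; only the setting names come from the documentation above):

def content_over_vnet(site_properties: dict, app_settings: dict) -> bool:
    # vnetContentShareEnabled (a site property) wins whenever it is explicitly
    # set; only when it is null/absent does WEBSITE_CONTENTOVERVNET apply.
    flag = site_properties.get("vnetContentShareEnabled")
    if flag is not None:
        return bool(flag)
    return app_settings.get("WEBSITE_CONTENTOVERVNET") == "1"

# Site property unset, app setting enabled -> content share traffic goes over the VNet
print(content_over_vnet({}, {"WEBSITE_CONTENTOVERVNET": "1"}))  # True
# Site property explicitly False overrides the app setting
print(content_over_vnet({"vnetContentShareEnabled": False}, {"WEBSITE_CONTENTOVERVNET": "1"}))  # False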
Also keep this in mind: Storage considerations for Azure Functions | Microsoft Learn
WEBSITE_CONTENTAZUREFILECONNECTIONSTRING and AzureWebJobsStorage have the same connection string as in the Storage Account
Website_contentazurefileconnectionstring | App settings reference for Azure Functions | Microsoft Learn
Azurewebjobsstorage | App settings reference for Azure Functions | Microsoft Learn
WEBSITE_CONTENTSHARE has the File Share name
Website_contentshare | App settings reference for Azure Functions | Microsoft Learn
These are the first points to validate.
Storage Account settings
If all these are matching and properly configured and the Logic App is still in error, we move to the next step: validating the Storage Account network settings.
When the Storage Account does not have VNet integration enabled, there should be no issues, because the connection is made through the public endpoints. Still, even then, you must ensure that at least "Allow storage account key access" is enabled. This is because, at this time, the Logic App depends on the access key to connect to the Storage Account. Although you can set AzureWebJobsStorage to run with Managed Identity, you can't fully disable storage account key access for Standard logic apps that use the Workflow Service Plan hosting option. However, with the ASE v3 hosting option, you can disable storage account key access after you finish the steps to set up managed identity authentication.
Create example Standard workflow in Azure portal - Azure Logic Apps | Microsoft Learn
If this setting is enabled, check whether the Storage Account is behind a firewall. Access may be enabled for selected networks or fully disabled; both options require Service Endpoints or Private Endpoints to be configured.
Deploying Standard Logic App to Storage Account behind Firewall using Service or Private Endpoints | Microsoft Community Hub
So check the Networking tab under the Storage Account and confirm the following:
In case you select the "selected networks" option, confirm that the VNet is the same one the Logic App is integrated with. Your Logic App and Storage Account may be hosted in different VNets, but you must ensure that there is full connectivity between them: they must be peered, with HTTPS and SMB traffic allowed (explained further in the Network section). You can select "Disabled" network access as well.
You should also confirm that the File Share is created. Usually it is created automatically along with the Logic App, but if you use Terraform or ARM templates, the File Share may not be created and you must create it manually.
Confirm that all 4 Private Endpoints are created and approved (File, Table, Queue and Blob). All these resources are used by different components of the Logic App. This is not fully documented, as it is internal engine behavior and not publicly available; for Azure Functions, the runtime base, it is partially documented in this article: Storage considerations for Azure Functions | Microsoft Learn
If a Private Endpoint is missing, create it and link it to the VNet as a shared resource. Not having all Private Endpoints created may result in runtime errors, connection errors, or trigger failures. For example, if a workflow saves correctly but does not generate its URL, the Table and Queue Private Endpoints may be missing, as we've seen many times with customers.
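To confirm that the four endpoints actually resolve through your private DNS zones, a quick sketch like the following can be run from Kudu or from a VM in the same VNet (the storage account name is a hypothetical placeholder):

import socket
import ipaddress

ACCOUNT = "mystorageaccount"  # placeholder: your storage account name
SERVICES = ["blob", "file", "queue", "table"]

for svc in SERVICES:
    host = f"{ACCOUNT}.{svc}.core.windows.net"
    try:
        ip = socket.gethostbyname(host)
        kind = "private endpoint" if ipaddress.ip_address(ip).is_private else "public IP"
        print(f"{host} -> {ip} ({kind})")
    except socket.gaierror as err:
        print(f"{host} -> name resolution failed: {err}")

If any of the four names fails to resolve, or resolves to a public IP while the firewall denies public access, that endpoint is the likely culprit.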
You can read a bit more about the integration of the Logic App with a firewall-secured Storage Account and the needed configuration in these articles:
Secure traffic between Standard workflows and virtual networks - Azure Logic Apps | Microsoft Learn
Deploy Standard logic apps to private storage accounts - Azure Logic Apps | Microsoft Learn
You can use the Kudu console (Advanced Tools tab) to further troubleshoot the connection with the Storage Account by using some network troubleshooting commands. If the Kudu console is not available, we recommend using a VM in the same VNet as the Logic App to mimic the scenario.
Nslookup [hostname or IP] [DNS HOST IP]
TCPPing [hostname or IP]:[PORT]
Test-NetConnection [hostname] -Port [PORT]
If you have a custom DNS, the nslookup command will not return results from your DNS unless you specify its IP address as a parameter. Instead, you can use the nameresolver command, which uses the VNet DNS settings to resolve the endpoint name.
nameresolver [endpoint hostname or IP address]
Networking Related Commands for Azure App Services | Microsoft Community Hub
Vnet configuration
Configuring a Private Endpoint for the Logic App does not affect traffic to the Storage Account. The Private Endpoint only covers inbound traffic, while the storage communication is considered outbound traffic, as it is the Logic App that actively communicates with the Storage Account.
Secure traffic between Standard workflows and virtual networks - Azure Logic Apps | Microsoft Learn
The link between these resources must therefore not be interrupted. Keep in mind that the Logic App uses both the HTTPS and SMB protocols to communicate with the Storage Account, meaning that traffic on ports 443 and 445 needs to be fully allowed in your VNet. If you have a Network Security Group associated with the Logic App subnet, you need to confirm that its rules allow this traffic; you may need to create rules explicitly:
Rule 1: source port *, destination port 443, from the subnet integrated with the Standard logic app to the Storage Account, protocol TCP. Purpose: storage account access (HTTPS).
Rule 2: source port *, destination port 445, from the subnet integrated with the Standard logic app to the Storage Account, protocol TCP. Purpose: Server Message Block (SMB) File Share.
In case you have forced routing to your Network Virtual Appliance (i.e. a firewall), you must also ensure that this resource is not filtering or blocking the traffic. TLS inspection, if enabled on your firewall, must also be disabled for the Logic App traffic. In short, this is because the firewall replaces the certificate in the message, so the Logic App does not recognize the returned certificate, invalidating the message. You can read more about TLS inspection in this URL:
Azure Firewall Premium features | Microsoft Learn
DNS
If you are using Azure DNS, this section should not apply, because all records are automatically created once you create the resources. But if you're using a custom DNS, when you create an Azure resource (e.g. a Storage Private Endpoint), the IP address won't be registered in your DNS, so you must do it manually. You must ensure that all A records are created and maintained, keeping in mind that they need to point to the correct IP and name. If there are mismatches, you may see communication severed between the Logic App and other resources, such as the Storage Account. So double-check all DNS records and confirm that everything is in its proper state and place.
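Putting the network checks together, here is a rough Python equivalent of TCPPing/Test-NetConnection for the two ports in the rules above, runnable from Kudu or from a VM in the same VNet (the account name is again a placeholder):

import socket

ACCOUNT = "mystorageaccount"  # placeholder: your storage account name
TARGETS = [
    (f"{ACCOUNT}.blob.core.windows.net", 443),  # HTTPS to the storage endpoints
    (f"{ACCOUNT}.file.core.windows.net", 445),  # SMB to the File Share
]

for host, port in TARGETS:
    try:
        with socket.create_connection((host, port), timeout=5):
            print(f"{host}:{port} reachable")
    except OSError as err:
        print(f"{host}:{port} blocked or unreachable: {err}")

A timeout on 445 while 443 works typically points at an NSG rule or an NVA/firewall filter dropping SMB.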
And to make it even easier, with the help of my colleague Mohammed_Barqawi, this information is now condensed into an easy-to-understand flowchart.
If you continue to have issues after all these steps are verified, I suggest you open a case with us so that we can investigate what else may be happening: either a step may have been missed, or some other issue may be occurring.
Understand New Sentinel Pricing Model with Sentinel Data Lake Tier
Introduction to Sentinel and its New Pricing Model
Microsoft Sentinel is a cloud-native Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platform that collects, analyzes, and correlates security data from across your environment to detect threats and automate response. Traditionally, Sentinel stored all ingested data in the Analytics tier (Log Analytics workspace), which is powerful but expensive for high-volume logs. To reduce cost and enable customers to retain all security data without compromise, Microsoft introduced a new dual-tier pricing model consisting of the Analytics tier and the Data Lake tier. The Analytics tier continues to support fast, real-time querying and analytics for core security scenarios, while the new Data Lake tier provides very low-cost storage for long-term retention and high-volume datasets. Customers can now choose where each data type lands—analytics for high-value detections and investigations, and data lake for large or archival types—allowing organizations to significantly lower cost while still retaining all their security data for analytics, compliance, and hunting.
The flow diagram below depicts the new Sentinel pricing model:
Now let's understand this new pricing model through the scenarios below:
Scenario 1A (PAY GO)
Scenario 1B (Usage Commitment)
Scenario 2 (Data Lake Tier Only)
Scenario 1A (PAY GO)
Requirement
Suppose you need to ingest 10 GB of data per day, and you must retain that data for 2 years. However, you will only frequently use, query, and analyze the data for the first 6 months.
Solution
To optimize cost, you can ingest the data into the Analytics tier and retain it there for the first 6 months, where active querying and investigation happen. After that period, the remaining 18 months of retention can be shifted to the Data Lake tier, which provides low-cost storage for compliance and auditing needs. However, you will be charged separately for Data Lake tier querying and analytics, which is depicted as Compute (D) in the pricing flow diagram.
Pricing Flow / Notes
The first 10 GB/day ingested into the Analytics tier is free for 31 days under the Analytics logs plan.
All data ingested into the Analytics tier is automatically mirrored to the Data Lake tier at no additional ingestion or retention cost.
For the first 6 months, you pay only for Analytics tier ingestion and retention, excluding any free capacity.
For the next 18 months, you pay only for Data Lake tier retention, which is significantly cheaper.
Azure Pricing Calculator Equivalent
Assuming no data is queried or analyzed during the 18-month Data Lake tier retention period: although the Analytics tier retention is set to 6 months, the first 3 months of retention fall under the free retention limit, so retention charges apply only for the remaining 3 months of the analytics retention window. The Azure pricing calculator adjusts accordingly.
Scenario 1B (Usage Commitment)
Now, suppose you are ingesting 100 GB per day. If you follow the same pay-as-you-go pricing model described above, your estimated cost would be approximately $15,204 per month. However, you can reduce this cost by choosing a Commitment Tier, where Analytics tier ingestion is billed at a discounted rate. Note that the discount applies only to Analytics tier ingestion; it does not apply to Analytics tier retention costs or to any Data Lake tier–related charges. Please refer to the pricing flow and the equivalent pricing calculator results shown below.
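To make the commitment-tier comparison concrete, here is a quick back-of-the-envelope sketch. The per-GB rates are not official list prices; they are assumptions implied by the calculator totals quoted in this article, so always confirm against the Azure pricing calculator.

DAYS_PER_MONTH = 30.4  # monthly average used by the Azure pricing calculator
GB_PER_DAY = 100

# Monthly totals taken from this article's calculator examples:
PAYG_TOTAL = 15204.0    # pay-as-you-go, 100 GB/day
COMMIT_TOTAL = 11184.0  # 100 GB/day commitment tier

payg_per_gb = PAYG_TOTAL / (GB_PER_DAY * DAYS_PER_MONTH)      # ~ $5.00/GB
commit_per_gb = COMMIT_TOTAL / (GB_PER_DAY * DAYS_PER_MONTH)  # ~ $3.68/GB

print(f"Implied PAYG rate:       ${payg_per_gb:.2f}/GB")
print(f"Implied commitment rate: ${commit_per_gb:.2f}/GB")
print(f"Monthly savings:         ${PAYG_TOTAL - COMMIT_TOTAL:,.0f}")

# At 150 GB/day on the 100 GB/day tier, the entire volume is billed at the
# discounted commitment rate, not just the first 100 GB:
print(f"150 GB/day estimate:     ${150 * DAYS_PER_MONTH * commit_per_gb:,.0f}/month")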
Monthly cost savings: $15,204 – $11,184 = $4,020 per month
Now the question is: what happens if your usage reaches 150 GB per day? Will the additional 50 GB be billed at the pay-as-you-go rate? No. The entire 150 GB/day will still be billed at the discounted rate associated with the 100 GB/day commitment tier bucket.
Azure Pricing Calculator Equivalent (100 GB/Day)
Azure Pricing Calculator Equivalent (150 GB/Day)
Scenario 2 (Data Lake Tier Only)
Requirement
Suppose you need to store certain audit or compliance logs amounting to 10 GB per day. These logs are not used for querying, analytics, or investigations on a regular basis, but must be retained for 2 years as per your organization's compliance or forensic policies.
Solution
Since these logs are not actively analyzed, you should avoid ingesting them into the Analytics tier, which is more expensive and optimized for active querying. Instead, send them directly to the Data Lake tier, where they can be retained cost-effectively for future audit, compliance, or forensic needs.
Pricing Flow
Because the data is ingested directly into the Data Lake tier, you pay both ingestion and retention costs there for the entire 2-year period. If, at any point in the future, you need to perform advanced analytics, querying, or search, you will incur additional compute charges based on actual usage. Even with occasional compute charges, the cost remains significantly lower than storing the same data in the Analytics tier.
Realized Savings
Scenario 1 (10 GB/day in Analytics tier): $1,520.40 per month
Scenario 2 (10 GB/day directly into Data Lake tier): $202.20 per month without compute; $257.20 per month with a sample compute price
Savings with no compute activity: $1,520.40 – $202.20 = $1,318.20 per month
Savings with some compute activity (sample value): $1,520.40 – $257.20 = $1,263.20 per month
Azure calculator equivalent without compute
Azure calculator equivalent with sample compute
Conclusion
The combination of the Analytics tier and the Data Lake tier in Microsoft Sentinel enables organizations to optimize cost based on how their security data is used. High-value logs that require frequent querying, real-time analytics, and investigation can be stored in the Analytics tier, which provides powerful search performance and built-in detection capabilities. At the same time, large-volume or infrequently accessed logs—such as audit, compliance, or long-term retention data—can be directed to the Data Lake tier, which offers dramatically lower storage and ingestion costs. Because all Analytics tier data is automatically mirrored to the Data Lake tier at no extra cost, customers can use the Analytics tier only for the period they actively query data, and rely on the Data Lake tier for the remaining retention. This tiered model allows different scenarios—active investigation, archival storage, compliance retention, or large-scale telemetry ingestion—to be handled at the most cost-effective layer, ultimately delivering substantial savings without sacrificing visibility, retention, or future analytical capabilities.
Azure VPN issues - RasMan service - Unknown state
Azure VPN Client stopped working after KB5065813 update – RasMan service Unknown state
Environment details:
Azure VPN Client version: 4.0.4.0 (Microsoft Store)
Windows version: Windows 11 Pro, 23H2 (fully updated)
Issue Summary:
The Azure VPN Client worked fine from Friday (14 November 2025) until Sunday (16 November 2025). On Sunday, Windows installed updates, and starting Monday morning (17 November 2025), Azure VPN stopped working. When I run the Prerequisites Test in the Azure VPN Client, everything passes except:
Rasman Windows Service Status Result: Unknown state. Please make sure Rasman windows service is in a running state.
However, the RasMan service is running, the registry keys for RasMan exist, and Svchost\netsvcs includes RasMan.
What changed:
I noticed that the update KB5065813 seems to be causing the issue. This KB is installed on my machine, but my colleagues have KB5041655 instead. As part of troubleshooting, I even switched to a brand-new laptop with a fresh OS install, and the issue persists after KB5065813 is applied.
What I've tried:
Restarted RasAuto and dependencies
Reinstalled the Azure VPN Client
Installed older versions of the Azure VPN Client
Switched to a brand-new laptop with a fresh OS install
Questions:
Is KB5065813 confirmed to break Azure VPN Client functionality?
Is this related to the October 2025 RasMan vulnerability patch (CVE-2025-59230)?
Are there any official Microsoft steps or scripts to fix this?
Screenshots attached for reference.