best practices
1641 TopicsSecure Unique Default Hostnames Now GA for Functions and Logic Apps
We are pleased to announce that Secure Unique Default Hostnames are now generally available (GA) for Azure Functions and Logic Apps (Standard). This expands the security model previously available for Web Apps to the entire App Service ecosystem and provides customers with stronger, more secure, and standardized hostname behavior across all workloads. Why This Feature Matters Historically, App Service resources have used default hostname format such as: <SiteName>.azurewebsites.net While straightforward, this pattern introduced potential security risks, particularly in scenarios where DNS records were left behind after deleting a resource. In those situations, a different user could create a new resource with the same name and unintentionally receive traffic or bindings associated with the old DNS configuration, creating opportunities for issues such as subdomain takeover. Secure Unique Default Hostnames address this by assigning a unique, randomized, region‑scoped hostname to each resource, for example: <SiteName>-<Hash>.<Region>.azurewebsites.net This change ensures that: No other customer can recreate the same default hostname. Apps inherently avoid risks associated with dangling DNS entries. Customers gain a more secure, reliable baseline behavior across App Service. Adopting this model now helps organizations prepare for the long‑term direction of the platform while improving security posture today. What’s New: GA Support for Functions and Logic Apps With this release, both Azure Functions and Logic Apps (Standard) fully support the Secure Unique Default Hostname capability. This brings these services in line with Web Apps and ensures customers across all App Service workloads benefit from the same secure and consistent default hostname model. Azure CLI Support The Azure CLI for Web Apps and Function Apps now includes support for the “--domain-name-scope” parameter. This allows customers to explicitly specify the scope used when generating a unique default hostname during resource creation. Examples: az webapp create --domain-name-scope {NoReuse, ResourceGroupReuse, SubscriptionReuse, TenantReuse} az functionapp create --domain-name-scope {NoReuse, ResourceGroupReuse, SubscriptionReuse, TenantReuse} Including this parameter ensures that deployments consistently use the intended hostname scope and helps teams prepare their automation and provisioning workflows for the secure unique default hostname model. Why Customers Should Adopt This Now While existing resources will continue to function normally, customers are strongly encouraged to adopt Secure Unique Default Hostnames for all new deployments. Early adoption provides several important benefits: A significantly stronger security posture. Protection against dangling DNS and subdomain takeover scenarios. Consistent default hostname behavior as the platform evolves. Alignment with recommended deployment practices and modern DNS hygiene. This feature represents the current best practice for hostname management on App Service and adopting it now helps ensure that new deployments follow the most secure and consistent model available. Recommended Next Steps Enable Secure Unique Default Hostnames for all new Web Apps, Function Apps, and Logic Apps. Update automation and CLI scripts to include the --domain-name-scope parameter when creating new resources. Begin updating internal documentation and operational processes to reflect the new hostname pattern. Additional Resources For readers who want to explore the technical background and earlier announcements in more detail, the following articles offer deeper coverage of unique default hostnames: Public Preview: Creating Web App with a Unique Default Hostname This article introduces the foundational concepts behind unique default hostnames. It explains why the feature was created, how the hostname format works, and provides examples and guidance for enabling the feature when creating new resources. Secure Unique Default Hostnames: GA on App Service Web Apps and Public Preview on Functions This article provides the initial GA announcement for Web Apps and early preview details for Functions. It covers the security benefits, recommended usage patterns, and guidance on how to handle existing resources that were created without unique default hostnames. Conclusion Secure Unique Default Hostnames now provide a more secure and consistent default hostname model across Web Apps, Function Apps, and Logic Apps. This enhancement reduces DNS‑related risks and strengthens application security, and organizations are encouraged to adopt this feature as the standard for new deployments.38Views0likes0CommentsContext Engineering Lessons from Building Azure SRE Agent
We started with 100+ tools and 50+ specialized agents. We ended with 5 core tools and a handful of generalists. The agent got more reliable, not less. Every context decision is a tradeoff: latency vs autonomy, evidence-building vs speed, oversight - and the cost of being wrong. This post is a practical map of those knobs and how we adjusted them for SRE Agent.3.6KViews17likes2CommentsRemoving Teams from a Specific User Profile via Script in Intune
Hi, Working on a process with using PowerShell to remove Microsoft Teams from a specific user that is not the primary user of that computer on multiple Windows 11 devices using Microsoft Intune. However, I need a script to make this happen. Can I get some help from the community on this? Has anyone else seen this before? Thanks, ZC4614Views0likes0CommentsGuarding Kubernetes Deployments: Runtime Gating for Vulnerable Images Now Generally Available
Cloud-native development has made containerization vital, but it has also brought about new risks. In dynamic Kubernetes environments, a single vulnerable container image can open the door to an attack. Organizations need proactive controls to prevent unsafe workloads from running. Although security professionals recognize these risks, traditional security checks typically occur after deployment, relying on scans and alerts that only identify issues once workloads are already running, leaving teams scrambling to respond. Kubernetes runtime gating within Microsoft Defender for Cloud effectively addresses these challenges. Now generally available, gated deployment for Kubernetes container images introduces a proactive, automated checkpoint at the moment of deployment. Getting Started: Setting Up Kubernetes Gated Deployment The process starts with enabling the required components for gated deployment. When Security Gating is enabled, the defender admission controller pod is deployed to the Kubernetes cluster. Organizations can create rules for gated deployment which will define the criteria that container images must meet to be permitted to the cluster. With the admission controller and policies in place, the system is ready to evaluate deployment requests against the defined rules. How Kubernetes Gated Deployment Works Vulnerability Scanning Defender for Cloud performs agentless vulnerability scanning on container images stored in the registry. Scan results are saved as security artifacts in the registry, detailing each image’s vulnerabilities. Security artifacts are signed with Microsoft signature to verify authenticity. Deployment Evaluation During deployment, the admission controller reads both the stored security policies and vulnerability assessment artifacts. Each container image is evaluated against the organization’s defined policies. Enforcement Modes Audit Mode: Deployments are allowed, but any policy violations are logged for review. This helps teams refine policies without disrupting workflows. Deny Mode: Non-compliant images are blocked from deployment, ensuring only secure containers reach production. Practical Guidance: Using Gating to Advance DevSecOps Leveraging gated deployment requires thoughtful coordination between several teams, with security professionals working closely alongside platform, DevOps, and application teams to define policies, enforce risk thresholds, and ensure compliance throughout the deployment process. To maximize the effectiveness of gated deployment, organizations should take a strategic approach to policy enforcement. Work with platform teams to define risk thresholds and deploy in audit mode during rollout - then move to deny mode when ready. Continuously tune policies based on audit logs and incident findings to adapt to new threats and business requirements. Educate DevOps and application teams on policy requirements and violation remediation, fostering a culture of shared responsibility. Consider best practices for rule design. Use Cases and Real-World Examples Gated deployment is designed to meet the diverse needs of modern enterprises. Here are several use cases that illustrate its' effectiveness in protecting workloads and streamlining cloud operations: Ensuring Compliance in Regulated Industries: Organizations in sectors like finance, healthcare, and government often have strict compliance mandates (e.g. no use of software with known critical vulnerabilities). Gated deployment provides an automated way to enforce these mandates. For example, a bank can define rules to block any container image that has a critical vulnerability or that lacks the required security scan metadata. The admission controller will automatically prevent non-compliant deployments, ensuring the production environment is continuously compliant with the bank’s security policy. This not only reduces the risk of costly security incidents but also creates an audit trail of compliance – every blocked deployment is logged, which can be shown to auditors as proof that proactive controls are in place. In short, gated deployment helps organizations maintain compliance as they deploy cloud-native applications. Reducing Risk in Multi-Team DevOps Environments: In large enterprises with multiple development teams pushing code to shared Kubernetes clusters, it can be challenging to enforce consistent security standards. Gated deployment acts as a safety net across all teams. Imagine a scenario with dozens of microservices and dev teams: even if one team attempts to deploy an outdated base image with known vulnerabilities, the gating feature will catch it. This is especially useful in multi-cloud setups – e.g., your company runs some workloads on Azure Kubernetes Service (AKS) and others on Elastic Kubernetes Service (EKS). With gated deployment in Defender for Cloud, you can apply the same security rules to both, and the system will uniformly block non-compliant images on Azure or Amazon Web Services (AWS) clusters alike. This consistency simplifies governance. It also fosters a DevSecOps culture: developers get immediate feedback if their deployment is flagged, which raises awareness of security requirements. Over time, teams learn to integrate security earlier (shifting left) to avoid tripping the gate. Yet, because you can start in audit mode, there is an educational grace period – developers see warnings in logs about policy violations before those violations cause deployment failures. This leads to collaborative remediation rather than abrupt disruption. Protecting Against Known Threats in Production: Zero-day vulnerabilities in popular containers (like database images or open-source services) are regularly discovered. Organizations often scramble to patch or update once a new CVE is announced. Gated deployment can serve as an automatic shield against known issues. For instance, if a critical CVE in Nginx is published, any container image still carrying that vulnerability would be denied at deployment until it is patched. If an attacker attempts to deploy a backdoored container image in your environment, the admission rules can stop it if it does not meet the security criteria. In this way, gating provides a form of runtime admission control that complements runtime threat detection: rather than detecting malicious activity after a container is running, it tries to prevent potentially unsafe containers from ever running at all. Streamlining Cloud Deployment Workflows with Security Built-In: Enterprises embracing cloud-native development want to move fast but safely. Gated deployment lets security teams define guardrails, and then developers can operate within those guardrails without constant oversight. For example, a company can set a policy “all images must be scanned and free of critical vulnerabilities before deployment.” Once that rule is in place, developers simply get an error if they try to deploy something out-of-bounds – they know to go back and fix it and then redeploy. This removes the need for manual ticketing or approvals for each deployment; the system itself enforces the policy. That increases operational efficiency and ensures a consistent baseline of security across all services. Gated deployment operationalizes the concept of “secure by default” for Kubernetes workloads: every deployment is vetted, with no extra steps required by end-users beyond what they normally do. oyment. Part of a Broader Security Strategy Kubernetes gated deployment is a key piece of Microsoft’s larger vision for container security and secure supply chain at large. While runtime gating is a powerful tool on its own, its' value multiplies when seen as part of Microsoft Defender for Cloud’s holistic container security offering. It complements and enhances the other security layers that are available for containerized applications, covering the full lifecycle of container workloads from development to runtime. Let’s put gated deployment in context of this broader story: During development and build phases, Defender for Cloud offers tools like CI/CD pipeline scanning (for example, a CLI that scans images during the build process). Agentless discovery, inventory and continuous monitoring of cloud resources to detect misconfigurations, contextual risk assessment, enhanced risk hunting and more. Continuous agentless vulnerability scanning takes place at both the registry and runtime level. Runtime Gating prevents those known issues from ever running and logs all non-compliant attempts at deployment. Threat Detection surfaces anomalies or malicious activities by monitoring Kubernetes audit logs and live workloads. Using integration with Defender XDR, organizations can further investigate these threats or implement response actions. Conclusion: Raising the Bar for Multi-Cloud Container Security With Kubernetes Gating now generally available in Defender for Cloud, technical leaders and security teams can audit or block vulnerable containers across any cloud platform. Integrating automated controls and best practices improves compliance and reduces risk within cloud-native environments. This strengthens Kubernetes clusters by preventing unsafe deployments, ensuring ongoing compliance, and supporting innovation without sacrificing security. Runtime gating helps teams balance rapid delivery with robust protection. Additional Resources to Learn More: Release Notes Overview of Gated Deployment Enable Gated Deployment Troubleshooting FAQ Test Gated Deployment in Your Own Environment Reviewers: Maya Herskovic, Principal Product Manager Dolev Tsuberi, Senior Software EngineerTeams, SharePoint, Viva Engage - which to use for dept comms?
I lead a 300‑person department, and I’m looking for guidance on the best Microsoft tool(s) to improve employee engagement and awareness across our organization. Right now, most of our communication happens in Microsoft Teams (channel posts, group chats), but we have dozens of channels and each group has its own space, which makes it hard to share department‑wide news, cascade annual strategic initiatives, report monthly progress, and celebrate wins or employee recognition. We don’t currently have any SharePoint sites built out, and we’re not using Viva Engage or communities. For those who’ve tackled similar challenges: Which Microsoft tools or combinations have worked best for broad communication, engagement, and consistent messaging across a large department? I’d love to hear what’s been effective for you. Thanks in advance for any insights!62Views0likes3CommentsA CISO's Guide to Securing AI - Securing AI for Federal, DIB, and DoW Entities
Artificial Intelligence (AI) is rapidly reshaping federal missions, defense operations, and critical infrastructure. From intelligence analysis to logistics and cyber defense, AI’s transformative power is undeniable. Yet, with great power comes great responsibility and risk.564Views0likes0CommentsFind the Alerts You Didn't Know You Were Missing with Azure SRE Agent
I had 6 alert rules. CPU. Memory. Pod restarts. Container errors. OOMKilled. Job failures. I thought I was covered. Then my app went down. I kept refreshing the Azure portal, waiting for an alert. Nothing. That's when it hit me: my alerts were working perfectly. They just weren't designed for this failure mode. Sound familiar? The Problem Every Developer Knows If you're a developer or DevOps engineer, you've been here: a customer reports an issue, you scramble to check your monitoring, and then you realize you don't have the right alerts set up. By the time you find out, it's already too late. You set up what seems like reasonable alerting and assume you're covered. But real-world failures are sneaky. They slip through the cracks of your carefully planned thresholds. My Setup: AKS with Redis I love to vibe code apps using GitHub Copilot Agent mode with Claude Opus 4.5. It's fast, it understands context, and it lets me focus on building rather than boilerplate. For this project, I built a simple journal entry app: AKS cluster hosting the web API Azure Cache for Redis storing journal data Azure Monitor alerts for CPU, memory, pod restarts, container errors, OOMKilled, and job failures Seemed solid. What could go wrong? The Scenario: Redis Password Rotation Here's something that happens constantly in enterprise environments: the security team rotates passwords. It's best practice. It's in the compliance checklist. And it breaks things when apps don't pick up the new credentials. I simulated exactly this. The pods came back up. But they couldn't connect to Redis (as expected). The readiness probes started failing. The LoadBalancer had no healthy backends. The endpoint timed out. And not a single alert fired. Using SRE Agent to Find the Alert Gaps Instead of manually auditing every alert rule and trying to figure out what I missed, I turned to Azure SRE Agent. I asked it a simple question: "My endpoint is timing out. What alerts do I have, and why didn't any of them fire?" Within minutes, it had diagnosed the problem. Here's what it found: My Existing Alerts Why They Didn't Fire High CPU/Memory No resource pressure,just auth failures Pod Restarts Pods weren't restarting, just unhealthy Container Errors App logs weren't being written OOMKilled No memory issues Job Failures No K8s jobs involved The gaps SRE Agent identified: ❌ No synthetic URL availability test ❌ No readiness/liveness probe failure alerts ❌ No "pods not ready" alerts scoped to my namespace ❌ No Redis connection error detection ❌ No ingress 5xx/timeout spike alerts ❌ No per-pod resource alerts (only node-level) SRE Agent didn't just tell me what was wrong, it created a GitHub issue with : KQL queries to detect each failure type Bicep code snippets for new alert rules Remediation suggestions for the app code Exact file paths in my repo to update Check it out: GitHub Issue How I Built It: Step by Step Let me walk you through exactly how I set this up inside SRE Agent. Step 1: Create an SRE Agent I created a new SRE Agent in the Azure portal. Since this workflow analyzes alerts across my subscription (not just one resource group), I didn't configure any specific resource groups. Instead, I gave the agent's managed identity Reader permissions on my entire subscription. This lets it discover resources, list alert rules, and query Log Analytics across all my resource groups. Step 2: Connect GitHub to SRE Agent via MCP I added a GitHub MCP server to give the agent access to my source code repository.MCP (Model Context Protocol) lets you bring any API into the agent. If your tool has an API, you can connect it. I use GitHub for both source code and tracking dev tickets, but you can connect to wherever your code lives (GitLab, Azure DevOps) or your ticketing system (Jira, ServiceNow, PagerDuty). Step 3: Create a Subagent inside SRE Agent for managing Azure Monitor Alerts I created a focused subagent with a specific job and only the tools it needs: Azure Monitor Alerts Expert Prompt: " You are expert in managing operations related to azure monitor alerts on azure resources including discovering alert rules configured on azure resources, creating new alert rules (with user approval and authorization only), processing the alerts fired on azure resources and identifying gaps in the alert rules. You can get the resource details from azure monitor alert if triggered via alert. If not, you need to ask user for the specific resource to perform analysis on. You can use az cli tool to diagnose logs, check the app health metrics. You must use the app code and infra code (bicep files) files you have access to in the github repo <insert your repo> to further understand the possible diagnoses and suggest remediations. Once analysis is done, you must create a github issue with details of analysis and suggested remediation to the source code files in the same repo." Tools enabled: az cli – List resources, alert rules, action groups Log Analytics workspace querying – Run KQL queries for diagnostics GitHub MCP – Search repositories, read file contents, create issues Step 4: Ask the Subagent About Alert Gaps I gave the agent context and asked a simple question: "@AzureAlertExpert: My API endpoint http://132.196.167.102/api/journals/john is timing out. What alerts do I have configured in rg-aks-journal, and why didn't any of them fire? The agent did the analysis autonomously and summarized findings with suggestions to add new alert rules in a GitHub issue. Here's the agentic workflow to perform azure monitor alert operations Why This Matters Faster response times. Issues get diagnosed in minutes, not hours of manual investigation. Consistent analysis. No more "I thought we had an alert for that" moments. The agent systematically checks what's covered and what's not. Proactive coverage. You don't have to wait for an incident to find gaps. Ask the agent to review your alerts before something breaks. The Bottom Line Your alerts have gaps. You just don't know it until something slips through. I had 6 alert rules and still missed a basic failure. My pods weren't restarting, they were just unhealthy. My CPU wasn't spiking, the app was just returning errors. None of my alerts were designed for this. You don't need to audit every alert rule manually. Give SRE Agent your environment, describe the failure, and let it tell you what's missing. Stop discovering alert gaps from customer complaints. Start finding them before they matter. A Few Tips Give the agent Reader access at subscription level so it can discover all resources Use a focused subagent prompt, don't try to do everything in one agent Test your MCP connections before running workflows What Alert Gaps Have Burned You? What's the alert you wish you had set up before an incident? Credential rotation? Certificate expiry? DNS failures? Let us know in the comments.166Views0likes0CommentsData Security: Azure key Vault in Data bricks
Why this article? To remove the vulnerability of exposing the data base connection string in Databricks notebook directly, by using Azure key vault. Database connection strings are extremely confidential/vulnerable data, that we should not be exposed in the DataBricks notebook explicitly. Azure key vault is a secure option to read the secrets and establish connection. What do we need? Tenant Id of the app from the app registration with access to the azure key vault secrets Client Id of the of the app from the app registration with access to the azure key vault secrets Client secret of the app from the app registration with access to the azure key vault Where to find this information? Under the App registration, you can find the (application) Client Id, Directory (tenant) Id. Client secret value is found in the app registration of the service, under Manage -> Certificate & secrets. You can use an existing secret or create a new one and use it to access the key Vault secrets. Make sure the application is added with get access to read the secrets. Verify the key vault you are checking and using in Databricks is the same one with read access. You can verify this by going to the Azure key vault -> Access Policies and search for the application name. It should show up on search as below, this will confirm that the access of the application. What do we need to setup in Databricks notebook? Open your cluster and install azure.keyvault and azure-identity (installing version should be compatible with you cluster configuration, refer: https://docs.databricks.com/aws/en/libraries/package-repositories) In a new notebook, let’s start by importing the necessary modules. Your notebook would start with the modules, followed by tentatId, clientId, client secret, azure key vault URL , secretName of the connection string in the azure key vault and secretVersion. Lastly, we need to fetch the secret using the below code Vola, we have the DB connection string to perform the CRUD operations. Conclusion: By securely retrieving your database connection string from Azure Key Vault, you eliminate credential exposure and strengthen the overall security posture of your Databricks workflows. This simple shift ensures your notebooks remain clean, compliant, and production‑ready.