Microsoft Sentinel data lake is now generally available
Security is being reengineered for the AI era, shifting from static controls to fast, platform-driven defense. Traditional tools, scattered data, and outdated systems struggle against modern threats. An AI-ready, data-first foundation is needed to unify telemetry, standardize agent access, and enable autonomous responses while ensuring humans are in command of strategy and high-impact investigations. Security teams already anchor their operations around SIEMs for comprehensive visibility. We're building on that foundation by evolving Microsoft Sentinel into both the SIEM and the platform for agentic defense—connecting analytics and context across ecosystems. Today, we're introducing new platform capabilities that build on Sentinel data lake: Sentinel graph for deeper insight and context; an MCP server and tools to make data agent-ready; new developer capabilities; and a Security Store for effortless discovery and deployment—so protection accelerates to machine speed while analysts do their best work.

We've reached a major milestone in our journey to modernize security operations: Microsoft Sentinel data lake is now generally available. This fully managed, cloud-native data lake redefines how security teams manage, analyze, and act on their data cost-effectively. Since its introduction, organizations across sectors have embraced Sentinel data lake for its transformative impact on security operations. Customers consistently highlight its ability to unify security data from diverse sources, enabling enhanced threat detection and investigation. Many cite cost efficiency as a key benefit, with tiered storage and flexible retention helping reduce costs. With petabytes of data already ingested, users are gaining real-time and historical insights at scale.

"With Microsoft Sentinel data lake integration, we now have a scalable and cost-efficient solution for retaining Microsoft Sentinel data for long-term retention. This empowers our security and compliance teams with seamless access to historical telemetry data right within the data lake explorer and Jupyter notebooks—enabling advanced threat hunting, forensic analysis, and AI-powered insights at scale." — Farhan Nadeem, Senior Security Engineer, Government of Nunavut

Industry partners also commend its role in modernizing SOC workflows and accelerating AI-driven analytics.

"Microsoft Sentinel data lake amplifies BlueVoyant's ability to transform security operations into a mature, intelligence-driven discipline. It preserves institutional memory across years of telemetry, which empowers advanced threat hunting strategies that evolve with time. Security teams can validate which data sources yield actionable insights, uncover persistent attack patterns, and retain high-value indicators that support long-term strategic advantage." — Milan Patel, CRO, BlueVoyant

Microsoft Sentinel data lake use cases

There are many powerful ways customers are unlocking value with Sentinel data lake—here are just a few impactful examples:

Threat investigations over extended timelines: Security analysts query data older than 90 days to uncover slow-moving attacks—like brute-force and password spray campaigns—that span accounts and geographies.

Behavioral baselining for deeper insights: SOC engineers build time-series models using months of sign-in logs to establish a baseline of normal behavior and identify unusual patterns, such as credential abuse or lateral movement.
Alert enrichment: SOC teams correlate alerts with firewall and NetFlow data, often stored only in the data lake, reducing false positives and increasing alert accuracy.

Retrospective threat hunting with new indicators of compromise (IOCs): Threat intelligence teams react to emerging IOCs by running historical queries across the data lake, enabling rapid and informed response.

ML-powered insights: SOC engineers use Spark notebooks to build and operationalize custom machine learning models for anomaly detection, alert enrichment, and predictive analytics.

The Sentinel data lake is more than a storage solution—it's the foundation for modern, AI-powered security operations. Whether you're scaling your SOC, building deeper analytics, or preparing for future threats, the Sentinel data lake is ready to support your journey.
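To make the baselining and ML use cases concrete, here is a minimal PySpark sketch of the kind of logic a Sentinel data lake notebook might run. It assumes a DataFrame named `signins` has already been loaded from historical Entra sign-in logs (the exact data-access call depends on your environment) and that column names follow the SigninLogs schema; treat it as an illustration rather than a turnkey detection.

```python
# Minimal sketch: flag users whose daily sign-in failures deviate sharply
# from their own historical baseline. Assumes `signins` is a pre-loaded
# DataFrame of historical sign-in events; loading it from the data lake
# tier is environment-specific and not shown here.
from pyspark.sql import functions as F

# Failed sign-ins per user per day over the retention window.
daily = (
    signins
    .where(F.col("ResultType") != "0")  # "0" indicates success in SigninLogs
    .groupBy("UserPrincipalName", F.to_date("TimeGenerated").alias("Day"))
    .agg(F.count("*").alias("Failures"))
)

# Per-user baseline: mean and standard deviation of daily failure counts.
baseline = daily.groupBy("UserPrincipalName").agg(
    F.mean("Failures").alias("MeanFailures"),
    F.stddev("Failures").alias("StdFailures"),
)

# Surface days more than three standard deviations above the baseline.
anomalies = (
    daily.join(baseline, "UserPrincipalName")
    .where(F.col("Failures") >
           F.col("MeanFailures") + 3 * F.coalesce(F.col("StdFailures"), F.lit(0.0)))
)
anomalies.orderBy(F.desc("Failures")).show(truncate=False)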
What's new

Regional expansion

In light of strong customer demand during public preview, at GA we are expanding Sentinel data lake availability to additional regions. These new regions will roll out progressively over the coming weeks. For more information, see the documentation.

Flexible data ingestion and management

With over 350 native connectors, SOC teams can seamlessly ingest both structured and semi-structured data at scale. Data is automatically mirrored from the analytics tier to the data lake tier at no additional cost, ensuring a single, unified copy is available for diverse use cases across security operations. Since the public preview of Microsoft Sentinel data lake, we've launched 45 new connectors built on the scalable and performant Codeless Connector Framework (CCF), including connectors for:

GCP: SQL, DNS, VPC Flow, Resource Manager, IAM, Apigee
AWS: Security Hub findings, Route 53 DNS
Others: Alibaba Cloud, Oracle, Salesforce, Snowflake, Cisco

Sentinel's connector ecosystem is designed to help security teams seamlessly unify signals across hybrid environments without heavy engineering effort. Explore the full list of connectors in our documentation here.

App Assure Microsoft Sentinel data lake promise

As part of our commitment to customer success, we are expanding the App Assure Microsoft Sentinel promise to Sentinel data lake. This means customers can confidently onboard their data, knowing that App Assure stands ready to help resolve connector issues, such as replacing deprecated APIs with updated ones, and to accelerate new integrations. Whether you're working with existing Independent Software Vendor (ISV) solutions or building new ones, App Assure will collaborate directly with ISVs to ensure seamless data ingestion into the lake. This promise reinforces our dedication to delivering reliable, scalable, and secure security operations, backed by engineering support and a thriving partner ecosystem.

Cost management and billing

We are introducing new cost management features in public preview to help customers with cost predictability, billing transparency, and operational efficiency. Customers can set usage-based alerts on specific meters to monitor and control costs. For example, you can receive alerts when query or notebook usage passes set limits, helping avoid unexpected expenses and manage budgets. In-product reports provide insights into usage trends over time, enabling customers to identify cost drivers and optimize data retention and processing strategies.

To support the ingestion and standardization of diverse data sources, we are introducing a new Data Processing feature that applies a $0.10 per GB charge for all data as it is ingested into the data lake. This feature enables a broad array of transformations, such as redacting, splitting, filtering, and normalizing data. It was not billed during public preview but becomes chargeable at GA starting October 1, 2025. Starting October 1, 2025, data lake ingestion charges of $0.05 per GB will also apply to Entra asset data, which was likewise not billed during public preview.

Retaining security data to perform deep analytics and investigations is crucial for defending against threats. To help customers retain all their security data for extended periods cost-effectively, data lake storage, including asset data storage, is now billed with a simple and uniform data compression rate of 6:1 across all data sources. Please refer to the Plan costs and understand Microsoft Sentinel pricing and billing article for more information. For detailed prerequisites and instructions on configuring and managing asset connectors, refer to the official documentation: Asset data in Microsoft Sentinel data lake.

KQL and Notebook enhancements

We are introducing several enhancements to our data lake analytics capabilities with an upgraded KQL and notebook experience. Security teams can now run multi-workspace KQL queries for broader threat correlation and schedule KQL jobs more frequently. Frequent KQL jobs enable SOC teams to automate historical threat intelligence matching, summarize alert trends, and aggregate signals across workspaces. For example, schedule recurring jobs to scan for matches against newly ingested IOCs, helping uncover threats that were previously undetected and strengthening threat hunting and investigation workflows (see the sketch after this section).

The enhanced Jobs page offers operational clarity for SOC teams with a comprehensive view of job health and activity. At the top, a summary dashboard provides instant visibility into key metrics (total jobs, completions, and failures), helping teams quickly assess job health. A filterable list view displays essential details such as job names, status, frequency, and last-run information, enabling quick prioritization and triage. For deeper diagnostics, users can open individual jobs to access run telemetry such as duration, row count, and historical execution trends.

Notebooks are receiving a significant upgrade, offering a streamlined experience for querying the data lake. Users now benefit from IntelliSense support for syntax and table names, making query authoring faster and more intuitive. They can also configure custom compute session timeouts and warning windows to better manage resources. Scheduling notebooks as jobs is now simpler, and users can leverage GitHub Copilot for intelligent assistance throughout the process. Together, these KQL and notebook improvements deliver deeper, more customizable analytics, helping customers unlock richer insights, accelerate threat response, and scale securely across diverse environments.
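As a rough illustration of the retrospective IOC-matching pattern described above, the following notebook-style PySpark sketch joins a freshly ingested indicator list against months of historical network events. The table and column names (`network_logs` with a RemoteIP column, `iocs` with an Indicator column) are illustrative assumptions, and how a scheduled job persists its output depends on your configuration.

```python
# Minimal sketch: retro-hunt newly ingested IOCs across historical telemetry.
# `network_logs` and `iocs` are assumed pre-loaded DataFrames; names and
# schemas are illustrative, not actual Sentinel table definitions.
from pyspark.sql import functions as F

hits = (
    network_logs
    .join(iocs, network_logs["RemoteIP"] == iocs["Indicator"], "inner")
    .groupBy("RemoteIP")
    .agg(
        F.min("TimeGenerated").alias("FirstSeen"),
        F.max("TimeGenerated").alias("LastSeen"),
        F.count("*").alias("HitCount"),
    )
)

# Persist matches so analysts or downstream jobs can triage them later;
# replace with whatever output mechanism your job is configured to use.
hits.write.mode("overwrite").saveAsTable("ioc_retro_matches")
```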
Powering agentic defense

Data centralization powers AI agents and automation to access comprehensive, historical, and real-time data for advanced analytics, anomaly detection, and autonomous threat response. Support for tools like KQL queries, Spark notebooks, and machine learning models in the data lake allows agentic systems to continuously learn, adapt, and act on emerging threats. Integration with Security Copilot and the MCP server further enhances agentic defense, enabling smarter, faster, and context-rich security operations—all built on the foundation of Sentinel's unified data lake.

Microsoft Sentinel 50 GB commitment tier promotional pricing

To make Microsoft Sentinel more accessible to small and mid-sized customers, we are introducing a new 50 GB commitment tier in public preview, with promotional pricing offered from October 1, 2025, to March 31, 2026. Customers who choose the 50 GB commitment tier during this period will maintain their promotional rate until March 31, 2027. This offer is available in all regions where Microsoft Sentinel is sold, with regional variations in promotional pricing, and is accessible through EA, CSP, and Direct channels. The new 50 GB commitment tier details will be available starting October 1, 2025, on the Microsoft Sentinel pricing page.

Thank you to our customers and partners

We're incredibly grateful for the continued partnership and collaboration from our customers and partners throughout this journey. Your feedback and trust have been instrumental in shaping Microsoft Sentinel data lake into what it is today. Thank you for being a part of this critical milestone—we're excited to keep building together.

Get started today

By centralizing data, optimizing costs, expanding coverage, and enabling deep analytics, Microsoft Sentinel empowers security teams to operate smarter, faster, and more effectively. Get started with Microsoft Sentinel data lake today in the Microsoft Defender experience. To learn more, see:

Microsoft Sentinel—AI-Powered Cloud SIEM & Platform
Pricing: Pricing page, Plan costs and understand Microsoft Sentinel pricing and billing
Documentation: Connect Sentinel to Defender, Jupyter notebooks in Microsoft Sentinel data lake, KQL and the Microsoft Sentinel data lake, Permissions for Microsoft Sentinel data lake, Manage data tiers and retention in Microsoft Defender experience
Blogs: Sentinel data lake FAQ blog, Empowering defenders in the era of AI, Microsoft Sentinel graph announcement, App Assure Microsoft Sentinel data lake promise

Introducing developer solutions for Microsoft Sentinel platform
Security is being reengineered for the AI era, moving beyond static, rule-bound controls and after-the-fact response toward platform-led, machine-speed defense. The challenge is clear: fragmented tools, sprawling signals, and legacy architectures that can't match the velocity and scale of modern attacks. What's needed is an AI-ready, data-first foundation, one that turns telemetry into a security graph, standardizes access for agents, and coordinates autonomous actions while keeping humans in command of strategy and high-impact investigations. Security teams already center operations on their SIEM for end-to-end visibility, and we're advancing that foundation by evolving Microsoft Sentinel into both the SIEM and the platform for agentic defense—connecting analytics and context across ecosystems. And today, we're introducing new platform capabilities that build on Sentinel data lake: Sentinel graph for deeper insight and context; the Sentinel MCP server and tools to make data agent-ready; new developer capabilities; and Security Store for effortless discovery and deployment—so protection accelerates to machine speed while analysts do their best work.

Today, customers use a breadth of solutions to keep themselves secure. Each solution typically ingests, processes, and stores the security data it needs, which means applications maintain identical copies of the same underlying data. This is painful for both customers and partners, who don't want to build and maintain duplicate infrastructure and create data silos that make it difficult to counter sophisticated attacks. With today's announcement, we're directly addressing those challenges by giving partners the ability to create solutions that reason over the single copy of the security data each customer has in their Sentinel data lake instance. Partners can create AI solutions that use Sentinel and Security Copilot and distribute them in Microsoft Security Store to reach new audiences, grow their revenue, and keep their customers safe.

Sentinel already has a rich partner ecosystem with hundreds of SIEM solutions that include connectors, playbooks, and other content types. These new platform capabilities extend those solutions, creating opportunities for partners to address new scenarios and bring solutions to market quickly, since they don't need to build complex data pipelines or store and process new data sets in their own infrastructure. For example, partners can use Sentinel connectors to bring their own data into the Sentinel data lake. They can create Jupyter notebook jobs in the updated Sentinel Visual Studio Code extension to analyze that data, or take advantage of the new Model Context Protocol (MCP) server, which makes the data understandable and accessible to AI agents in Security Copilot. With Security Copilot's new vibe-coding capabilities, partners can create their agent in the same Sentinel Visual Studio Code extension or the environment of their choice. The solution can then be packaged and published to the new Microsoft Security Store, which gives partners an opportunity to expand their audience and grow their revenue while protecting more customers across the ecosystem.

These capabilities are being embraced across our ecosystem by mature and emerging partners alike. Services partners such as Accenture and ISVs such as Zscaler and ServiceNow are already creating solutions that leverage the capabilities of the Sentinel platform.
Partners have already brought several solutions to market using the integrated capabilities of the Sentinel platform:

Illumio. Illumio for Microsoft Sentinel combines Illumio Insights with Microsoft Sentinel data lake and Security Copilot to revolutionize detection and response to cyber threats. It fuses data from Illumio and all the other sources feeding into Sentinel to deliver a unified view of threats, giving SOC analysts, incident responders, and threat hunters visibility and AI-driven breach containment capabilities for lateral traffic threats and attack paths across hybrid and multi-cloud environments. To learn more, visit Illumio for Microsoft Sentinel.

OneTrust. OneTrust's AI-ready governance platform enables 14,000 customers globally, including over half of the Fortune 500, to accelerate innovation while ensuring responsible data use. Privacy and risk teams know that undiscovered personal data in their digital estate puts their business and customers at risk. OneTrust's Privacy Risk Agent uses Security Copilot, Purview scan logs, Entra ID data, and Jupyter notebook jobs in the Sentinel data lake to automatically discover personal data, assess risk, and take mitigating actions. To learn more, visit here.

Tanium. The Tanium Security Triage Agent accelerates alert triage using real-time endpoint intelligence from Tanium. Tanium intends to expand its agent to ingest contextual identity data from Microsoft Entra using Sentinel data lake. Discover how Tanium's integrations empower IT and security teams to make faster, more informed decisions.

Simbian. Simbian's Threat Hunt Agent makes hunters more effective by automating the validation of threat hunt hypotheses with AI. Threat hunters provide a hypothesis in natural language, and the agent queries and analyzes the full breadth of data available in Sentinel data lake to validate the hypothesis and perform deep investigation. Simbian's AI SOC Agent investigates and responds to security alerts from Sentinel, Defender, and other alert sources, and also uses Sentinel data lake to enhance the depth of investigations. Learn more here.

Lumen. Lumen's Defender℠ Threat Feed for Microsoft Sentinel helps customers correlate known-bad artifacts with activity in their environment. Lumen's Black Lotus Labs® harnesses unmatched network visibility and machine intelligence to produce high-confidence indicators that can be operationalized at scale for detection and investigation. Currently, Lumen's Defender℠ Threat Feed for Microsoft Sentinel is available as an invite-only preview. To request an invite, reach out to the Lumen Defender Threat Feed Sales team.

The updated Sentinel Visual Studio Code extension

The Sentinel extension for Visual Studio Code brings new AI and packaging capabilities on top of existing Jupyter notebook jobs to help developers efficiently create new solutions.

Building with AI

Impactful AI security solutions need access to, and an understanding of, the security data relevant to a scenario. The new Microsoft Sentinel Model Context Protocol (MCP) server makes data in Sentinel data lake discoverable and understandable to AI agents so they can reason over it to generate powerful new insights. It integrates with the Sentinel VS Code extension, so developers can use those tools to explore the data in the lake and have agents use the same tools as they do their work. To learn more, read the Microsoft Sentinel MCP server announcement. Microsoft is also releasing MCP tools to make creating AI agents more straightforward.

Developers can use Security Copilot's MCP tools to create agents within either the Sentinel VS Code extension or the environment of their choice. They can also take advantage of the low-code agent authoring experience right in the Security Copilot portal. To learn more about the Security Copilot pro-code and low-code agent authoring experiences, visit the Security Copilot blog post on building your own Security Copilot agents.

Jupyter notebook jobs

Jupyter notebook jobs are an important part of the Sentinel data lake and were launched at our public preview a couple of months ago. See the documentation here for more details on Jupyter notebook jobs and how they can be used in a solution. Note that when jobs write to the data lake, agents can use the Sentinel MCP tools to read and act on those results in the same way they're able to read any data in the data lake, as the sketch below illustrates.
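As a sketch of that write-then-read pattern, a notebook job might aggregate alert telemetry and persist the summary as a table that MCP-enabled agents can later query. The `alerts` DataFrame, column names, and output table here are assumptions for illustration; the actual output mechanism depends on how the job is configured.

```python
# Minimal sketch: a notebook job that writes derived results back to the
# lake so agents can read and act on them via the Sentinel MCP tools.
# `alerts` is an assumed pre-loaded DataFrame of alert telemetry.
from pyspark.sql import functions as F

summary = (
    alerts
    .groupBy("AlertName", "Severity")
    .agg(
        F.count("*").alias("AlertCount"),
        F.max("TimeGenerated").alias("LastSeen"),
    )
)

# Persist the aggregate; replace with your job's configured output target.
summary.write.mode("overwrite").saveAsTable("alert_trend_summary")
```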
Packaging and Publishing

Developers can now package solutions containing notebook jobs and Copilot agents so they can be distributed through the new Microsoft Security Store. With just a few clicks in the Sentinel VS Code extension, a developer can create a package which they can then upload to Security Store.

Distribution and revenue opportunities with Security Store

Sentinel platform solutions can be packaged and offered through the new Microsoft Security Store, which gives partners new ways to grow revenue and reach customers. Learn more about the ways Microsoft Security Store can help developers reach customers and grow revenue by visiting securitystore.microsoft.com.

Getting started

Developers can get started building powerful applications that bring together Sentinel data, Jupyter notebook jobs, and Security Copilot today:

Become a partner to publish solutions to Microsoft Security Store
Onboarding to Sentinel data lake
Downloading the Sentinel Visual Studio Code extension
Learn about Security Copilot news
Learn about Microsoft Security Store

Introducing Microsoft Security Store
Security is being reengineered for the AI era—moving beyond static, rule-bound controls and after-the-fact response toward platform-led, machine-speed defense. We recognize that defending against modern threats requires the full strength of an ecosystem, combining our unique expertise and shared threat intelligence. But with so many options out there, it's tough for security professionals to cut through the noise, and even tougher to navigate long procurement cycles and stitch together tools and data before seeing meaningful improvements. That's why we built Microsoft Security Store, a storefront designed for security professionals to discover, buy, and deploy security SaaS solutions and AI agents from our ecosystem partners such as Darktrace, Illumio, and BlueVoyant. Security SaaS solutions and AI agents on Security Store integrate with Microsoft Security products, including the Sentinel platform, to enhance end-to-end protection. These integrated solutions and agents collaborate intelligently, sharing insights and leveraging AI to enhance critical security tasks like triage, threat hunting, and access management.

In Security Store, you can:

Buy with confidence – Explore solutions and agents that are validated to integrate with Microsoft Security products, so you know they'll work in your environment. Listings are organized to make it easy for security professionals to find what's relevant to their needs. For example, you can filter solutions based on how they integrate with your existing Microsoft Security products. You can also browse listings by their NIST Cybersecurity Framework functions, covering everything from network security to compliance automation, helping you quickly identify which solutions strengthen the areas that matter most to your security posture.

Simplify purchasing – Buy solutions and agents with your existing Microsoft billing account without any additional payment setup. For Azure benefit-eligible offers, eligible purchases contribute to your cloud consumption commitments. You can also purchase negotiated deals through private offers.

Accelerate time to value – Deploy agents and their dependencies in just a few steps and start getting value from AI in minutes. Partners offer ready-to-use AI agents that can triage alerts at scale, analyze and retrieve investigation insights in real time, and surface posture and detection gaps with actionable recommendations.

A rich ecosystem of solutions and AI agents to elevate security posture

In Security Store, you'll find solutions covering every corner of cybersecurity—threat protection, data security and governance, identity and device management, and more. To give you a flavor of what is available, here are some of the exciting solutions on the store:

Darktrace's ActiveAI Security SaaS solution integrates with Microsoft Security to extend self-learning AI across a customer's entire digital estate, helping detect anomalies and stop novel attacks before they spread. The Darktrace Email Analysis Agent helps SOC teams triage and threat hunt suspicious emails by automating detection of risky attachments, links, and user behaviors using Darktrace Self-Learning AI, integrated with Microsoft Defender and Security Copilot. This unified approach highlights anomalous properties and indicators of compromise, enabling proactive threat hunting and faster, more accurate response.
Illumio for Microsoft Sentinel combines Illumio Insights with Microsoft Sentinel data lake and Security Copilot to enhance detection and response to cyber threats. It fuses data from Illumio and all the other sources feeding into Sentinel to deliver a unified view of threats across millions of workloads. AI-driven breach containment from Illumio gives SOC analysts, incident responders, and threat hunters unified visibility into lateral traffic threats and attack paths across hybrid and multi-cloud environments, helping them reduce alert fatigue, prioritize threat investigation, and instantly isolate workloads.

Netskope's Security Service Edge (SSE) platform integrates with Microsoft 365, Defender, Sentinel, Entra, and Purview for identity-driven, label-aware protection across cloud, web, and private apps. Netskope's inline controls (SWG, CASB, ZTNA) and advanced DLP, combined with Entra signals and Conditional Access, provide real-time, context-rich policies based on user, device, and risk. Telemetry and incidents flow into Defender and Sentinel for automated enrichment and response, ensuring unified visibility, faster investigations, and consistent Zero Trust protection for cloud, data, and AI everywhere.

The PERFORMANTA Email Analysis Agent automates deep investigations into email threats, analyzing metadata (headers, indicators, attachments) against threat intelligence to expose phishing attempts. Complementing this, the IAM Supervisor Agent triages identity risks by scrutinizing user activity for signs of credential theft, privilege misuse, or unusual behavior. These agents deliver unified, evidence-backed reports directly to you, providing instant clarity and slashing incident response time.

Tanium Autonomous Endpoint Management (AEM) pairs real-time endpoint visibility with AI-driven automation to keep IT environments healthy and secure at scale. Tanium is integrated with the Microsoft Security suite, including Microsoft Sentinel, Defender for Endpoint, Entra ID, Intune, and Security Copilot. Tanium streams current-state telemetry into Microsoft's security and AI platforms and lets analysts pivot from investigation to remediation without tool switching; it can even execute remediation actions from the Sentinel console. The Tanium Security Triage Agent accelerates alert triage, enabling security teams to make swift, informed decisions using Tanium Threat Response alerts and real-time endpoint data.

Walkthrough of Microsoft Security Store

Now that you've seen the types of solutions available in Security Store, let's walk through how to find the right one for your organization. You can get started by going to the Microsoft Security Store portal. From there, you can search and browse solutions that integrate with Microsoft Security products, including a dedicated section for AI agents, all in one place. If you are using Microsoft Security Copilot, you can also open the store from within Security Copilot to find AI agents; read more here.

Solutions are grouped by how they align with industry frameworks like NIST CSF 2.0, making it easier to see which areas of security each one supports. You can also filter by integration type (e.g., Defender, Sentinel, Entra, or Purview) and by compliance certifications to narrow results to what fits your environment. To explore a solution, click into its detail page to view descriptions, screenshots, integration details, and pricing.
For AI agents, you'll also see the tasks they perform, the inputs they require, and the outputs they produce, so you know what to expect before you deploy. Every listing goes through a review process that includes partner verification, security scans on code packages stored in a secure registry to protect against malware, and validation that integrations with Microsoft Security products work as intended.

Customers with the right permissions can purchase agents and SaaS solutions directly through Security Store. The process is simple: choose a partner solution or AI agent and complete the purchase in just a few clicks using your existing Microsoft billing account—no new payment setup required. Qualifying SaaS purchases also count toward your Microsoft Azure Consumption Commitment (MACC), helping accelerate budget approvals while adding the security capabilities your organization needs.

Security and IT admins can deploy solutions directly from Security Store in just a few steps through a guided experience. The deployment process automatically provisions the resources each solution needs—such as Security Copilot agents and Microsoft Sentinel data lake notebook jobs—so you don't have to do so manually. Agents are deployed into Security Copilot, which is built with security in mind, providing controls like granular agent permissions and audit trails that give admins visibility and governance. Once deployment is complete, your agent is ready to configure and use, so you can start applying AI to expand detection coverage, respond faster, and improve operational efficiency. Security and IT admins can view and manage all purchased solutions from the "My Solutions" page and easily navigate to Microsoft Cost Management tools to track spending and manage subscriptions.

Partners: grow your business with Microsoft

For security partners, Security Store opens a powerful new channel to reach customers, monetize differentiated solutions, and grow with Microsoft. We will showcase select solutions across relevant Microsoft Security experiences, starting with Security Copilot, so your offerings appear in the right context for the right audience. You can monetize both SaaS solutions and AI agents through built-in commerce capabilities while tapping into Microsoft's go-to-market incentives. For agent builders, it's even simpler: we handle the entire commerce lifecycle, including billing and entitlement, so you don't have to build any infrastructure. You focus on embedding your security expertise into the agent, and we take care of the rest to deliver a seamless purchase experience for customers.

Security Store is built on top of Microsoft Marketplace, which means partners publish their solution or agent through Microsoft Partner Center, the central hub for managing all marketplace offers. From there, create or update your offer with details about how your solution integrates with Microsoft Security so customers can easily discover it in Security Store. Next, upload your deployable package to the Security Store registry, which is encrypted for protection. Then define your license model, terms, and pricing so customers know exactly what to expect. Before your offer goes live, it goes through certification checks that include malware and virus scans, schema validation, and solution validation. These steps help give customers confidence that your solutions meet Microsoft's integration standards.
Get started today

By creating a storefront optimized for security professionals, we are making it simple to find, buy, and deploy solutions and AI agents that work together. Microsoft Security Store helps you put the right AI-powered tools in place so your team can focus on what matters most: defending against attackers with speed and confidence. Get started today by visiting Microsoft Security Store.

If you're a partner looking to grow your business with Microsoft, start by visiting Microsoft Security Store - Partner with Microsoft to become a partner. Partners can list their solution or agent if it has a qualifying integration with Microsoft Security products, such as a Sentinel connector or Security Copilot agent, or another qualifying MISA solution integration. You can learn more about qualifying integrations and the listing process in our documentation here.

Announcing Microsoft Sentinel Model Context Protocol (MCP) server – Public Preview
Security is being reengineered for the AI era—moving beyond static, rule-bound controls and after-the-fact response toward platform-led, machine-speed defense. The challenge is clear: fragmented tools, sprawling signals, and legacy architectures that can't match the velocity and scale of modern attacks. What's needed is an AI-ready, data-first foundation—one that turns telemetry into a security graph, standardizes access for agents, and coordinates autonomous actions while keeping humans in command of strategy and high-impact investigations. Security teams already center operations on their SIEM for end-to-end visibility, and we're advancing that foundation by evolving Microsoft Sentinel into both the SIEM and the platform for agentic defense—connecting analytics and context across ecosystems. And now, we're introducing new platform capabilities that build on Sentinel data lake: Sentinel graph for deeper insight and context; an MCP server and tools to make data agent-ready; new developer capabilities; and Security Store for effortless discovery and deployment—so protection accelerates to machine speed while analysts do their best work.

Introducing Sentinel MCP server

We're excited to announce the public preview of the Microsoft Sentinel MCP (Model Context Protocol) server, a fully managed cloud service built on an open standard that lets AI agents seamlessly access the rich security context in your Sentinel data lake. Recent advances in large language models have enabled AI agents to perform reasoning—breaking down complex tasks, inferring patterns, and planning multistep actions—making them capable of autonomously performing business processes. To unlock this potential in cybersecurity, agents must operate with your organization's real security context, not just public training data. The Sentinel MCP server solves that by providing standardized, secure access to that context—across graph relationships, tabular telemetry, and vector embeddings—via reusable, natural-language tools, enabling security teams to unlock the full potential of AI-driven automation and focus on what matters most.

Why Model Context Protocol (MCP)?

Model Context Protocol (MCP) is a rapidly growing open standard that allows AI models to securely communicate with external applications, services, and data sources through a well-structured interface. Think of MCP as a bridge that lets an AI agent understand and invoke an application's capabilities. These capabilities are exposed as discrete "tools" with natural-language inputs and outputs. The AI agent can autonomously choose the right tool (or combination of tools) for the task it needs to accomplish.

In simpler terms, MCP standardizes how an AI talks to systems. Instead of developers writing custom connectors for each application, the MCP server presents a menu of available actions to the AI in a language it understands. This means an AI agent can discover what it can do (search data, run queries, trigger actions, etc.) and then execute those actions safely and intelligently. By adopting an open standard like MCP, Microsoft is ensuring that our AI integrations are interoperable and future-proof. Any AI application that speaks MCP can connect: Security Copilot offers built-in integration, while other MCP-compatible platforms and services can quickly connect to your Sentinel data by simply adding a new MCP server entry that points at Sentinel's MCP server URL.
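Because MCP is an open standard, any MCP-capable client can discover the server's tools programmatically. The sketch below uses the open-source MCP Python SDK (the `mcp` package) to connect and list tools; it assumes the server's streamable HTTP endpoint accepts a bearer token you have already obtained from Entra ID, which is a simplification of the real authentication flow.

```python
# Minimal sketch: discover the Sentinel MCP server's tools with the MCP
# Python SDK. Obtaining a valid Entra ID access token is environment-
# specific; the placeholder below is an assumption, not a working flow.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

SERVER_URL = "https://sentinel.microsoft.com/mcp/data-exploration"

async def main() -> None:
    headers = {"Authorization": "Bearer <your-entra-access-token>"}
    async with streamablehttp_client(SERVER_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(f"{tool.name}: {tool.description}")

asyncio.run(main())
```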
How to Get Started

The Sentinel MCP server is a fully managed service now available to all Sentinel data lake customers. If you are already onboarded to Sentinel data lake, you are ready to begin using MCP. Not using Sentinel data lake yet? Learn more here.

Currently, you can connect to the Sentinel MCP server using Visual Studio Code (VS Code) with the GitHub Copilot add-on. Here's a step-by-step guide:

1. Open VS Code and authenticate with an account that has at least Security Reader role access (required to query the data lake via the Sentinel MCP server).
2. Open the Command Palette in VS Code (Ctrl + Shift + P).
3. Type or select "MCP: Add Server…".
4. Choose "HTTP" (HTTP or Server-Sent Events).
5. Enter the Sentinel MCP server URL: https://sentinel.microsoft.com/mcp/data-exploration
6. When prompted, allow authentication with the Sentinel MCP server by clicking "Allow".

Once connected, GitHub Copilot will be linked to the Sentinel MCP server. Open the agent pane, set it to Agent mode (Ctrl + Shift + I), and you are ready to go. GitHub Copilot will autonomously identify and utilize Sentinel MCP tools as necessary. You can now experience how AI agents powered by the Sentinel MCP server access and utilize your security context using natural language, without needing to know KQL, which tables to query, or how to wrangle complex schemas. Try prompts like:

"Find the top 3 users that are at risk and explain why they are at risk."
"Find sign-in failures in the last 24 hours and give me a brief summary of key findings."
"Identify devices that showed an unusually high volume of outgoing network connections."

To learn more about the existing capabilities of Sentinel MCP tools, refer to our documentation. Security Copilot will also feature native integration with the Sentinel MCP server; this seamless connectivity will enhance autonomous security agents and the open prompting experience. Check out the Security Copilot blog for additional details.

What's coming next?

The public preview of the Sentinel MCP server marks the beginning of a new era in cybersecurity—one where AI agents operate with full context, precision, and autonomy. In the coming months, the MCP toolset will expand to support natural-language access across tabular, graph, and embedding-based data, enabling agents to reason over both structured and unstructured signals. This evolution will dramatically boost agentic performance, allowing agents to handle more complex tasks, surface deeper insights, and accelerate protection to machine speed. As we continue to build out this ecosystem, your feedback will be essential in shaping its future. Be sure to check out Microsoft Secure for a deeper dive and live demo of the Sentinel MCP server in action.

Secure Model Context Protocol (MCP) Implementation with Azure and Local Servers
Introduction

The Model Context Protocol (MCP) enables AI systems to interact with external data sources and tools through a standardized interface. While powerful, MCP can introduce security risks in enterprise environments. This tutorial shows you how to implement MCP securely using local servers, Azure OpenAI with APIM, and proper authentication.

Understanding MCP's Security Risks

There are a few key security concerns to consider before implementing MCP:

Data exfiltration: External MCP servers could expose sensitive data.
Unauthorized access: Third-party services become potential security risks.
Loss of control: It is unknown how external services handle your data.
Compliance issues: External dependencies make it difficult to meet regulatory requirements.

The solution? Keep everything local and controlled.

Secure Architecture

Before we dive into implementation, let's take a look at the overall architecture of our secure MCP solution. It consists of three key components working together:

Local MCP server – Your custom tools run entirely within your local environment, reducing external exposure risks.
Azure OpenAI + APIM gateway – All AI requests are routed through Azure API Management with Microsoft Entra ID authentication, providing enterprise-grade security controls and compliance.
Authenticated proxy – A lightweight proxy service handles token management and request forwarding, ensuring seamless integration.

One of the key benefits of this architecture is that no API key is required. Traditional implementations often require storing OpenAI API keys in configuration files, environment variables, or secrets management systems, creating potential security vulnerabilities. This approach uses Azure Managed Identity for backend authentication and Azure CLI credentials for client authentication, meaning no sensitive API keys are ever stored, logged, or exposed in your codebase.

For more security, APIM and Azure OpenAI resources can be configured with IP restrictions or network rules to only accept traffic from certain sources. These configurations are available for most Azure resources and provide an additional layer of network-level security. This security-forward approach gives you the full power of MCP's tool integration capabilities while keeping your implementation completely under your control.

How to Implement MCP Securely

1. Local MCP Server Implementation

Building the MCP Server

Let's start by creating a simple MCP server in .NET Core.

1. Create a web application:

```
dotnet new web -n McpServer
```

2. Add MCP packages:

```
dotnet add package ModelContextProtocol --prerelease
dotnet add package ModelContextProtocol.AspNetCore --prerelease
```

3. Configure Program.cs:

```csharp
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMcpServer()
    .WithHttpTransport()
    .WithToolsFromAssembly();

var app = builder.Build();

app.MapMcp();

app.Run();
```

WithToolsFromAssembly() automatically discovers and registers tools from the current assembly. Look into the C# SDK for other ways to register tools for your use case.

4. Define tools

Now we can define some tools that our MCP server can expose.
Here is a simple example of tools that echo input back to the client:

```csharp
using System.ComponentModel;
using System.Linq;
using ModelContextProtocol.Server;

namespace Tools;

[McpServerToolType]
public static class EchoTool
{
    [McpServerTool]
    [Description("Echoes the input text back to the client in all capital letters.")]
    public static string EchoCaps(string input)
    {
        return input.ToUpperInvariant();
    }

    [McpServerTool]
    [Description("Echoes the input text back to the client in reverse.")]
    public static string ReverseEcho(string input)
    {
        return new string(input.Reverse().ToArray());
    }
}
```

The key components of MCP tools are the McpServerToolType class attribute, indicating that the class contains MCP tools, and the McpServerTool method attribute with a description that explains what each tool does.

Alternative: STDIO Transport

If you want to use STDIO transport instead of SSE (implemented here), check out this guide: Build a Model Context Protocol (MCP) Server in C#

2. Create an MCP Client with Cline

Now that we have our MCP server set up with tools, we need a client that can discover and invoke these tools. For this implementation, we'll use Cline as our MCP client, configured to work through our secure Azure infrastructure.

1. Install the Cline VS Code extension

Install the Cline extension in VS Code.

2. Deploy a secure Azure OpenAI endpoint with APIM

Instead of connecting Cline directly to external AI services (which could expose the secure implementation to external bad actors), we route through Azure API Management (APIM) for enterprise security. With this implementation, all requests go through Microsoft Entra ID, and we use managed identity for all authentication.

Quick setup: Deploy the Azure OpenAI with APIM solution. Ensure your Azure OpenAI resources are configured to allow your APIM's managed identity to make calls. The APIM policy below uses managed identity authentication to connect to Azure OpenAI backends. Refer to the Azure OpenAI documentation on managed identity authentication for detailed setup instructions.

3. Configure the APIM policy

After deploying APIM, configure the following policy to enable Azure AD token validation, managed identity authentication, and load balancing across multiple OpenAI backends:
```xml
<!-- Azure API Management policy for the OpenAI endpoint -->
<!-- Implements Azure AD token validation and managed identity authentication -->
<!-- Supports round-robin load balancing across multiple OpenAI backends -->
<!-- Requests with 'gpt-5' in the URL are routed to a single backend -->
<!-- The client application ID '04b07795-8ddb-461a-bbee-02f9e1bf7b46' is the official Azure CLI app registration -->
<!-- This policy allows requests authenticated by Azure CLI (az login) when the required claims are present -->
<policies>
  <inbound>
    <!-- IP allow list fragment (external fragment for client IP restrictions) -->
    <include-fragment fragment-id="YourCompany-IPAllowList" />
    <!-- Azure AD token validation for the Azure CLI app ID -->
    <validate-azure-ad-token tenant-id="YOUR-TENANT-ID-HERE" header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized. Access token is missing or invalid.">
      <client-application-ids>
        <application-id>04b07795-8ddb-461a-bbee-02f9e1bf7b46</application-id>
      </client-application-ids>
      <audiences>
        <audience>api://YOUR-API-AUDIENCE-ID-HERE</audience>
      </audiences>
      <required-claims>
        <claim name="roles" match="any">
          <value>YourApp.User</value>
        </claim>
      </required-claims>
    </validate-azure-ad-token>
    <!-- Acquire a managed identity access token for backend authentication -->
    <authentication-managed-identity resource="https://cognitiveservices.azure.com" output-token-variable-name="managed-id-access-token" ignore-error="false" />
    <!-- Set the Authorization header for the backend using the managed identity token -->
    <set-header name="Authorization" exists-action="override">
      <value>@("Bearer " + (string)context.Variables["managed-id-access-token"])</value>
    </set-header>
    <!-- Check whether the URL contains 'gpt-5' and set the backend accordingly -->
    <choose>
      <when condition="@(context.Request.Url.Path.ToLower().Contains(&quot;gpt-5&quot;))">
        <set-variable name="selected-backend-url" value="https://your-region1-oai.openai.azure.com/openai" />
      </when>
      <otherwise>
        <cache-lookup-value key="backend-counter" variable-name="backend-counter" />
        <choose>
          <when condition="@(context.Variables.ContainsKey(&quot;backend-counter&quot;) == false)">
            <set-variable name="backend-counter" value="@(0)" />
          </when>
        </choose>
        <set-variable name="current-backend-index" value="@((int)context.Variables[&quot;backend-counter&quot;] % 7)" />
        <choose>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 0)">
            <set-variable name="selected-backend-url" value="https://your-region1-oai.openai.azure.com/openai" />
          </when>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 1)">
            <set-variable name="selected-backend-url" value="https://your-region2-oai.openai.azure.com/openai" />
          </when>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 2)">
            <set-variable name="selected-backend-url" value="https://your-region3-oai.openai.azure.com/openai" />
          </when>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 3)">
            <set-variable name="selected-backend-url" value="https://your-region4-oai.openai.azure.com/openai" />
          </when>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 4)">
            <set-variable name="selected-backend-url" value="https://your-region5-oai.openai.azure.com/openai" />
          </when>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 5)">
            <set-variable name="selected-backend-url" value="https://your-region6-oai.openai.azure.com/openai" />
          </when>
          <when condition="@((int)context.Variables[&quot;current-backend-index&quot;] == 6)">
            <set-variable name="selected-backend-url" value="https://your-region7-oai.openai.azure.com/openai" />
          </when>
        </choose>
        <set-variable name="next-counter" value="@(((int)context.Variables[&quot;backend-counter&quot;] + 1) % 1000)" />
        <cache-store-value key="backend-counter" value="@((int)context.Variables[&quot;next-counter&quot;])" duration="300" />
      </otherwise>
    </choose>
    <!-- Always set the backend service using the selected-backend-url variable -->
    <set-backend-service base-url="@((string)context.Variables[&quot;selected-backend-url&quot;])" />
    <!-- Inherit any base policies defined outside this section -->
    <base />
  </inbound>
  <backend>
    <base />
  </backend>
  <outbound>
    <base />
  </outbound>
  <on-error>
    <base />
  </on-error>
</policies>
```

This policy creates a secure gateway that validates Azure AD tokens from your local Azure CLI session, then uses APIM's managed identity to authenticate with Azure OpenAI backends, eliminating the need for API keys. It automatically load-balances requests across multiple Azure OpenAI regions using round-robin distribution for optimal performance.
4. Create an Azure APIM proxy for Cline

This FastAPI-based proxy forwards OpenAI-compatible API requests from Cline through APIM using Azure AD authentication via Azure CLI credentials, eliminating the need to store or manage OpenAI API keys.

Prerequisites:

Python 3.8 or higher
Azure CLI (ensure az login has been run at least once)
Appropriate Azure AD roles and permissions for the user running the proxy script. The script uses Azure CLI credentials to obtain bearer tokens, so your user account must have the correct roles assigned and access to the target API audience configured in the APIM policy above.

Quick setup for the proxy:

Create this requirements.txt:

```
fastapi
uvicorn
requests
azure-identity
```

Create this Python script for the proxy, azure_proxy.py:

```python
import os
import time

import requests
import uvicorn
from azure.identity import AzureCliCredential
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse
from requests.adapters import HTTPAdapter

# CONFIGURATION
APIM_BASE_URL = "<APIM BASE URL HERE>"  # your APIM gateway base URL
AZURE_SCOPE = "<AZURE SCOPE HERE>"      # the scope of the API audience in APIM
PORT = int(os.environ.get("PORT", 8080))

app = FastAPI()
credential = AzureCliCredential()

# Use a single requests.Session for connection pooling
session = requests.Session()
session.mount("https://", HTTPAdapter(pool_connections=100, pool_maxsize=100))

_cached_token = None
_cached_expiry = 0


def get_bearer_token(scope: str) -> str:
    """Get an access token using AzureCliCredential, caching until expiry is within 30 seconds."""
    global _cached_token, _cached_expiry
    now = int(time.time())
    if _cached_token and (_cached_expiry - now > 30):
        return _cached_token
    try:
        token_obj = credential.get_token(scope)
        _cached_token = token_obj.token
        _cached_expiry = token_obj.expires_on
        return _cached_token
    except Exception as e:
        raise RuntimeError(f"Could not get Azure access token: {e}")


@app.api_route("/{path:path}", methods=["GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"])
async def proxy(request: Request, path: str):
    # Assemble the destination URL (preserve trailing-slash logic)
    dest_url = f"{APIM_BASE_URL.rstrip('/')}/{path}".rstrip("/")
    if request.url.query:
        dest_url += "?" + request.url.query

    # Get the bearer token
    bearer_token = get_bearer_token(AZURE_SCOPE)

    # Prepare headers (copy all, overwrite Authorization)
    headers = dict(request.headers)
    headers["Authorization"] = f"Bearer {bearer_token}"
    headers.pop("host", None)

    # Read the body
    body = await request.body()

    # Send the request to APIM using the pooled session
    resp = session.request(
        method=request.method,
        url=dest_url,
        headers=headers,
        data=body if body else None,
        stream=True,
    )

    # Stream the response back to the client
    return StreamingResponse(
        resp.raw,
        status_code=resp.status_code,
        headers={k: v for k, v in resp.headers.items() if k.lower() != "transfer-encoding"},
    )


if __name__ == "__main__":
    # Bind the app to 127.0.0.1 to avoid any firewall prompts
    uvicorn.run(app, host="127.0.0.1", port=PORT)
```

Run the setup:

```
pip install -r requirements.txt
az login  # Authenticate with Azure
python azure_proxy.py
```
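Before pointing Cline at the proxy, you can sanity-check it with a quick request. The deployment name, API version, and path below are placeholders; adjust them to match the Azure OpenAI route your APIM instance exposes.

```python
# Quick smoke test for the local proxy. Placeholders must be replaced with
# values matching your APIM route and Azure OpenAI deployment.
import requests

resp = requests.post(
    "http://localhost:8080/deployments/<your-deployment-name>/chat/completions",
    params={"api-version": "<your-api-version>"},
    json={"messages": [{"role": "user", "content": "ping"}]},
    timeout=60,
)
print(resp.status_code)
print(resp.json())
```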
Configure Cline to use the proxy, using the OpenAI Compatible API provider:

Base URL: http://localhost:8080
API Key: <any random string>
Model ID: <your Azure OpenAI deployment name>
API Version: <your Azure OpenAI deployment version>

The API key field is required by Cline but unused in our implementation; any random string works, since authentication happens via Azure AD.

5. Configure Cline to listen to your MCP server

Now that we have both our MCP server running and Cline configured with secure OpenAI access, the final step is connecting them together. To enable Cline to discover and use your custom tools, navigate to your installed MCP servers in Cline, select Configure MCP Servers, and add the configuration for your server:

```json
{
  "mcpServers": {
    "mcp-tools": {
      "autoApprove": ["EchoCaps", "ReverseEcho"],
      "disabled": false,
      "timeout": 60,
      "type": "sse",
      "url": "http://<your localhost url>/sse"
    }
  }
}
```

Now you can use Cline's chat interface to interact with your secure MCP tools. Try asking Cline to use your custom tools, for example, "Can you echo 'Hello World' in capital letters?", and watch as it calls your local MCP server through the infrastructure you've built.

Conclusion

There you have it: a secure implementation of MCP that can be tailored to your specific use case. This approach gives you the power of MCP while maintaining enterprise security. You get:

AI capabilities through secure Azure infrastructure.
Custom tools that never leave your environment.
A standard MCP interface for easy integration.
Complete control over your data and tools.

The key is keeping MCP servers local while routing AI requests through your secure Azure infrastructure. This way, you gain MCP's benefits without compromising security.

Disclaimer

While this tutorial provides a secure foundation for MCP implementation, organizations are responsible for configuring their Azure resources according to their specific security requirements and compliance standards. Ensure proper review of network rules, access policies, and authentication configurations before deploying to production environments.
Resources

MCP SDKs and tools:
MCP C# SDK
MCP Python SDK
Cline SDK
Cline User Guide
Azure OpenAI with APIM

Azure API Management network security:
Azure API Management - restrict caller IPs
Azure API Management with an Azure virtual network
Set up inbound private endpoint for Azure API Management

Azure OpenAI and AI services network security:
Configure Virtual Networks for Azure AI services
Securing Azure OpenAI inside a virtual network with private endpoints
Add an Azure OpenAI network security perimeter
az cognitiveservices account network-rule

Cybersecurity: What Every Business Leader Needs to Know Now
As a Senior Cybersecurity Solution Architect, I've had the privilege of supporting organisations across the United Kingdom, Europe, and the United States—spanning sectors from finance to healthcare—in strengthening their security posture. One thing has become abundantly clear: cybersecurity is no longer the sole domain of IT departments. It is a strategic imperative that demands attention at board level. This guide distils five key lessons drawn from real-world engagements to help executive leaders navigate today's evolving threat landscape. These insights are not merely technical—they are cultural, operational, and strategic. If you're a C-level executive, this article is a call to action: reassess how your organisation approaches cybersecurity before the next breach forces the conversation. In this article, I share five lessons (and quotes) from the field that help demystify how to enhance an organisation's security posture.

1. Shift the Mindset

"This has always been our approach, and we've never experienced a breach—so why should we change it?"

A significant barrier to effective cybersecurity lies not in the sophistication of attackers, but in the predictability of human behaviour. If you've never experienced a breach, it's tempting to maintain the status quo. However, as threats evolve, so too must your defences. Many cyber threats exploit well-known vulnerabilities that remain unpatched, or rely on individuals performing routine tasks in familiar ways. Human nature tends to favour comfort and habit—traits that adversaries are adept at exploiting. Unlike many organisations, attackers readily adopt new technologies to advance their objectives, including AI-powered ransomware to execute increasingly sophisticated attacks. It is therefore imperative to recognise—without delay—that the advent of AI has dramatically reduced both the effort and time required to compromise systems.

As the UK's National Cyber Security Centre (NCSC) has stated: "AI lowers the barrier for novice cyber criminals, hackers-for-hire and hacktivists to carry out effective access and information gathering operations. This enhanced access will likely contribute to the global ransomware threat over the next two years."

Similarly, McKinsey & Company observed: "As AI quickly advances cyber threats, organisations seem to be taking a more cautious approach, balancing the benefits and risks of the new technology while trying to keep pace with attackers' increasing sophistication."

To counter this evolving threat landscape, organisations must proactively leverage AI in their cyber defence strategies. Examples include:

Identity and Access Management (IAM): AI enhances IAM by analysing real-time signals across systems to detect risky sign-ins and enforce adaptive access controls. Example: Microsoft Entra Agents for Conditional Access use AI to automate policy recommendations, streamlining access decisions with minimal manual input.

Figure 1: Microsoft Entra Agents

Threat detection: AI accelerates detection, response, and recovery, helping organisations stay ahead of sophisticated threats. Example: Microsoft Defender for Cloud's AI threat protection identifies prompt injection, data poisoning, and wallet attacks in real time.

Incident response: AI facilitates real-time decision-making, removing emotional bias and accelerating containment and recovery during security incidents. Example: Automatic Attack Disruption in Defender XDR, which can automatically contain a breach in progress.
- AI Security Posture Management: AI workloads require continuous discovery, classification, and protection across multi-cloud environments. Example: Microsoft Defender for Cloud’s AI Security Posture Management secures custom AI apps across Azure, AWS, and GCP by detecting misconfigurations, vulnerabilities, and compliance gaps.
- Data Security Posture Management (DSPM) for AI: AI interactions must be governed to ensure privacy, compliance, and insider risk mitigation. Example: Microsoft Purview DSPM for AI enables prompt auditing, applies Data Loss Prevention (DLP) policies to third-party AI apps like ChatGPT, and supports eDiscovery and lifecycle management.
- AI Threat Protection: Organisations must address emerging AI threat vectors, including prompt injection, data leakage, and model exploitation. Example: Defender for AI (private preview) provides model-level security, including governance, anomaly detection, and lifecycle protection.

Embracing innovation, automation, and intelligent defence is the secret sauce for cyber resilience in 2026.

2. Avoid One-Off Purchases – Invest with a Strategy

“One MDE and one Sentinel to go, please.”

Organisations often approach me intending to purchase a specific cybersecurity product—such as Microsoft Defender for Endpoint (MDE)—without a clearly articulated strategic rationale. My immediate question is: what is the broader objective behind this purchase? Is it driven by perceived value or popularity, or does it form part of a well-considered strategy to enhance endpoint security?

Cybersecurity investments should be guided by a long-term, holistic strategy that spans multiple years and is periodically reassessed to reflect evolving threats. Strengthening endpoint protection must be integrated into a wider effort to improve the organisation’s overall security posture. This includes ensuring seamless integration between security solutions and avoiding operational silos. For example, deploying robust endpoint protection is of limited value if identities are not safeguarded with multi-factor authentication (MFA), or if storage accounts remain publicly accessible. A cohesive and forward-looking approach ensures that all components of the security architecture work in concert to mitigate risk effectively.

Security Adoption Journey (Based on Zero Trust Framework)

- Assess – Evaluate the threat landscape, attack surface, vulnerabilities, compliance obligations, and critical assets.
- Align – Link security objectives to broader business goals to ensure strategic coherence.
- Architect – Design integrated and scalable security solutions, addressing gaps and eliminating operational silos.
- Activate – Implement tools with robust governance and automation to ensure consistent policy enforcement.
- Advance – Continuously monitor, test, and refine the security posture to stay ahead of evolving threats.

Security tools are not fast food—they work best as part of a long-term plan, not a one-off order. A piecemeal approach runs counter to the modern Zero Trust security model, which assumes no single tool will prevent every breach and instead implements layered defences and integration.
3. Legacy Systems Are Holding You Back

“Unfortunately, we are unable to implement phishing-resistant MFA, as our legacy app does not support integration with the required protocols.”

A common challenge faced by many organisations I have worked with is the constraint on innovation within their cybersecurity architecture, primarily due to continued reliance on legacy applications—often driven by budgetary or operational necessity. These outdated systems frequently lack compatibility with modern security technologies and may introduce significant vulnerabilities. A notable example is the deployment of phishing-resistant multi-factor authentication (MFA)—such as FIDO2 security keys or certificate-based authentication—which requires advanced identity protocols and conditional access policies. These capabilities are available exclusively through Microsoft Entra ID.

To address this issue effectively, it is essential to design security frameworks based on the organisation’s future aspirations rather than its current limitations. By adopting a forward-thinking approach, organisations can remain receptive to emerging technologies that align with their strategic cybersecurity objectives. Moreover, this perspective encourages investment in acquiring the necessary talent, thereby reducing reliance on extensive change management and staff retraining. I advise designing for where you want to be in the next 1–3 years—ideally cloud-first and identity-driven—essentially adopting a Zero Trust architecture, rather than being constrained by the limitations of legacy systems.

4. Collaboration Is a Security Imperative

“This item will need to be added to the dev team's backlog. Given their current workload, they will do their best to implement GitHub Security in Q3, subject to capacity.”

Cybersecurity threats may originate from various parts of an organisation, and one of the principal challenges many face is the fragmented nature of their defence strategies. To effectively mitigate such risks, cybersecurity must be embedded across all departments and functions, rather than being confined to a single team or role. In many organisations, the Chief Information Security Officer (CISO) operates in isolation from other C-level executives, which can limit their influence and complicate the implementation of security measures across the enterprise. Furthermore, some teams may lack the requisite expertise to execute essential security practices. For instance, an R&D lead responsible for managing developers may not possess the necessary skills in DevSecOps.

To address these challenges, it is vital to ensure that the CISO is empowered to act without political or organisational barriers and is supported in implementing security measures across all business units. When the CISO has backing from the COO and HR, initiatives such as an MFA rollout happen faster and more thoroughly.
Cross-Functional Security Responsibilities

| Role | Security Responsibilities |
| --- | --- |
| R&D | Adopt DevSecOps practices; identify vulnerabilities early; manage code dependencies; detect exposed secrets; embed security in CI/CD pipelines |
| CIO | Ensure visibility over organisational data; implement Data Loss Prevention (DLP); safeguard the sensitive data lifecycle; ensure regulatory compliance |
| CTO | Secure cloud environments (CSPM); manage SaaS security posture (SSPM); ensure hardware and endpoint protection |
| COO | Protect digital assets; secure domain management; mitigate impersonation threats; safeguard digital marketing channels and customer PII |
| Support & Vendors | Deliver targeted training; prevent social engineering attacks; improve awareness of threat vectors |
| HR | Train employees on AI-related threats; manage insider risks; secure employee data; oversee cybersecurity across the employee lifecycle |

Empowering the CISO to act across departments helps organisations shift towards a security-first culture—embedding cybersecurity into every function, not just IT.

5. Compliance Is Not Security

“We’re compliant, so we must be secure.”

Many organisations mistakenly equate passing audits—such as ISO 27001 or SOC 2—with being secure. While compliance frameworks help establish a baseline for security, they are not a guarantee of protection. Determined attackers are not deterred by audit checklists; they exploit gaps, misconfigurations, and human error regardless of whether an organisation is certified.

Moreover, due to the rapidly evolving nature of the cyber threat landscape, compliance frameworks often struggle to keep pace. By the time a standard is updated, attackers may already be exploiting new techniques that fall outside its scope. This lag creates a false sense of security for organisations that rely solely on regulatory checkboxes.

Security is a continuous risk management process—not a one-time certification. It must be embedded into every layer of the enterprise and treated with the same urgency as other core business priorities. Compliance is the starting line, not the finish line. Effective security goes beyond meeting regulatory requirements—it demands ongoing vigilance, adaptability, and a proactive mindset.

Conclusion: Cybersecurity Is a Continuous Discipline

Cybersecurity is not a destination—it is a continuous journey. By embracing strategic thinking, cross-functional collaboration, and emerging technologies, organisations can build resilience against today’s threats and tomorrow’s unknowns. The lessons shared throughout this article are not merely technical—they are cultural, operational, and strategic. If there is one key takeaway, it is this: avoid piecemeal fixes and instead adopt an integrated, future-ready security strategy.

Because the cyber threat landscape evolves so rapidly, compliance frameworks alone cannot keep pace. Security must be treated as a dynamic, ongoing process—one that is embedded into every layer of the enterprise and reviewed regularly. Organisations should conduct periodic security posture reviews, leveraging tools such as Microsoft Secure Score or monthly risk reports, and stay informed about emerging threats through threat intelligence feeds and resources like the Microsoft Digital Defence Report, CISA (Cybersecurity and Infrastructure Security Agency), NCSC (UK National Cyber Security Centre), and other open-source intelligence platforms.
As Ann Johnson aptly stated in her blog:

“The most prepared organisations are those that keep asking the right questions and refining their approach together.”

Cyber resilience demands ongoing investment—in people (through training and simulation drills), in processes (via playbooks and frameworks), and in technology (through updates and adoption of AI-driven defences). To reduce cybersecurity risk over time, resilient organisations must continually refine their approach and treat cybersecurity as an ongoing discipline. The time to act is now.

Resources:

- https://www.ncsc.gov.uk/report/impact-of-ai-on-cyber-threat
- Defend against cyber threats with AI solutions from Microsoft - Microsoft Industry Blogs
- Generative AI Cybersecurity Solutions | Microsoft Security
- Require phishing-resistant multifactor authentication for Microsoft Entra administrator roles - Microsoft Entra ID | Microsoft Learn
- AI is the greatest threat—and defense—in cybersecurity today. Here’s why.
- Microsoft Entra Agents - Microsoft Entra | Microsoft Learn
- Smarter identity security starts with AI
- https://www.microsoft.com/en-us/security/blog/2025/06/12/cyber-resilience-begins-before-the-crisis/
- https://www.microsoft.com/en-us/security/security-insider/threat-landscape/microsoft-digital-defense-report-2023-critical-cybersecurity-challenges

Graph RAG for Security: Insights from a Microsoft Intern
As a software engineering intern at Microsoft Security, I had the exciting opportunity to explore how Graph Retrieval-Augmented Generation (Graph RAG) can enhance data security investigations. This blog post shares my learning journey and insights from working with this evolving technology.

Empowering Secure AI Innovation: Data Security and Compliance for AI Agents
As organizations embrace the transformative power of generative AI, agentic AI is quickly becoming a core part of enterprise innovation. Whether organizations are just beginning their AI journey or scaling advanced solutions, one thing is clear: agents are poised to transform every function and workflow across organizations. IDC predicts that over 1 billion new business process agents will be created in the next four years.[1] This surge in AI adoption is empowering employees across roles – from low-code makers to pro-code developers – to build and use AI in new ways. Business leaders are eager to support this momentum, but they also recognize the need to innovate responsibly with AI.

Microsoft Purview’s evolution

When Microsoft 365 Copilot launched in November 2022, it sparked a wave of excitement and an immediate question: how do we secure and govern the data powering these AI experiences? Microsoft Purview quickly evolved to meet this need, extending its data security and compliance capabilities to the Microsoft 365 Copilot ecosystem. It delivered discoverability, protection, and governance value that helped customers discover data risks such as data oversharing, protect sensitive data to prevent data loss and insider risks, and govern AI usage to meet regulations and policies.

Now, as customers move beyond pre-built agents like Copilot to develop their own AI agents and applications, Microsoft Purview has evolved to extend the same data protections built for Microsoft 365 Copilot to AI agents. Today, those protections span the entire development spectrum—from no-code and low-code tools like Copilot Studio to pro-code environments such as Azure AI Foundry.

Microsoft Purview helps address challenges across the development spectrum

Makers – typically business users or citizen developers who build solutions using low-code or no-code tools – shouldn’t need to become security experts to build AI responsibly. Yet, without proper safeguards, these agents can inadvertently expose sensitive data or violate compliance policies. That is why, with Microsoft Purview, security and IT teams can feel confident about the agents being built in their organizations. When makers build agents through the Agent Builder or directly in Copilot Studio, security admins can set up Microsoft Purview’s data security and compliance controls that work behind the scenes to support makers in building secure and compliant agents. These controls automatically enforce policies, monitor data access, and ensure compliance without requiring makers to take additional actions.

Pro-code developers are under increasing pressure to deliver fast, flexible, and seamlessly integrated solutions, yet data security often becomes a deployment blocker or an afterthought. In fact, a recent Microsoft study found that 71% of developer decision-makers acknowledge that these constraints result in security trade-offs and development delays.[2] Building enterprise-grade data security and compliance capabilities from scratch is not only time-consuming but also requires deep domain expertise. This is where Microsoft Purview steps in. As an industry leader in data security and compliance, Purview does the heavy lifting, so developers don’t have to. Now in preview, the Purview SDK can be used by developers to embed robust, enterprise-ready data protections directly into their AI applications, instead of building complex security frameworks on their own.
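To give a flavor of what such an integration can look like before the SDK is described in more detail below, here is a minimal, illustrative PowerShell sketch of an AI app reporting an interaction to Purview for evaluation. The endpoint path, route, and payload shape shown here are assumptions for demonstration only, not the documented API contract; consult the Purview SDK documentation for the real surface.

```powershell
# Illustrative only: send an AI prompt/response pair to a hypothetical Purview
# "process content" endpoint so data security policies and auditing can apply.
# The /processContent route and payload fields below are assumed placeholders.

# Acquire a token for the Purview data plane (requires the Az.Accounts module).
$token = (Get-AzAccessToken -ResourceUrl "https://purview.azure.net").Token

$body = @{
    userId      = "maker@contoso.com"        # end user behind the AI interaction
    application = "contoso-support-agent"    # your custom AI app or agent
    interaction = @{
        prompt   = "Summarise the Q3 incident report"
        response = "The report covers three incidents..."
    }
} | ConvertTo-Json -Depth 5

# Hypothetical route; replace with the documented Purview SDK endpoint.
Invoke-RestMethod -Method Post `
    -Uri "https://<your-purview-endpoint>/processContent" `
    -Headers @{ Authorization = "Bearer $token" } `
    -ContentType "application/json" `
    -Body $body
```

The point of the pattern, rather than the specific names, is that the app forwards each interaction to Purview at runtime so that policy enforcement and audit happen centrally instead of being re-implemented per application.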
The Purview SDK is a comprehensive set of REST APIs, documentation, and code samples, allowing developers to easily incorporate Microsoft Purview’s capabilities into their workflows—regardless of their integrated development environment (IDE). This empowers them to move fast without compromising on security or compliance, while Microsoft Purview helps security teams remain in control.

Figure: By embedding Purview APIs into the IDE, developers help enable their AI apps to be secured and governed at runtime.

Startups, ISVs, and partners can leverage the Purview SDK to seamlessly integrate Purview’s industry-leading features into their AI agents and applications. This enables their offerings to become Purview-aware, empowering customers to more easily secure and govern data within their AI environments.

For example, Christian Veillette, Chief Technology Officer at Arthur Health, a Quisitive customer, states: “The synergistic integration of MazikCare, the Quisitive Intelligence Platform, and the data compliance power of Purview SDK, including its DSPM for AI, forms a foundational pillar for trustworthy and safe AI-driven healthcare transformations. This powerful combination ensures continuous oversight and instant enforcement of compliance policies, giving IT leadership full assurance in the output of every AI model and upholding the highest safety standards. By centralizing policy enforcement, security concerns are significantly eased, empowering leadership to confidently steer their organizations through the AI transformation journey.”

Microsoft partner Infotechtion has also leveraged the new Purview SDK to embed Purview value into their GenAI initiatives. Vivek Bhatt, Infotechtion’s Chief Technology Officer, says: “Embedding Purview SDK into Infotechtion's AI governance solution improved trust and security by aligning Gen-AI interactions with Microsoft Purview's enterprise policies.”

Microsoft Purview also natively integrates with Azure AI Foundry, enabling seamless, built-in security and compliance for AI workloads without requiring additional development effort. With this integration, signals from Azure AI Foundry are automatically surfaced in Microsoft Purview’s Data Security Posture Management (DSPM) for AI, Insider Risk Management, and compliance solutions. This means security teams can monitor AI usage, detect data risks, and enforce compliance policies across AI agents and applications—whether they’re built in-house or with Azure AI Foundry models. This reinforces Microsoft’s commitment to delivering secure-by-default AI innovation—empowering organizations to scale responsibly with confidence.

Figure: Data security admins can now find data security and compliance insights across Microsoft Copilots, agents built with Agent Builder and Copilot Studio, and custom AI apps and agents in Microsoft Purview DSPM for AI.

Explore more partner case studies from Ernst & Young and Infosys to see how they’re leveraging the Purview SDK. Learn more about the Purview SDK and Microsoft Purview for Azure AI Foundry.

Unified visibility and control

Whether supporting pro-code developers or low-code makers, Microsoft Purview enables organizations to secure and govern AI across the enterprise. With Purview, security teams can discover data security risks, protect sensitive data against data leakage and insider risks, and govern AI interactions.
Discover data security risks

With Data Security Posture Management (DSPM) for AI, data security teams can discover detailed data risk insights in AI interactions across Microsoft Copilots, agents built in Agent Builder and Copilot Studio, and custom AI apps and agents.

Figure: Data security admins can now find data security and compliance insights across Microsoft Copilots, agents built with Agent Builder and Copilot Studio, and custom AI apps and agents, all in Microsoft Purview DSPM for AI.

Protect sensitive data against data leaks and insider risks

In DSPM for AI, data security admins can also get recommended insights to improve their organization’s security posture, like minimizing risks of data oversharing. For example, an admin might get a recommendation to set up a data loss prevention (DLP) policy that prevents agents in Microsoft 365 Copilot from using certain labeled documents as grounding data to generate summaries or responses. By setting up this policy, organizations can prevent confidential legal documents—with specific language that could lead to improper guidance—from being summarized. It also ensures that “Internal only” documents aren’t used to create content that might be shared outside the organization.

Figure: Extend data loss prevention (DLP) policies to agents in Microsoft 365 to protect sensitive data.

Agents often pull data from sources like SharePoint and Dataverse, and Microsoft Purview helps protect that data every step of the way. It honors sensitivity labels, enforces access permissions, and applies label inheritance so that AI-generated content carries the same protections as its source. With auto-labeling in Dataverse, sensitive data is classified as soon as it’s ingested—reducing manual effort and maintaining consistent protection. When responses draw from multiple sources with different labels, the most restrictive label is applied to uphold compliance and minimize risk.

Figure: Sensitivity labels will be automatically applied to data in Dataverse.

Figure: AI-generated responses will inherit and honor the source data’s sensitivity labels.

In addition to data and permission controls that help address data oversharing or leakage, security teams also need ways to detect users' risky activities in AI apps and agents that could potentially lead to data security incidents. With risky AI usage indicators, a policy template, and an analytics report in Microsoft Purview Insider Risk Management, security teams with appropriate permissions can detect risky activities. For example, a departing employee might receive an unusual number of AI responses across Copilots and agents containing sensitive data, deviating from their past activity patterns. Security teams can then effectively detect and respond to these potential incidents to minimize the negative impact. For example, they can configure Adaptive Protection to automatically block a high-risk user from accessing sensitive data.

Figure: An Insider Risk Management alert from a Risky AI usage policy shows a user with anomalous activities.

Govern AI interactions to detect non-compliant usage

Microsoft Purview provides a comprehensive set of tools to govern AI usage and detect non-compliant user activities. AI interactions across Microsoft Copilots, AI apps, and agents are recorded in Audit logs. eDiscovery enables legal and compliance teams with appropriate permissions to collect and review AI-generated content for internal investigations or litigation.
Data Lifecycle Management enables teams to set policies to retain or dispose of AI interactions, while Communication Compliance helps detect risky or inappropriate use of AI, such as harmful content or other violations against code-of-conduct policies. Together, these capabilities give organizations the visibility and control they need to innovate responsibly with AI.

Figure: AI interactions across Microsoft Copilots, AI apps, and agents are recorded in Audit logs.

Figure: AI interactions across Microsoft Copilots, AI apps, and agents can be collected and reviewed in eDiscovery.

Figure: Microsoft Purview Communication Compliance can detect non-compliant content in AI prompts across Microsoft Copilots, AI apps, and agents.

Securing the Future of AI Innovation — Explore Additional Resources

As organizations accelerate their adoption of agentic AI, the need for built-in security and compliance has never been more critical. Microsoft Purview empowers both makers and developers to innovate with confidence—ensuring that every AI interaction is secure, compliant, and aligned with enterprise standards. By embedding protection across the entire development lifecycle, Purview helps organizations unlock the full potential of AI while maintaining the trust, transparency, and control that responsible innovation demands.

To dive deeper into how Microsoft Purview supports secure AI development, explore our additional resources, documentation, and integration guides:

- Learn more about Security for AI solutions on our webpage
- Learn more about Microsoft Purview SDK
- Learn more about Purview pricing
- Get started with Azure AI Foundry
- Get started with Microsoft Purview

[1] IDC, 1 Billion New Logical Applications: More Background, Gary Chen, Jim Mercer, April 2024, https://blogs.idc.com/2025/04/04/the-agentic-evolution-of-enterprise-applications/
[2] Microsoft, AI App Security Quantitative Study, April 2025

Fishing for Syslog with Azure Kubernetes and Logstash
Deploy Secure Syslog Collection on Azure Kubernetes Service with Terraform

Organizations managing distributed infrastructure face a common challenge: collecting syslog data securely and reliably from various sources. Whether you're aggregating logs from network devices, Linux servers, or applications, you need a solution that scales with your environment while maintaining security standards.

This post walks through deploying Logstash on Azure Kubernetes Service (AKS) to collect RFC 5425 syslog messages over TLS. The solution uses Terraform for infrastructure automation and forwards collected logs to Azure Event Hubs for downstream processing. You'll learn how to build a production-ready deployment that integrates with Azure Sentinel, Azure Data Explorer, or other analytics platforms.

Solution Architecture

The deployment consists of several Azure components working together:

- Azure Kubernetes Service (AKS): Hosts the Logstash deployment with automatic scaling capabilities
- Internal Load Balancer: Provides a static IP endpoint for syslog sources within your network
- Azure Key Vault: Stores TLS certificates for secure syslog transmission
- Azure Event Hubs: Receives processed syslog data using the Kafka protocol
- Log Analytics Workspace: Monitors the AKS cluster health and performance

Syslog sources send RFC 5425-compliant messages over TLS to the Load Balancer on port 6514. Logstash processes these messages and forwards them to Event Hubs, where they can be consumed by various Azure services or third-party tools.

Prerequisites

Before starting the deployment, ensure you have these tools installed and configured:

- Terraform: Version 1.5 or later
- Azure CLI: Authenticated to your Azure subscription
- kubectl: For managing Kubernetes resources after deployment

Several Azure resources must be created manually before running Terraform, as the configuration references them. This approach provides flexibility in organizing resources across different teams or environments.

Step 1: Create Resource Groups

Create three resource groups to organize the solution components:

```bash
az group create --name rg-syslog-prod --location eastus
az group create --name rg-network-prod --location eastus
az group create --name rg-data-prod --location eastus
```

Each resource group serves a specific purpose:

- rg-syslog-prod: Contains the AKS cluster, Key Vault, and Log Analytics Workspace
- rg-network-prod: Holds networking resources (Virtual Network and Subnets)
- rg-data-prod: Houses the Event Hub Namespace for data ingestion

Step 2: Configure Networking

Create a Virtual Network with dedicated subnets for AKS and the Load Balancer:

```bash
az network vnet create \
  --resource-group rg-network-prod \
  --name vnet-syslog-prod \
  --address-prefixes 10.0.0.0/16 \
  --location eastus

az network vnet subnet create \
  --resource-group rg-network-prod \
  --vnet-name vnet-syslog-prod \
  --name snet-aks-prod \
  --address-prefixes 10.0.1.0/24

az network vnet subnet create \
  --resource-group rg-network-prod \
  --vnet-name vnet-syslog-prod \
  --name snet-lb-prod \
  --address-prefixes 10.0.2.0/24
```

The network design uses non-overlapping CIDR ranges to prevent routing conflicts. The Load Balancer subnet will later be assigned the static IP address 10.0.2.100.
Step 3: Set Up Event Hub Namespace

Create an Event Hub Namespace with a dedicated Event Hub for syslog data:

```bash
az eventhubs namespace create \
  --resource-group rg-data-prod \
  --name eh-syslog-prod \
  --location eastus \
  --sku Standard

az eventhubs eventhub create \
  --resource-group rg-data-prod \
  --namespace-name eh-syslog-prod \
  --name syslog
```

The Standard SKU provides Kafka protocol support, which Logstash uses for reliable message delivery. The namespace automatically includes a RootManageSharedAccessKey for authentication.

Step 4: Configure Key Vault and TLS Certificate

Create a Key Vault to store the TLS certificate:

```bash
az keyvault create \
  --resource-group rg-syslog-prod \
  --name kv-syslog-prod \
  --location eastus
```

For production environments, import a certificate from your Certificate Authority:

```bash
az keyvault certificate import \
  --vault-name kv-syslog-prod \
  --name cert-syslog-prod \
  --file certificate.pfx \
  --password <pfx-password>
```

For testing purposes, you can generate a self-signed certificate:

```bash
az keyvault certificate create \
  --vault-name kv-syslog-prod \
  --name cert-syslog-prod \
  --policy "$(az keyvault certificate get-default-policy)"
```

Important: The certificate's Common Name (CN) or Subject Alternative Name (SAN) must match the DNS name your syslog sources will use to connect to the Load Balancer.

Step 5: Create Log Analytics Workspace

Set up a Log Analytics Workspace for monitoring the AKS cluster:

```bash
az monitor log-analytics workspace create \
  --resource-group rg-syslog-prod \
  --workspace-name log-syslog-prod \
  --location eastus
```

Understanding the Terraform Configuration

With the prerequisites in place, let's examine the Terraform configuration that automates the remaining deployment. The configuration follows a modular approach, making it easy to customize for different environments.

Referencing Existing Resources

The Terraform configuration begins by importing references to the manually created resources:

```hcl
data "azurerm_client_config" "current" {}

data "azurerm_resource_group" "rg-main" {
  name = "rg-syslog-prod"
}

data "azurerm_resource_group" "rg-network" {
  name = "rg-network-prod"
}

data "azurerm_resource_group" "rg-data" {
  name = "rg-data-prod"
}

data "azurerm_virtual_network" "primary" {
  name                = "vnet-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-network.name
}

data "azurerm_subnet" "kube-cluster" {
  name                 = "snet-aks-prod"
  resource_group_name  = data.azurerm_resource_group.rg-network.name
  virtual_network_name = data.azurerm_virtual_network.primary.name
}

data "azurerm_subnet" "kube-lb" {
  name                 = "snet-lb-prod"
  resource_group_name  = data.azurerm_resource_group.rg-network.name
  virtual_network_name = data.azurerm_virtual_network.primary.name
}
```

These data sources establish connections to existing infrastructure, ensuring the AKS cluster and Load Balancer deploy into the correct network context.
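Because Terraform only reads these objects, terraform plan will fail if any of them is missing. As an optional pre-flight check, the following minimal PowerShell sketch (assuming the Az module is installed and you are signed in with Connect-AzAccount) confirms the manually created resources exist; the names match the ones used above.

```powershell
# Pre-flight check (illustrative): confirm the manually created prerequisites
# exist before running terraform plan. Requires the Az PowerShell module.
$resourceGroups = "rg-syslog-prod", "rg-network-prod", "rg-data-prod"

foreach ($name in $resourceGroups) {
    $rg = Get-AzResourceGroup -Name $name -ErrorAction SilentlyContinue
    if ($rg) { Write-Host "OK: $name" }
    else     { Write-Warning "Missing: $name - create it before running Terraform" }
}

# List the subnets the data sources expect, with their address prefixes.
Get-AzVirtualNetwork -Name "vnet-syslog-prod" -ResourceGroupName "rg-network-prod" |
    Select-Object -ExpandProperty Subnets |
    Select-Object Name, AddressPrefix
```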
Deploying the AKS Cluster

The AKS cluster configuration balances security, performance, and manageability:

```hcl
resource "azurerm_kubernetes_cluster" "primary" {
  name                = "aks-syslog-prod"
  location            = data.azurerm_resource_group.rg-main.location
  resource_group_name = data.azurerm_resource_group.rg-main.name
  dns_prefix          = "aks-syslog-prod"

  default_node_pool {
    name           = "default"
    node_count     = 2
    vm_size        = "Standard_DS2_v2"
    vnet_subnet_id = data.azurerm_subnet.kube-cluster.id
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin      = "azure"
    load_balancer_sku   = "standard"
    network_plugin_mode = "overlay"
  }

  oms_agent {
    log_analytics_workspace_id = data.azurerm_log_analytics_workspace.logstash.id
  }
}
```

Note that the oms_agent block references an azurerm_log_analytics_workspace data source (named logstash) for the workspace created in Step 5; declare it alongside the other data sources.

Key configuration choices:

- System-assigned managed identity: Eliminates the need for service principal credentials
- Azure CNI in overlay mode: Provides efficient pod networking without consuming subnet IPs
- Standard Load Balancer SKU: Enables zone redundancy and higher performance
- OMS agent integration: Sends cluster metrics to Log Analytics for monitoring

The cluster requires network permissions to create the internal Load Balancer:

```hcl
resource "azurerm_role_assignment" "aks-netcontrib" {
  scope                = data.azurerm_virtual_network.primary.id
  principal_id         = azurerm_kubernetes_cluster.primary.identity[0].principal_id
  role_definition_name = "Network Contributor"
}
```

Configuring Logstash Deployment

The Logstash deployment uses Kubernetes resources for reliability and scalability. First, create a dedicated namespace:

```hcl
resource "kubernetes_namespace" "logstash" {
  metadata {
    name = "logstash"
  }
}
```

The internal Load Balancer service exposes Logstash on a static IP:

```hcl
resource "kubernetes_service" "loadbalancer-logstash" {
  metadata {
    name      = "logstash-lb"
    namespace = kubernetes_namespace.logstash.metadata[0].name
    annotations = {
      "service.beta.kubernetes.io/azure-load-balancer-internal"        = "true"
      "service.beta.kubernetes.io/azure-load-balancer-ipv4"            = "10.0.2.100"
      "service.beta.kubernetes.io/azure-load-balancer-internal-subnet" = data.azurerm_subnet.kube-lb.name
      "service.beta.kubernetes.io/azure-load-balancer-resource-group"  = data.azurerm_resource_group.rg-network.name
    }
  }

  spec {
    type = "LoadBalancer"
    selector = {
      app = kubernetes_deployment.logstash.metadata[0].name
    }
    port {
      name        = "logstash-tls"
      protocol    = "TCP"
      port        = 6514
      target_port = 6514
    }
  }
}
```

The annotations configure Azure-specific Load Balancer behavior, including the static IP assignment and subnet placement.

Securing Logstash with TLS

Kubernetes Secrets store the TLS certificate and Logstash configuration:

```hcl
resource "kubernetes_secret" "logstash-ssl" {
  metadata {
    name      = "logstash-ssl"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  data = {
    "server.crt" = data.azurerm_key_vault_certificate_data.logstash.pem
    "server.key" = data.azurerm_key_vault_certificate_data.logstash.key
  }

  type = "Opaque"
}
```

The certificate data comes directly from Key Vault (via an azurerm_key_vault_certificate_data data source), maintaining a secure chain of custody.
Logstash Container Configuration

The deployment specification defines how Logstash runs in the cluster:

```hcl
resource "kubernetes_deployment" "logstash" {
  metadata {
    name      = "logstash"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  spec {
    selector {
      match_labels = {
        app = "logstash"
      }
    }

    template {
      metadata {
        labels = {
          app = "logstash"
        }
      }

      spec {
        container {
          name  = "logstash"
          image = "docker.elastic.co/logstash/logstash:8.17.4"

          security_context {
            run_as_user                = 1000
            run_as_non_root            = true
            allow_privilege_escalation = false
          }

          resources {
            requests = {
              cpu    = "500m"
              memory = "1Gi"
            }
            limits = {
              cpu    = "1000m"
              memory = "2Gi"
            }
          }

          volume_mount {
            name       = "logstash-config-volume"
            mount_path = "/usr/share/logstash/pipeline/logstash.conf"
            sub_path   = "logstash.conf"
            read_only  = true
          }

          volume_mount {
            name       = "logstash-ssl-volume"
            mount_path = "/etc/logstash/certs"
            read_only  = true
          }
        }
      }
    }
  }
}
```

Security best practices include:

- Running as a non-root user (UID 1000)
- Disabling privilege escalation
- Mounting configuration and certificates as read-only
- Setting resource limits to prevent runaway containers

Automatic Scaling Configuration

The Horizontal Pod Autoscaler ensures Logstash scales with demand:

```hcl
resource "kubernetes_horizontal_pod_autoscaler" "logstash_hpa" {
  metadata {
    name      = "logstash-hpa"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  spec {
    scale_target_ref {
      kind        = "Deployment"
      name        = kubernetes_deployment.logstash.metadata[0].name
      api_version = "apps/v1"
    }

    min_replicas                      = 1
    max_replicas                      = 30
    target_cpu_utilization_percentage = 80
  }
}
```

This configuration maintains between 1 and 30 replicas, scaling up when CPU usage exceeds 80%.

Logstash Pipeline Configuration

The Logstash configuration file defines how to process syslog messages:

```text
input {
  tcp {
    port => 6514
    type => "syslog"
    ssl_enable => true
    ssl_cert => "/etc/logstash/certs/server.crt"
    ssl_key => "/etc/logstash/certs/server.key"
    ssl_verify => false
  }
}

output {
  stdout {
    codec => rubydebug
  }
  kafka {
    bootstrap_servers => "${name}.servicebus.windows.net:9093"
    topic_id => "syslog"
    security_protocol => "SASL_SSL"
    sasl_mechanism => "PLAIN"
    sasl_jaas_config => 'org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://${name}.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=${primary_key};EntityPath=syslog";'
    codec => "json"
  }
}
```

The configuration:

- Listens on port 6514 for TLS-encrypted syslog messages
- Outputs to stdout for debugging (visible in container logs)
- Forwards processed messages to Event Hubs using the Kafka protocol

Deploying the Solution

With all components configured, deploy the solution using Terraform:

1. Initialize Terraform in your project directory: terraform init
2. Review the planned changes: terraform plan
3. Apply the configuration: terraform apply
4. Connect to the AKS cluster: az aks get-credentials --resource-group rg-syslog-prod --name aks-syslog-prod
5. Verify the deployment: kubectl -n logstash get pods, kubectl -n logstash get svc, kubectl -n logstash get hpa

Configuring Syslog Sources

After deployment, configure your syslog sources to send messages to the Load Balancer:

1. Create a DNS record pointing to the Load Balancer IP (10.0.2.100).
   For example: syslog.yourdomain.com
2. Configure syslog clients to send RFC 5425 messages over TLS to port 6514.
3. Install the certificate chain on syslog clients if using a private CA or self-signed certificate.

Example rsyslog configuration for a Linux client:

```text
*.* @@syslog.yourdomain.com:6514;RSYSLOG_SyslogProtocol23Format
```

Monitoring and Troubleshooting

Monitor the deployment using several methods:

View Logstash logs to verify message processing:

```bash
kubectl -n logstash logs -l app=logstash --tail=50
```

Check autoscaling status:

```bash
kubectl -n logstash describe hpa logstash-hpa
```

Monitor in Azure Portal:

- Navigate to the Log Analytics Workspace to view AKS metrics
- Check Event Hub metrics to confirm message delivery
- Review Load Balancer health probes and connection statistics

Security Best Practices

This deployment incorporates several security measures:

- TLS encryption: All syslog traffic is encrypted using certificates from Key Vault
- Network isolation: The internal Load Balancer restricts access to the virtual network
- Managed identities: No credentials are stored in the configuration
- Container security: Logstash runs as a non-root user with minimal privileges

For production deployments, consider these additional measures:

- Enable client certificate validation in Logstash for mutual TLS
- Add Network Security Groups to restrict source IPs
- Implement Azure Policy for compliance validation
- Enable Azure Defender for Kubernetes

Integration with Azure Services

Once syslog data flows into Event Hubs, you can integrate with various Azure services:

- Azure Sentinel: Configure Data Collection Rules to ingest syslog data for security analytics. See the Azure Sentinel documentation for detailed steps.
- Azure Data Explorer: Create a data connection to analyze syslog data with KQL queries.
- Azure Stream Analytics: Process syslog streams in real-time for alerting or transformation.
- Logic Apps: Trigger workflows based on specific syslog patterns or events.

Cost Optimization

To optimize costs while maintaining performance:

- Right-size the AKS node pool based on actual syslog volume
- Use Azure Spot instances for non-critical environments
- Configure Event Hub retention based on compliance requirements
- Enable auto-shutdown for development environments

Conclusion

This Terraform-based solution provides a robust foundation for collecting syslog data in Azure. The combination of AKS, Logstash, and Event Hubs creates a scalable pipeline that integrates seamlessly with Azure's security and analytics services. The modular design allows easy customization for different environments and requirements.

Whether you're collecting logs from a handful of devices or thousands, this architecture scales to meet your needs while maintaining security and reliability. For next steps, consider implementing additional Logstash filters for data enrichment, setting up automated certificate rotation, or expanding the solution to collect other log formats. The flexibility of this approach ensures it can grow with your organization's logging requirements.
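As a closing aside: if you want to verify the listener end to end before configuring a real syslog client, the following PowerShell sketch sends a single RFC 5425-framed test message over TLS. It is illustrative only, and it deliberately skips server certificate validation, which is acceptable for a lab self-signed certificate but never for production; the DNS name assumes the record created above.

```powershell
# Smoke test (illustrative): send one RFC 5425-framed syslog message over TLS.
$endpoint = "syslog.yourdomain.com"
$port     = 6514

$client = [System.Net.Sockets.TcpClient]::new($endpoint, $port)

# Accept any server certificate - lab use only, never in production.
$callback = [System.Net.Security.RemoteCertificateValidationCallback]{
    param($s, $cert, $chain, $errors) $true
}
$ssl = New-Object System.Net.Security.SslStream($client.GetStream(), $false, $callback)
$ssl.AuthenticateAsClient($endpoint)

# RFC 5424 message wrapped in RFC 5425 octet-counting framing: "<len> <msg>".
$msg   = "<14>1 $(Get-Date -Format o) testhost labapp - - - Hello from the lab"
$frame = "$([Text.Encoding]::UTF8.GetByteCount($msg)) $msg"
$bytes = [Text.Encoding]::UTF8.GetBytes($frame)
$ssl.Write($bytes, 0, $bytes.Length)
$ssl.Flush()

$ssl.Dispose(); $client.Dispose()
Write-Host "Frame sent - check 'kubectl -n logstash logs -l app=logstash' for the message."
```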
Step by Step: 2-Tier PKI Lab

Purpose of this blog

Public Key Infrastructure (PKI) is the backbone of secure digital identity management, enabling encryption, digital signatures, and certificate-based authentication. However, neither setting up a PKI nor managing certificates is something most IT pros do on a regular basis, and given the complexity and vastness of the subject it only makes sense to revisit the topic from time to time. What I have found works best for me is to just set up a lab and get my hands dirty with the topic that I want to revisit. One such topic that I keep coming back to is PKI - be it for creating certificate templates, enrolling clients, or flat out creating a new PKI itself. But every time I start deploying a lab or start planning a PKI setup, I end up spending too much time sifting through the documentation and trying to figure out why my issuing certificate authority won't come online! To make my life easier I decided to create a cheatsheet for deploying a simple but secure 2-tier PKI lab based on industry best practices, and I thought it would be beneficial for others like me, so I decided to polish it and make it into a blog.

This blog walks through deploying a two-tier PKI hierarchy using Active Directory Certificate Services (AD CS) on Windows Server: an offline Root Certification Authority (Root CA) and an online Issuing Certification Authority (Issuing CA). We’ll cover step-by-step deployment and best practices for securing the root CA, conducting key ceremonies, and maintaining Certificate Revocation Lists (CRLs).

Overview: Two-Tier PKI Architecture and Components

In a two-tier PKI, the Root CA sits at the top of the trust hierarchy and issues a certificate only to the subordinate Issuing CA. The Root CA is kept offline (disconnected from networks) to protect its private key and is typically a standalone CA (not domain-joined). The Issuing CA (sometimes called a subordinate or intermediate CA) is kept online to issue certificates to end-entities (users, computers, services) and is usually an enterprise CA integrated with Active Directory for automation and certificate template support.

Key components:

- Offline Root CA: A standalone CA, often on a workgroup server, powered on only when necessary (initial setup, subordinate CA certificate signing, or periodic CRL publishing). By staying offline, it is insulated from network threats. Its self-signed certificate serves as the trust anchor for the entire PKI. The Root CA’s private key must be rigorously protected (ideally by a Hardware Security Module) because if the root is compromised, all certificates in the hierarchy are compromised.
- Online Issuing CA: An enterprise subordinate CA (domain-joined) that handles day-to-day certificate issuance for the organization. It trusts the Root CA (via the root’s certificate) and is the one actually responding to certificate requests. Being online, it must also be secured, but its key is kept online for operations. Typically, the Issuing CA publishes certificates and CRLs to Active Directory and/or HTTP locations for clients to download.
The following diagram shows a simplified view of this implementation. The table below summarizes the roles and differences:

| Aspect | Offline Root CA | Online Issuing CA |
| --- | --- | --- |
| Role | Standalone Root CA (workgroup) | Enterprise Subordinate CA (domain member) |
| Network Connectivity | Kept offline (powered off or disconnected when not issuing) | Online (running continuously to serve requests) |
| Usage | Signs only one certificate (the subordinate CA’s cert) and CRLs | Issues end-entity certificates (users, computers, services) |
| Active Directory | Not a member of an AD domain; doesn’t use templates or auto-enrollment | Integrated with AD DS; uses certificate templates for streamlined issuance |
| Security | Extremely high: physically secured, limited access, often protected by HSM | Very high: server hardened, but accessible on network; HSM recommended for private key |
| CRL Publication | Manual: admin must periodically connect, generate, and distribute CRL; delta CRLs usually disabled | Automatic: publishes CRLs to configured CDP locations (AD DS, HTTP) at scheduled intervals |
| Validity Period | Longer (e.g., 5-10+ years for the CA certificate) to reduce frequency of renewal | Shorter (e.g., 2 years) to align with organizational policy; renewed under the root when needed |

In this lab setup, we will create a Contoso Root CA (offline) and a Contoso Issuing CA (online) as an example. This mirrors real-world best practice, which is to "deploy a standalone offline root CA and an online enterprise subordinate CA".

Deploying the Offline Root CA

Setting up the offline Root CA involves preparing a dedicated server, installing AD CS, configuring it as a root CA, and then securing it. We’ll also configure certificate CDP/AIA (CRL Distribution Point and Authority Information Access) locations so that issued certificates will point clients to the correct locations to fetch the CA’s certificate and revocation list.

Step 1: Prepare the Root CA Server (Offline)

- Provision an isolated server: Install a Windows Server OS (e.g., Windows Server 2022) on the machine designated to be the Root CA, preferably a portable, enterprise-grade physical server that can be stored in a safe. Do not join this server to any domain – it should function in a workgroup to remain independent of your AD forest.
- System configuration: Give the server a descriptive name (e.g., ROOTCA) and assign a static IP (even though it will be offline, a static IP helps when connecting it temporarily for management). Install the latest updates and security patches while it’s still able to go online.
- Lock down network access: Once setup is complete, disable or unplug network connections. If the server must remain powered on for any reason, ensure all unnecessary services/ports are disabled to minimize exposure. In practice, you will keep this server shut down or physically disconnected except when performing CA maintenance.

Step 2: Install the AD CS Role on the Root CA

Add the Certification Authority role: On the Root CA server, open Server Manager and add the Active Directory Certificate Services role. During the wizard, select the Certification Authority role service (no need for web enrollment or other role services on the root). Proceed through the wizard and complete the installation. You can also install the CA role and management tools via PowerShell:

```powershell
Install-WindowsFeature AD-Certificate -IncludeManagementTools
```

Post-install configuration: After the binary installation, click Configure Active Directory Certificate Services (a notification in Server Manager). In the configuration wizard:

- Role Services: Choose Certification Authority.
- Setup Type: Select Standalone CA (since this root CA is not domain-joined).
- CA Type: Select Root CA.
- Private Key: Choose “Create a new private key.”
- Cryptography: If using an HSM, select the HSM’s Cryptographic Service Provider (CSP) here; otherwise use the default. Choose a strong key length (e.g., 2048 or 4096 bits) and a secure hash algorithm (SHA-256 or higher).
- CA Name: Provide a common name for the CA (e.g., “Contoso Root CA”). This name will appear in issued certificates as the Issuer. Avoid using a machine DNS name here for security – pick a name without revealing the server’s actual hostname.
- Validity Period: Set a long validity (e.g., 10 years) for the root CA’s self-signed certificate. A decade is common for enterprise roots, reducing how often you must touch the offline CA for renewal.
- Database: Specify locations for the CA database and logs (the defaults are fine for a lab).

Review the settings and complete the configuration. This process will generate the root CA’s key pair and self-signed certificate, establishing the Root CA.

You can also perform this configuration via PowerShell in one line:

```powershell
Install-AdcsCertificationAuthority `
    -CAType StandaloneRootCA `
    -CryptoProviderName "YourHSMProvider" `
    -HashAlgorithmName SHA256 -KeyLength 2048 `
    -CACommonName "Contoso Root CA" `
    -ValidityPeriod Years -ValidityPeriodUnits 10
```

This sets up a standalone Root CA named "Contoso Root CA" with a 2048-bit key on an HSM provider, valid for 10 years.

Step 3: Integrate an HSM (Optional but Recommended)

If your lab has a Hardware Security Module, use it to secure the Root CA’s keys. An HSM provides dedicated, tamper-resistant storage for CA private keys and protects against key compromise. To integrate:

- Install the HSM vendor’s software and drivers on the Root CA server.
- Initialize the HSM and create a security world or partition as per the vendor instructions.
- Before or during the CA configuration (Step 2 above), ensure the HSM is ready to generate/store the key.
- When running the AD CS configuration, select the HSM’s CSP/KSP for the cryptographic provider so that the CA’s private key is generated on the HSM.
- Secure any HSM admin tokens or smartcards. For a root CA, you might employ M of N key splits – requiring multiple key custodians to collaborate to activate the HSM or key – as part of the key ceremony (discussed later).

(If an HSM is not available, the root key will be stored on the server’s disk. At minimum, protect it with a strong admin passphrase when prompted, and consider enabling the option to require administrator interaction (e.g., a password) whenever the key is accessed.)

Step 4: Configure CA Extensions (CDP/AIA)

It’s critical to configure how the Root CA publishes its certificate and revocation list, since the root is offline and cannot use Active Directory auto-publishing. Open the Certification Authority management console (certsrv.msc), right-click the CA name > Properties, and go to the Extensions tab. We will set the CRL Distribution Points (CDP) and Authority Information Access (AIA) URLs:

CRL Distribution Point (CDP): This is where certificates will tell clients to fetch the CRL for the Root CA. By default, a standalone CA might have a file:// path or no HTTP URL.
Click Add and specify an HTTP URL that will be accessible to all network clients, such as:

```text
http://<IssuingCA_Server>/CertEnroll/<CaName><CRLNameSuffix><DeltaCRLAllowed>.crl
```

For example, if your issuing CA’s server name is ISSUINGCA.contoso.local, the URL might be http://issuingca.contoso.local/CertEnroll/Contoso%20Root%20CA.crl. This assumes the Issuing CA (or another web server) will host the Root CA’s CRL in the CertEnroll directory. Check the boxes for “Include in the CDP extension of issued certificates” and “Include in all CRLs. Clients use this to find Delta CRLs” (you can uncheck the delta CRL publication on the root, as we won’t use delta CRLs on an offline root). Since the root CA won’t often revoke its single issued cert (the subordinate CA), delta CRLs aren’t necessary.

Note: If your Active Directory is in use and you want to publish the Root CA’s CRL to AD, you can also add an ldap:/// path and check “Publish in Active Directory”. However, publishing to AD from an offline CA must be done manually using certutil -dspublish when the root is temporarily connected. Many setups skip LDAP for offline roots and rely on HTTP distribution.

Authority Information Access (AIA): This is where the Root CA’s certificate will be published for clients to download (to build certificate chains). Add an HTTP URL similarly, for example:

```text
http://<IssuingCA_Server>/CertEnroll/<ServerDNSName>_<CaName><CertificateName>.crt
```

This points to a copy of the Root CA’s certificate that will be hosted on the issuing CA web server. Check “Include in the AIA extension of issued certificates”. This way, any certificate signed by the Root CA (like your subordinate CA’s cert) contains a URL where clients can fetch the Root CA’s cert if they don’t already have it.

After adding these, remove any default entries that are not applicable (e.g., LDAP if the root isn’t going to publish to AD, or file paths that won’t be used by clients). These settings ensure that certificates issued by the Root CA (in practice, just the subordinate CA’s certificate) will carry the correct URLs for chain building and revocation checking.

Step 5: Back Up the Root CA and Issue the Subordinate Certificate

With the Root CA configured, we need to issue a certificate for the Issuing CA (subordinate). We’ll perform that in the next section from the Issuing CA’s side via a request file. Before taking the root offline, ensure you:

- Back up the CA’s private key and certificate: In the Certification Authority console, or via the CA Backup wizard, export the Root CA’s key pair and CA certificate. Protect this backup (store it offline in a secure location, e.g., on encrypted removable media in a safe). This backup is crucial for disaster recovery or if the Root CA needs to be migrated or restored.
- Save the Root CA certificate: You will need the Root CA’s public certificate (*.crt) to distribute to other systems. Export it (Base-64 or DER format) for use on the Issuing CA and for clients.
- Initial CRL publication: Manually publish the first CRL so that it can be distributed. Open an elevated Command Prompt on the Root CA and run:

```powershell
certutil -crl
```

This generates a new CRL file (in the CA’s configured CRL folder, typically %windir%\system32\CertSrv\CertEnroll). Take that CRL file and copy it to the designated distribution point (for example, to the CertEnroll directory on the Issuing CA’s web server, as per the HTTP URL configured).
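If you prefer to script this hand-off, the following minimal PowerShell sketch runs on the Root CA. The E:\ destination is an assumption standing in for whatever removable media you use to carry files off the offline machine.

```powershell
# Minimal sketch of the manual Root CA CRL publication step (run on the Root CA).
# Paths are the AD CS defaults; E:\ is a placeholder for your removable media.

# 1. Generate a fresh CRL on the offline Root CA.
certutil -crl

# 2. Copy the newest CRL to removable media for transfer.
$crlDir = Join-Path $env:windir "system32\CertSrv\CertEnroll"
$crl = Get-ChildItem -Path $crlDir -Filter *.crl |
    Sort-Object LastWriteTime -Descending | Select-Object -First 1
Copy-Item -Path $crl.FullName -Destination "E:\"

# 3. On the Issuing CA, place the file in the directory served over HTTP
#    (e.g., C:\inetpub\wwwroot\CertEnroll) so the CDP URL resolves for clients.
```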
If you are using Active Directory for CRL distribution, you would also publish the CRL to AD now (e.g., certutil -dspublish -f RootCA.crl on a domain-connected machine). In most lab setups, copying to an HTTP share is sufficient.

With these tasks done, the Root CA is ready. At this point, disconnect or power off the Root CA and store it securely – it should remain offline except when it’s absolutely needed (like publishing a new CRL or renewing the subordinate CA’s certificate in the far future). Keeping the root CA offline maximizes its security by minimizing exposure to compromise.

Best Practices for Securing the Root CA

The Root CA is the trust anchor, so apply stringent security practices:

- Physical security: Store the Root CA machine in a locked, secure location. If it’s a virtual machine, consider storing it on a disconnected hypervisor or a USB drive locked in a safe. Only authorized PKI team members should have access. An offline CA should be treated like the crown jewels and stored in a secure location.
- Minimal exposure: Keep the Root CA powered off and disconnected when not in use. It should not be left running or connected to any network. Routine operations (like issuing end-entity certs) should never involve the root.
- Admin access control: Limit administrative access on the Root CA server. Use dedicated accounts for PKI administration. Enable auditing on the CA for any changes or issuance events.
- No additional roles or software: Do not use the Root CA server for any other function (no web browsing, no email, etc.). Fewer installed components mean fewer potential vulnerabilities.
- Protect the private key: Use an HSM if possible; if not, ensure the key is at least protected by a strong password and consider splitting knowledge of that password among multiple people (so no single person can activate the CA). Many organizations opt for an offline root key ceremony to generate and handle the root key with multiple witnesses and strict procedures.
- Keep system time and settings consistent: If the Root CA is powered off for long periods, ensure its clock is accurate whenever it is started (to avoid issuing a CRL or certificate with a wrong date). Don’t change the server name or CA name after installation (doing so invalidates issued certs).
- Periodic health checks: Even though it is offline, plan to power on the Root CA at a secure interval (e.g., semi-annually or annually) to perform tasks like CRL publishing and system updates. Make sure to apply OS security updates during these maintenance windows; offline does not mean immune to vulnerabilities (especially if the server ever connects to a network for CRL publication or uses removable media).

Deploying the Online Issuing CA

Next, set up the Issuing CA server, which will actually issue certificates to end entities in the lab. This server will be domain-joined (if using AD integration) and will obtain its CA certificate from the Root CA we just configured.

Step 1: Prepare the Issuing CA Server

- Provision the server: Install Windows Server on a new machine (or VM) that will be the Issuing CA. Join this server to the Active Directory domain (e.g., Contoso.local). Being an enterprise CA, it needs domain membership to publish templates and integrate with AD security groups. Rename the server to something descriptive like ISSUINGCA for clarity. Assign a static IP and ensure it can communicate on the network.
- IIS for web enrollment (optional): If you plan to use the Web Enrollment or Certificate Enrollment Web Services, ensure IIS is installed. (The AD CS installation wizard can add it if you include those role services.) For this guide, we will include the Web Enrollment role so that the CertEnroll directory is set up for hosting certificate and CRL files.

Step 2: Install AD CS Role on Issuing CA

On the Issuing CA server, add the Active Directory Certificate Services role via Server Manager or PowerShell. This time, select both Certification Authority and Certification Authority Web Enrollment role services (Web Enrollment will set up the HTTP endpoints for certificate requests if needed). For example, using PowerShell:

```powershell
Install-WindowsFeature AD-Certificate, ADCS-Web-Enrollment -IncludeManagementTools
```

After installation, launch the AD CS configuration wizard:

- Role Services: Choose Certification Authority (and Web Enrollment if prompted).
- Setup Type: Select Enterprise CA (since this CA will integrate with AD DS).
- CA Type: Select Subordinate CA (this indicates it will get its cert from an existing root CA).
- Private Key: Choose “Create a new private key” (we’ll generate a new key pair for this CA).
- Cryptography: If using an HSM here as well, select the HSM’s CSP/KSP for the issuing CA’s key. Otherwise, choose a strong key length (2048+ bits) and hash algorithm (SHA-256 or better).
- CA Name: Provide a name (e.g., “Contoso Issuing CA”). This name will appear as the Issuer on certificates it issues.
- Certificate Request: The wizard will ask how you want to get the subordinate CA’s certificate. Choose “Save a certificate request to file”. Specify a path, e.g., C:\CertRequest\issuingCA.req. The wizard will generate a request file that we need to take to the Root CA for signing. (Since our Root CA is offline, this file transfer might be via secure USB or a network share when the root is temporarily online.)
- CA Database: Choose locations or accept the defaults for the certificate DB and logs.

Finish the configuration wizard. It will complete in a pending state because the CA doesn’t have a certificate yet; the AD CS service on this server won’t start until we import the issued cert from the root.

Step 3: Integrate HSM on Issuing CA (Optional)

If available, repeat the HSM setup on the Issuing CA: install the HSM drivers, initialize it, and generate/secure the key for the subordinate CA on the HSM. Ensure you chose the HSM provider during the above configuration so that the issuing CA’s private key is stored in the HSM. Even though this CA is online, an HSM still greatly enhances security by protecting the private key from extraction. The issuing CA’s HSM may not require multiple custodians to activate (as it needs to run continuously), but it should still be physically secured.

Step 4: Obtain the Issuing CA’s Certificate from the Root CA

Now we have a pending request (issuingCA.req) for the subordinate CA. To get its certificate:

1. Transport the request to the Root CA: Copy the request file to the offline Root CA (via secure means – e.g., a freshly formatted USB stick). Start up the Root CA (in a secure, offline setting) and open the Certification Authority console.
2. Submit the request on the Root CA: Right-click the Root CA in the CA console > All Tasks > Submit new request, and select the .req file. The request will appear in Pending Requests on the root.
3. Issue the subordinate CA certificate: Find the pending request (it will list the Issuing CA’s name). Right-click and choose All Tasks > Issue.
Step 3: Integrate HSM on Issuing CA (Optional)

If available, repeat the HSM setup on the Issuing CA: install the HSM drivers, initialize the device, and generate and secure the subordinate CA's key on the HSM. Make sure you chose the HSM provider during the configuration above so that the issuing CA's private key is stored in the HSM. Even though this CA is online, an HSM still greatly enhances security by protecting the private key from extraction. The issuing CA's HSM may not require multiple custodians to activate (since the CA needs to run continuously), but it should still be physically secured.

Step 4: Obtain the Issuing CA's Certificate from the Root CA

Now we have a pending request (issuingCA.req) for the subordinate CA. To get its certificate:

Transport the request to the Root CA: Copy the request file to the offline Root CA via secure means (e.g., a freshly formatted USB stick). Start up the Root CA in a secure, offline setting and open the Certification Authority console.

Submit the request on the Root CA: Right-click the Root CA in the CA console > All Tasks > Submit new request, and select the .req file. The request will appear under Pending Requests on the root.

Issue the subordinate CA certificate: Find the pending request (it will list the Issuing CA's name), right-click it, and choose All Tasks > Issue. The subordinate CA's certificate is now issued by the Root CA.

Export the issued certificate: Still on the Root CA, go to Issued Certificates and find the newly issued subordinate CA certificate (identifiable by its Request ID or name). Open it and use Copy to File to export it, or retrieve it from the command line with certreq -retrieve <RequestID> .\ContosoIssuingCA.cer. Save the certificate as issuingCA.cer. Also make sure you have a copy of the Root CA's own certificate (if not already exported).

Publish the Root CA certificate and CRL as needed: Before leaving the Root CA, ensure the root's own certificate and latest CRL are available to the issuing CA and clients. If not already done in Step 5 of the root deployment, export the Root CA certificate (DER format) and copy the CRL file; you might run certutil -crl again if some time has passed since the initial CRL. Now take the issuingCA.cer file (and the root certificate/CRL files) back to the Issuing CA server.

Step 5: Install the Issuing CA's Certificate and Complete Configuration

On the Issuing CA server (which is still waiting for its CA certificate):

Install the subordinate CA certificate: In the Certification Authority console on the Issuing CA, right-click the CA name > All Tasks > Install CA Certificate (or, if the AD CS configuration wizard is still open, it will prompt for the file). Provide the issuingCA.cer file obtained from the root. This installs the CA's own certificate and starts the CA service; the Issuing CA is now operational as a subordinate CA. Alternatively, from a command line: certutil -installcert C:\CertRequest\issuingCA.cer. This installs the certificate and associates it with the pending key.

Trust the Root CA certificate: Because the Issuing CA is domain-joined, installing the subordinate certificate may automatically place the Root CA's certificate in the Trusted Root Certification Authorities store on that server (and possibly publish it to AD). If not, manually install the Root CA's certificate into the Trusted Root CA store on the Issuing CA machine (using the Certificates MMC or certutil -addstore -f Root rootCA.cer). This prevents "chain not trusted" warnings on the Issuing CA and ensures it trusts its parent. In an enterprise environment, you would also distribute the root certificate to all client machines (e.g., via Group Policy) so that they trust the whole chain.

Import the Root CRL: Copy the Root CA's CRL (*.crl file) to the Issuing CA's CRL distribution point location (e.g., C:\Windows\System32\CertSrv\CertEnroll\, if that is the directory served by the web server). This matches the HTTP URL we configured on the root. Place the CRL file there and ensure it is accessible; the Issuing CA's IIS might need to serve static .crl files, and if Web Enrollment is installed, the CertEnroll folder is often under C:\Inetpub\wwwroot\CertEnroll. At this point, the subordinate CA and any client hitting the HTTP URL can retrieve the root's CRL.

The subordinate CA is now fully established. It holds a certificate issued by the Root CA (forming a complete chain of trust), and it's ready to issue end-entity certificates.
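Steps 4 and 5 can likewise be performed from the command line. A hedged sketch, assuming the root's configuration string is "ROOTCA\Contoso Root CA" and that the submitted request receives Request ID 2 (both illustrative; a standalone root holds submitted requests as pending by default):

# On the Root CA: submit the request, issue it, and retrieve the signed certificate
certreq -submit -config "ROOTCA\Contoso Root CA" C:\Transfer\issuingCA.req
certutil -resubmit 2
certreq -retrieve -config "ROOTCA\Contoso Root CA" 2 C:\Transfer\issuingCA.cer

# Back on the Issuing CA: install the certificate and start the service
certutil -installcert C:\CertRequest\issuingCA.cer
Start-Service certsvc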
Step 6: Configure Issuing CA Settings and Start Services

Start Certificate Services: If the CA service (CertSvc) isn't started automatically, start or restart it from PowerShell:

Restart-Service certsvc

The CA should show as running in the CA console under the name "Contoso Issuing CA" (or your chosen name).

Configure certificate templates: Because this is an Enterprise CA, it can use certificate templates stored in Active Directory to simplify issuing common certificate types (user authentication, computer authentication, web server SSL, etc.). By default, some templates (e.g., User, Computer) are available but not issued. In the Certification Authority console, under Certificate Templates, choose which templates to issue (right-click > New > Certificate Template to Issue, then select templates like "User" or "Computer"). This lab guide doesn't require specific templates, but note that only Enterprise CAs can use templates. Templates define the policies and settings (cryptography, enrollment permissions, etc.) for issued certificates. Enable only the templates you need and configure their permissions appropriately (e.g., allow the appropriate groups to enroll).

Set the CRL publishing schedule: The Issuing CA automatically publishes its own CRL (for certificates it issues) at intervals. You can adjust the base and delta CRL publication intervals in the CA's Properties > CRL Period, or from the command line as shown in the sketch below. A common practice for issuing CAs is a short base CRL period (e.g., one or two weeks), because they may revoke certificates more frequently, combined with delta CRLs (published daily) for timely revocation information. Also verify that the CDP/AIA extensions for the Issuing CA itself are properly configured (the wizard usually sets LDAP and HTTP locations; check the Extensions tab). In a lab, the default settings are fine.

Web Enrollment (if installed): Verify web enrollment by browsing to http://<IssuingCA>/certsrv. This web UI allows browser-based certificate requests. It is largely a legacy interface, but for testing it can be used if your clients aren't domain-joined or if you want a manual request method. In modern use, the Certificate Enrollment Web Service/Policy roles or auto-enrollment via Group Policy are preferred for remote and automated enrollment.

At this stage, your PKI is operational: the Issuing CA trusts the offline Root CA and can issue certificates. The Root CA can be kept offline with confidence that the subordinate will handle all regular work.
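As referenced above, the CRL schedule can also be set from the command line. A sketch with example values (one-week base CRL, daily deltas):

certutil -setreg CA\CRLPeriodUnits 1
certutil -setreg CA\CRLPeriod "Weeks"
certutil -setreg CA\CRLDeltaPeriodUnits 1
certutil -setreg CA\CRLDeltaPeriod "Days"

# Restart the service so the new schedule takes effect, then publish a fresh CRL
Restart-Service certsvc
certutil -crl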
Validation and Testing of the PKI

It's important to verify that the PKI is configured correctly:

Check CA status: On the Issuing CA, open the Certification Authority console and confirm there are no errors; the Issuing CA's certificate should show as OK (no red X). For the overall hierarchy, use the Pkiview.msc snap-in (Microsoft PKI Health Tool) on a domain-connected machine; it shows whether the CDP/AIA locations are reachable and whether certificates are properly published.

Trust chain on clients: On a domain-joined client PC, the Root CA certificate should be present in the Trusted Root Certification Authorities store (if the Issuing CA was installed as an Enterprise CA, it likely published the root certificate to AD automatically; you can also distribute it via Group Policy or manually). The Issuing CA's certificate should appear in the Intermediate Certification Authorities store. This establishes the chain of trust. If not, import the root certificate into the domain's Group Policy for Trusted Roots. A quick test: on a client, run certutil -config "ISSUINGCA\Contoso Issuing CA" -ping to see if it can contact the CA (or use the Certification Authority MMC targeting the issuing CA).

Enroll a test certificate: Try to enroll for a certificate from the Issuing CA. For instance, from a domain-joined client, use the Certificates MMC (in the Current User or Computer context) and initiate a certificate request for a User or Computer certificate (depending on the templates issued). If auto-enrollment is configured via Group Policy for a template, simply log on to a client and see whether it automatically receives a certificate. Alternatively, use the web enrollment page or the certreq command to submit a request. The request should be approved and a certificate issued by "Contoso Issuing CA". After enrollment, inspect the issued certificate: it should chain up to "Contoso Root CA" without errors. Ensure that the certificate's CDP points to the URL we set (try browsing that URL to see the CRL file) and that the AIA points to the root certificate location.

Revocation test (optional): To test CRL behavior, revoke a test certificate on the Issuing CA (using the CA console) and publish a new CRL. On the client, after the CRL is updated, the revoked certificate should show as revoked. For the Root CA, since it shouldn't issue end-entity certificates, you wouldn't normally revoke anything except, in a drastic compromise scenario, the subordinate CA's certificate.

By issuing a test certificate and validating the chain and revocation, you confirm that your two-tier PKI lab is functioning correctly.
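Two of these checks are easy to script. A sketch, assuming the built-in User template has been issued and the test certificate has been exported to an example path:

# Request a certificate from the Enterprise CA (run as the enrolling user)
Get-Certificate -Template "User" -CertStoreLocation Cert:\CurrentUser\My

# Verify chain building and revocation, fetching the AIA/CDP URLs live
certutil -verify -urlfetch C:\Temp\testcert.cer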
Maintaining the PKI: CRLs, Key Ceremonies, and Security Procedures

Deploying the PKI is only the beginning. Proper maintenance and operational procedures are crucial to keep the PKI secure and reliable over time.

Periodic CRL updates for the offline root: The Root CA's CRL has a defined validity period (set during configuration, often 6 or 12 months for offline roots). Before the CRL expires, the Root CA must be brought online (in a secure environment) to issue a new one. Schedule CRL updates periodically (e.g., semi-annually) so the CRL never expires: an expired root CRL causes certificate chain validation to fail, potentially disrupting services. Typically, organizations set the offline root CRL validity so that publishing once or twice a year is sufficient. When the time comes:

Start the Root CA (ensuring the system clock is correct).

Run certutil -crl to issue a fresh CRL.

Distribute the new CRL: copy it to the HTTP CDP location (overwriting the old file) and, if applicable, use certutil -dspublish -f RootCA.crl to update it in Active Directory.

Verify that the new CRL's next update date is extended appropriately (e.g., another 6 months out).

Clients and the Issuing CA will automatically pick up the new CRL when checking revocation. (If the root CRL expires unexpectedly, the Issuing CA may cache the stale CRL and need a restart, or a certutil -setreg ca\CRLFlags +CRLF_REVCHECK_IGNORE_OFFLINE workaround. Keeping to the schedule prevents such issues.)

Issuing CA CRL and OCSP: The Issuing CA's CRLs are published automatically since it is online; just ensure the IIS site or file share hosting the CRL stays accessible. Optionally, consider setting up an Online Responder (OCSP) for real-time status checking, especially if CRLs are large or you need faster revocation information. OCSP is another AD CS role service that can be configured on the issuing CA or another server to answer certificate status queries. This might be beyond a simple lab, but it's worth mentioning for completeness.

Key ceremonies and documentation: For production environments (and as good practice even in labs), formalize the handling of CA keys in a key ceremony: a carefully controlled process for activities like generating the Root CA's key pair, installing the CA, and signing subordinate certificates. It typically involves multiple people, so that no single person has unilateral control (the principle of dual control) and the process is witnessed. Best practices for a Root CA key ceremony include:

Advance planning: Create a step-by-step script of the ceremony tasks, including who will do what, what materials are needed (HSMs, installation media, backup devices, etc.), and the order of operations.

Multiple trusted individuals present: Roles might include a Ceremony Administrator (leads the process), a Security Officer (handles the HSM or key material), and an Auditor (observes and records). This prevents any one person from manipulating the process and increases trust.

Secure environment: Conduct the ceremony in a secure location (e.g., a locked room) free of recording devices and unauthorized personnel. Ensure the Root CA machine is isolated (no network) and, ideally, that BIOS/USB access controls are in place to prevent any malware introduction.

Generate keys with proper controls: If using an HSM, initialize and generate the key with the required number of key custodians, each providing part of the activation material (e.g., smartcards or passphrases). Immediately back up the HSM partition or key to secure media (requiring the same custodians to restore).

Sign the subordinate CA certificate: Once the root key is ready, sign the subordinate's request as part of the ceremony; this too can be a witnessed step.

Document every action: Record each command run, each key generated, and the serial numbers of devices used, and have all participants sign an acknowledgment of the outcomes. Also record the fingerprints of the generated Root CA certificate and any subordinate certificate to confirm they are exactly as expected.

Secure storage: After the ceremony, store the Root CA machine (if it's a laptop or VM) and HSM tokens in a tamper-evident bag or safe, so that any attempt to access the root outside an authorized ceremony is evident.

While a full key ceremony might be overkill for a small lab, understanding these practices is important. Even in a lab, you can simulate some aspects for learning, such as documenting the procedure of taking the root online to sign the request and then locking it away. These practices greatly increase trust in a production PKI by ensuring transparency and accountability for critical operations.

Backup and recovery plans: Both CAs' data should be regularly backed up:

For the Root CA: Since it is rarely online, back it up after any change. Typically, you back up the CA's private key and certificate once (right after setup or any renewal) and store the backup securely offline, separate from the server itself. Also back up the CA database if it ever issues more than one certificate (a root may not issue many).

For the Issuing CA: Schedule automated backups of the CA database and private key. You can use the built-in certutil -backup or Windows Server Backup (which is aware of the AD CS database). Keep backups secure and test restoration procedures. Having a documented recovery procedure for the CA is crucial for continuity. Also consider backing up templates and any scripts, and maintain spare hardware or VMs in case you need to restore the CA on a new machine (especially for the root, have a procedure to restore on new hardware if the original is destroyed).
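For the Issuing CA, the backup can be scripted and scheduled. A minimal sketch using the ADCSAdministration PowerShell module, assuming an example path (protect this folder carefully, since it contains the CA key):

# Back up the CA database and private key; the password protects the exported key
Backup-CARoleService -Path "C:\CABackup" -Password (Read-Host -Prompt "Backup password" -AsSecureString)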
Security maintenance: Apply OS updates to the CAs carefully. For the offline root, patch it offline if possible (offline servicing, or connecting it briefly to an isolated management network). Treat the issuing CA as a critical infrastructure server: limit its exposure (firewall it so only required services are reachable), monitor its event logs (enable auditing for Certificate Services events, which can log each issuance and revocation; a sketch of enabling this follows below), and deploy anti-malware tools with caution (whitelist the CA processes to avoid interference). Also, periodically review the CA's configuration and certificate templates to ensure they meet current security standards, for example deprecating weak cryptography or adjusting validity periods as needed.
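As mentioned, auditing is worth enabling explicitly. A sketch of the usual two steps (the audit filter value 127 enables all seven CA audit categories):

certutil -setreg CA\AuditFilter 127
auditpol /set /subcategory:"Certification Services" /success:enable /failure:enable

# Restart the service so the audit setting takes effect
Restart-Service certsvc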
By following these maintenance steps and best practices, your two-tier PKI will remain secure and trustworthy over time. Remember that PKI is not "set and forget": it requires operational diligence, but the payoff is a robust trust infrastructure for your organization's security.

Additional AD CS Features and References

Active Directory Certificate Services provides more capabilities than this basic lab covers. Depending on your needs, you might explore:

Certificate Templates: We touched on templates; they are a powerful feature of Enterprise CAs for enforcing standardized certificate settings. Administrators can create custom templates for various use cases (SSL, S/MIME email, code signing) and control enrollment permissions. Understanding template versions and permissions is key for enterprise deployments. (Refer to Microsoft's documentation on certificate template concepts in Windows Server for details on how templates work and can be customized.)

Web Services for Enrollment: In scenarios with remote or non-domain clients, AD CS offers the Certificate Enrollment Web Service (CES) and Certificate Enrollment Policy Web Service (CEP) role services. These allow clients to fetch enrollment policy information and request certificates over HTTPS, even when not connected directly to the domain. They work with certificate templates to enable a similar auto-enrollment experience over the web. See Microsoft's guides on the Certificate Enrollment Web Service overview and Certificate Enrollment Policy Web Service overview for when to use these.

Network Device Enrollment Service (NDES): This AD CS role service implements the Simple Certificate Enrollment Protocol (SCEP), allowing devices like routers, switches, and mobile devices to obtain certificates from the CA without domain credentials. NDES acts as a proxy (Registration Authority) between devices and the CA, using one-time passwords for authentication. If you need to issue certificates to network equipment or MDM-managed mobile devices, NDES is the solution. Microsoft Docs provide a Network Device Enrollment Service (NDES) overview, including details on using a policy module with NDES for advanced scenarios (such as customizing how requests are processed or integrating with custom policies).

Online Responders (OCSP): As mentioned, an Online Responder can answer revocation status queries more efficiently than CRLs, which is especially useful if your CRLs grow large or you have high-volume certificate validation (VPNs, etc.). The Online Responder role service can be installed on a member server and configured with an OCSP Response Signing certificate from your Issuing CA.

Monitoring and Auditing: Windows Server can audit CA events, logging certificate issuance, revocation, and changes to the CA configuration. These logs are important in an enterprise PKI for tracking who did what (for compliance and security forensics). Tools like the PKI Health Tool (pkiview.msc) and PowerShell cmdlets such as Get-CertificationAuthority and Get-CertificationAuthorityCertificate (available in the community PSPKI module) help monitor the health and configuration of your CAs.

Conclusion

By following this guide, you have set up a secure two-tier PKI environment consisting of an offline Root CA and an online Issuing CA. This design, with its offline root, is considered a security best practice for enterprise PKI deployments because it sharply reduces the risk of the root key being compromised. With the offline Root CA acting as a hardened trust anchor and the enterprise Issuing CA handling day-to-day certificate issuance, your lab PKI can issue certificates for various purposes (HTTPS, code signing, user authentication, etc.) in a way that models real-world deployments.

As you expand this lab or move to production, remember that PKI security is as much about process as technology. Applying strict controls to protect CA keys, keeping software up to date, and monitoring your PKI's health are all part of the journey. For further reading and official guidance, refer to these Microsoft documentation resources:

📖 AD CS PKI Design Considerations: "PKI design considerations using Active Directory Certificate Services in Windows Server" helps in planning a PKI deployment (number of CAs, hierarchy depth, naming, key lengths, validity periods, etc.). It is useful reading when adapting this lab design to a production environment, and it also covers configuring CDP/AIA and why offline roots usually don't need delta CRLs.

📖 AD CS Step-by-Step Guides: Microsoft's "Test Lab Guide: Deploying an AD CS Two-Tier PKI Hierarchy" walks through a similar scenario.