High-Fidelity Network Observability at Scale — ACNS Metrics Filtering and Log Aggregation Now GA
We are thrilled to announce that Advanced Container Networking Services (ACNS) for Azure Kubernetes Service (AKS) now delivers two powerful observability features in general availability: container network metrics filtering and container network log filtering and aggregation. Together, these capabilities set a new standard for Kubernetes network observability, giving you high-fidelity visibility at dramatically lower cost and noise. They fundamentally redefine how network observability works at scale while delivering up to 97% cost reduction.

Why is this a milestone?

Most Kubernetes observability solutions face a fundamental tension: collect everything and drown in noise and cost, or sample and miss the signals that matter. ACNS breaks that tradeoff. With this release, Azure becomes the first cloud provider to deliver on-node metrics filtering and flow log aggregation for Kubernetes networking, capabilities now also contributed to the upstream Hubble project and available to the broader open-source community.

For AKS customers running Cilium-based clusters, this means:
- Every flow you care about is captured. Everything else is dropped at the source.
- Log volume is compressed by up to 45% through aggregation, without losing security verdicts or error context.
- Costs scale with what you monitor, not with cluster size.

What's been improved in ACNS observability?

This release introduces two capabilities that work together: container network metrics filtering and container network log filtering and aggregation. Both are available on AKS clusters with the Cilium data plane and give you precise controls to keep observability costs predictable while maintaining the visibility you need.

Container Network Metrics Filtering

Container network metrics are generated for all pods by default whenever ACNS is enabled. With metrics filtering, you now control what gets collected at the point of ingestion, on the node, before anything is scraped or transmitted. A single ContainerNetworkMetric CRD per cluster defines which metric types (dns, flow, tcp, drop), namespaces, pod labels, and protocols to ingest. It supports both include and exclude filters, so you can maintain broad collection while carving out specific workloads or namespaces. Anything that doesn't match is dropped on the node. Changes reconcile in a few seconds, with no Cilium agent or Prometheus restarts required.

Container Network Log Filtering and Aggregation

Unlike metrics, container network logs are not generated automatically. You start capturing network flows only after applying a ContainerNetworkLog CRD that defines exactly which traffic to capture: by namespace, pod, service, protocol, or verdict. Only matching flows are logged, giving you a precise, targeted view rather than a fire hose.

This is where Azure's first-to-market innovation comes in. Flow log aggregation, now built into ACNS and contributed upstream to Hubble for the open-source community, groups similar flows into summarized records every 30 seconds. The result is dramatically reduced data volume while preserving security verdicts, service identity, and error context. What previously required custom post-processing pipelines is now built directly into the platform, before storage costs are incurred.

Every matched flow log captures source and destination pods, namespaces, ports, protocols, traffic direction, and policy verdicts. Logs are stored in a Log Analytics workspace (ContainerNetworkLogs table) with a choice of the Analytics or Basic tier.
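To make the capture scope concrete, here is an illustrative sketch of a flow-log capture resource scoped to dropped traffic in a single namespace. The apiVersion, field names, and structure are assumptions for illustration only, not the documented schema; see the setup guide linked at the end of this post for the exact spec.

apiVersion: acn.azure.com/v1alpha1      # illustrative group/version; check the docs for the real one
kind: ContainerNetworkLog
metadata:
  name: dropped-traffic-logs
spec:
  include:
    - namespaces: ["payments"]          # hypothetical namespace to watch
      verdict: ["dropped"]              # capture only flows denied by policy

Only flows matching a scope like this are logged into the ContainerNetworkLogs table, which is what keeps collection targeted rather than a fire hose.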
Built-in Azure portal dashboards are available for both tiers. Logs can also be exported to external log collectors such as Splunk or Datadog.

First to Market: Azure and the upstream Hubble Contribution

ACNS's filtering and aggregation capabilities were engineered from the ground up to solve real production observability challenges at scale. Rather than keeping this innovation proprietary, Azure contributed the log aggregation and filtering capabilities to the upstream Hubble project, the observability layer of the Cilium ecosystem. This means:
- AKS customers get a fully managed, Azure-native experience with portal dashboards, Log Analytics integration, and Grafana visualization, out of the box.
- The broader open-source community gains access to the same filtering and aggregation primitives through upstream Hubble.

Azure is the first to ship this capability in a managed Kubernetes service, and the first to give it back to the community.

Key Benefits

💰 Lower observability cost. Metrics filtering drops unwanted data on the node before Prometheus ever scrapes it. Flow log aggregation compresses log data by up to 97% in lab testing. Your cost scales with what you choose to monitor, not with cluster size.

📉 Less noise, more signal. Metrics filtering carves out the namespaces and workloads that matter, so dashboards show only relevant signals. Log filters scope collection to specific pods and verdicts. Engineers start every investigation with data that's already relevant.

⚡ Faster root-cause isolation. Every metric carries source and destination pod context. Targeted flow logs add the forensic detail: which policy, destination, or port is involved. Together, they cut mean time to resolution from hours of guesswork to minutes of structured investigation.

🔒 Full signal, zero gaps. ACNS doesn't sample. Within the scope you define, every flow is captured and every pattern is preserved. Aggregation compresses volume without losing security verdicts or error context.

Who Benefits

- Platform engineers managing multi-tenant clusters can scope data collection per namespace, so each team gets visibility into their own traffic without contributing to a shared cost pool.
- SREs can isolate packet drops, TCP resets, or DNS failures to a specific workload in minutes, starting with data that's already scoped to what matters.
- Decision-makers evaluating observability spend get predictable, controllable ingestion costs that scale with intent, not infrastructure size.

How to optimize ACNS metrics and logs with filtering?

1. Enable ACNS on your AKS cluster with the Cilium data plane:

   az aks create --resource-group $RESOURCE_GROUP --name $CLUSTER --network-plugin azure --network-dataplane cilium --enable-acns

   Or on an existing cluster:

   az aks update --resource-group $RESOURCE_GROUP --name $CLUSTER --enable-acns

2. Apply a ContainerNetworkMetric CRD to filter which metrics are collected on each node (a sketch follows these steps). Start by excluding noisy system namespaces, then scope to business-critical workloads.

3. Apply a ContainerNetworkLog CRD to define which flows to capture. Enable Azure Monitor integration with --enable-container-network-logs to send logs to a Log Analytics workspace, or export logs from the node to an external logging system such as Splunk or Datadog.

4. Check your dashboards. Open your cluster in the Azure portal and go to Monitor > Insights > Networking for bytes, drops, DNS errors, and flows. For flow logs, use the built-in Azure portal dashboards available for both Basic and Analytics tiers.
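For step 2, a similarly hedged sketch of a metrics filter resource. Again, the apiVersion and field names are illustrative assumptions, not the documented schema; consult the configuration guide linked below for the exact spec.

apiVersion: acn.azure.com/v1alpha1      # illustrative; check the docs for the real group/version
kind: ContainerNetworkMetric
metadata:
  name: cluster-metric-filter           # a single resource per cluster
spec:
  filters:
    - metricType: flow                  # dns | flow | tcp | drop
      include:
        namespaces: ["payments", "checkout"]   # hypothetical business-critical namespaces
      exclude:
        namespaces: ["kube-system"]            # drop noisy system traffic on the node

The intent of include and exclude mirrors the behavior described above: broad collection stays on, while specific namespaces are carved in or out before anything is scraped.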
Conclusion

Kubernetes network observability has long meant choosing between visibility and cost. With container network metrics filtering and log filtering and aggregation now GA in ACNS and contributed to upstream Hubble for the open-source community, that tradeoff is gone. Azure is first to market with this capability. AKS customers get it fully managed, out of the box, with built-in dashboards and Log Analytics integration. And the broader Cilium ecosystem gets it through upstream Hubble. High-fidelity visibility. Lower cost. No compromise.

Learn more:
- Container network metrics overview: Container network metrics overview - Azure Kubernetes Service | Microsoft Learn
- Container network logs overview: Container Network Logs Overview - Azure Kubernetes Service | Microsoft Learn
- Configure container network metrics filtering: Configure Container network metrics filtering for Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn
- Set up container network logs: Set up container network logs - Azure Kubernetes Service | Microsoft Learn
Public Preview: Managed Identity support for graphical session recording
Overview

Azure Bastion provides secure RDP and SSH access to Azure virtual machines directly via the Azure portal or via the native SSH/RDP client already installed on your local computer. Today, we are introducing public preview of managed identity support for session recording, giving administrators a seamless, identity-based way to authenticate Bastion when writing recordings to a designated storage account.

Why Managed Identities?

With managed identity support, Bastion authenticates directly to your storage account using an Azure identity; there are no additional credentials to configure or manage. You can use either a system-assigned or user-assigned managed identity depending on your needs. Authentication is handled automatically through Microsoft Entra ID, which means setup is straightforward: enable the identity, assign a role, and point Bastion at your storage container. For organizations operating at scale across many Bastion deployments and regions, this identity-based approach removes the need to manage credentials, aligns with Zero Trust principles, and lets you control access centrally through Azure RBAC.

Getting Started in Azure Portal

Prerequisites
- Ensure that Azure Bastion is deployed with the Premium SKU
- Ensure that a storage account with a dedicated container for session recordings is created
- Ensure that the storage account has the required CORS policy configured. Click here to set up the storage account for session recordings
- Ensure that users who need to view recordings have the Storage Blob Data Reader role on the storage account

Steps
1. Navigate to your Bastion resource in the Azure portal.
2. Select Identity (Preview) in the left pane and turn the Status to On to enable a system-assigned managed identity. Wait for the configuration to complete.
3. Select Azure role assignments, then select Add role assignment (Preview). Assign the Storage Blob Data Contributor role scoped to your storage account.
4. Select Save, then navigate to the Configuration blade.
5. Under Session Recording Configuration, select System Assigned Managed Identity and enter the Blob Container URI for your storage container.
6. Navigate to the Session recordings blade to view and play back recorded sessions.

Next Steps

Learn more about configuring session recording with managed identities here and keep up to date with all things Azure Bastion in our What's New page.
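If you prefer to script the role assignment (step 3) rather than use the portal blade, the standard Azure CLI role-assignment command works once the managed identity exists. The identifiers below are placeholders:

az role assignment create \
  --assignee "<bastion-managed-identity-principal-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"

This is only a sketch of the RBAC step; enabling the identity itself and pointing Bastion at the blob container still follow the portal steps above.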
Announcing Microsoft Azure Network Adapter (MANA) support for Existing VM SKUs

As a leader in cloud infrastructure, Microsoft ensures that Azure's IaaS customers always have access to the latest hardware. Our goal is to consistently deliver technology to support business-critical workloads with world-class efficiency, reliability, and security. Customers benefit from cutting-edge performance enhancements and features, helping them to future-proof their workloads while maintaining business continuity.

Azure will be deploying the Microsoft Azure Network Adapter (MANA) for existing VM Size Families. Deployment timeline to be announced by mid-to-late April. The intent is to provide the benefits of new server hardware to customers of existing VM SKUs as they work towards migrating to newer SKUs. The deployments will be based on capacity needs and won't be restricted by region. Once the hardware is available in a region, VMs can be deployed to it as needed.

Workloads on operating systems which fully support MANA will benefit from sub-second Network Interface Card (NIC) firmware upgrades, higher throughput, lower latency, increased security, and Azure Boost-enabled data path accelerations. If your workload doesn't support MANA today, you'll still be able to access Azure's network on MANA-enabled SKUs, but performance will be comparable to previous-generation (non-MANA) hardware.

Check out the Azure Boost Overview and the Microsoft Azure Network Adapter (MANA) overview for more detailed information and OS compatibility. To determine whether your VMs are impacted and what actions (if any) you should take, start with MANA support for existing VM SKUs. This article provides additional information about which VM Sizes are eligible to be deployed on the new MANA-enabled hardware, what actions (if any) you should take, and how to determine if the workload has been deployed on MANA-enabled hardware.
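If you want a quick spot check of whether a particular Linux VM has already landed on MANA-enabled hardware, one rough approach, assuming a distribution whose kernel ships the upstream mana driver, is to look for the driver and device from inside the guest:

# Is the MANA kernel driver loaded? (the upstream Linux driver module is named "mana")
lsmod | grep -i mana

# MANA NICs typically appear as Microsoft PCI devices (requires the pciutils package)
lspci | grep -i microsoft

Treat this as a rough check only; the MANA support for existing VM SKUs article referenced above is the authoritative guide for determining impact and OS compatibility.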
Deploying Azure Redis Enterprise with Geo-Replication Using Terraform

This post walks through a production-proven pattern for running stateful services across Azure regions using Terraform. We'll cover a primary-replica Redis architecture, regional isolation with Key Vault and networking, and a clean Terraform parameterization strategy that scales from development to production without duplication.

Why Multi-Region State Is Hard

Running applications globally is easy when everything is stateless: if something fails, you redeploy. But stateful services tell a different story. Caches, message brokers, and data stores can't be treated as disposable. They hold business-critical data, and downtime or inconsistency quickly becomes customer-visible.

In real-world systems, common requirements include:
- Low-latency reads from multiple regions
- Automatic recovery when a region becomes unavailable
- Predictable data consistency
- Repeatable infrastructure from dev through production

Manually configuring this per region doesn't scale. Drift sets in. Failover is unclear. Backups get forgotten. That's where Terraform + Azure Managed Redis geo-replication shines.

GitHub link: https://github.com/vsakash5/Managed-redis.git

High-Level Architecture

We use a primary-replica Redis Enterprise model:

Primary Redis
- Single write endpoint
- Highly available inside its region
- Source of truth

Replica Redis
- Read-only
- Asynchronously synced from primary
- Can be promoted during disaster recovery

Each region is fully isolated:
- Separate subnets
- Separate Key Vaults
- Private Endpoints only (no public exposure)

This prevents shared failure domains and allows each region to operate independently if needed.

The Terraform Design Principle

Instead of maintaining separate Terraform stacks per region, the key idea is: one reusable module, one tfvars file per environment, multiple regions inside it. The module is written once. Regional differences are supplied via parameter suffixes like:
- _replica
- _secondary
- _tertiary

This keeps logic centralized and environments consistent.

Core Parameter Layers

1. Environment Identity (Shared)

environment    = "dev"   # dev | staging | prod
context_prefix = "app"

These values are reused everywhere: names, tags, and identifiers.

2. Primary Region

location            = "eastus2"
resource_group_name = "rg-app-dev-primary"

3. Replica Region

location_replica            = "uksouth"
resource_group_name_replica = "rg-app-dev-replica"

The symmetry is intentional. Terraform can now apply the same module twice without branching logic.
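Putting the parameter layers together, a hypothetical root-module layout might instantiate the same module twice, once per region. The module path, module names, and variable interface below are illustrative and not the exact interface of the linked repository:

module "redis_primary" {
  source              = "./modules/managed_redis"   # illustrative module path
  environment         = var.environment
  context_prefix      = var.context_prefix
  location            = var.location
  resource_group_name = var.resource_group_name
}

module "redis_replica" {
  source              = "./modules/managed_redis"
  environment         = var.environment
  context_prefix      = var.context_prefix
  location            = var.location_replica
  resource_group_name = var.resource_group_name_replica
}

The point is the symmetry: the replica call consumes the _replica-suffixed variables, and nothing else changes.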
Regional Isolation: Networking and Secrets

Why isolation matters
Geo-replication copies data, not dependencies. If both Redis instances depend on the same subnet or the same Key Vault, then a failure in one region can cascade into the other.

Networking (One Subnet per Region)
Benefits:
- Independent NSGs
- Independent routing
- Independent capacity planning

Key Vault (One per Region)
Why this matters:
- Redis credentials are not replicated
- Each region stores its own secrets
- A Key Vault outage doesn't take both regions down

Redis Configuration

Primary Redis (Writes Enabled)
The geo-replication group name must match. That's the logical binding Azure uses to link instances.

Private Endpoint-Only Access
No Redis instance is exposed publicly. Each region uses:
- A private endpoint
- A workload subnet
- Internal DNS resolution

This means:
- No public IPs
- No inbound attack surface
- Traffic stays on the Azure backbone

Linking Primary and Replica

Terraform explicitly defines the relationship:

managed_redis_geo_replication_config = {
  primary_to_replica = {
    primary_redis_key = "primary"
    replica_keys      = ["replica"]
  }
}

Terraform ensures:
- Primary is created first
- Replica is deployed second
- Geo-replication is established last

Environment Scaling: Dev → Staging → Prod

The infrastructure pattern never changes. Only values do.

Environment   Group Name
Dev           dev-grp
Staging       stg-grp
Prod          prod-grp

This is how you avoid "snowflake" environments.

Disaster Recovery Strategy

If the primary region fails:
1. Applications fail over to the replica read endpoint
2. Terraform configuration is updated to remove geo-replication and promote the replica config to primary
3. Traffic is fully restored

Once the original region recovers, roles can be re-established cleanly. No click-ops. No guesswork.

Key Lessons Learned

1. Naming is Infrastructure
Predictable names enable automation, discovery, and auditing.

2. Key Vault Isolation Beats Availability
A shared Key Vault is a shared outage.

3. Parameterization Beats Copy-Paste
Fix once → benefit everywhere.

4. Geo-Replication Is a Contract
Matching replication group names is non-negotiable.

5. The tfvars File Is the Source of Truth
If it's not in Terraform, it's not real.

Final Thoughts

Running stateful services in multiple regions doesn't require magic; it requires discipline:
- Isolate aggressively
- Parameterize consistently
- Automate everything
- Test failure often

With this approach, adding a new region becomes configuration, not redesign. That's how infrastructure scales.
Rethinking Data Modeling: How GitHub Copilot Is Changing the Way We Design Systems

There is a point in every engineer's journey where data modeling stops feeling intuitive. What starts as a clean schema becomes harder to reason about. Relationships blur. Query paths multiply. Small changes ripple across different steps. This is where data modeling reveals its real challenge: not just storing data but preserving clarity as systems scale. And this is where GitHub Copilot became a practical accelerator for us.

WHEN DATA MODELS STOP SCALING WITH OUR THINKING

Data modeling is rarely blocked by SQL syntax. It is blocked by architecture decisions. As architects and data engineers, we repeatedly ask:
- What are the true domain entities versus temporary implementation details?
- Which relationships must be strict, and which can stay flexible?
- How do we model for today without breaking six months later?

Traditionally, we solved this through whiteboarding, manual DDL rewrites, and slow iteration cycles. It worked, but velocity was low. GitHub Copilot changed that loop. It reduced the distance between intent and a concrete model draft. We used GitHub Copilot Chat in VS Code. Inline suggestions helped during DDL refinement, while Chat mode helped more with architecture reasoning and constraint validation.

THE REAL EXAMPLE

SaaS control plane: A SaaS control plane is the administrative backbone of a multi-tenant platform. It is separate from the actual application workload (the data plane) and is responsible for tenant lifecycle management: who gets access to what, under which terms, in which environment, and in what deployment state. Think of it as the system that answers the question "who is running what, where, and in what condition" at any point in time. It is used because multi-tenant SaaS products cannot afford to manage tenant provisioning, subscription entitlements, and infrastructure allocation manually at scale. The control plane automates and tracks these operations consistently, reliably, and with full auditability.

We used Copilot to accelerate a control-plane style data model where:
- A tenant is onboarded
- The tenant selects an application
- The tenant selects a service tier
- The tenant selects a target environment
- A deployment is orchestrated
- Required infrastructure is provisioned
- Deployment and infrastructure details are captured for audit and operations

This was not a toy schema. Every entity had lifecycle states, ownership boundaries, and downstream consumers in data engineering and observability.

THE DIFFICULTIES WE FACED WITHOUT COPILOT

Boundary confusion
We initially mixed desired state (what should happen) and runtime state (what actually happened). That caused inconsistent query behavior and weak auditability.

Relationship drift
As deployment history and infra tracking were added, early cardinality assumptions changed. A simple one-to-one became one-to-many after redeployments and rollbacks.

Iteration cost
Every revision required manual rewrites across table definitions, constraints, and foreign key references. Naming drift became a recurring review issue.

Team alignment friction
Architecture discussions moved faster than our ability to create review-ready schema drafts.

HOW GITHUB COPILOT HELPED

Faster first draft
We described onboarding and deployment flow in plain language and got a complete relational baseline quickly. A process that previously took long whiteboard sessions and multiple rewrite cycles reached reviewable shape in one focused afternoon.
Pattern recall at the right time
Copilot consistently surfaced useful structures:
- A subscription pivot entity to decouple tenant, application, tier, and environment
- Immutable deployment history records
- Infra resource catalog with resource typing
- Deployment event trail for auditability

Rapid refinement
When assumptions changed, Copilot regenerated affected structures consistently. We did not need to manually patch every dependent reference.

Better design conversations
Because drafts appeared faster, reviews shifted from syntax fixes to architecture quality:
- Is tenant isolation explicit?
- Is deployment history trustworthy?
- Are infrastructure dependencies traceable?
- Can data engineering build analytics without fragile joins?

CORE RELATIONSHIPS FINALIZED

- Tenant to Tenant App Subscription: one-to-many
- Application to Tenant App Subscription: one-to-many
- Service Tier to Tenant App Subscription: one-to-many
- Environment to Tenant App Subscription: one-to-many
- Tenant App Subscription to Deployment: one-to-many
- Deployment to Infra Resources: one-to-many
- Deployment to Deployment Events: one-to-many

LOGICAL RELATIONSHIP VIEW

Tenant -> Tenant App Subscription
Application -> Tenant App Subscription
Service Tier -> Tenant App Subscription
Environment -> Tenant App Subscription
Tenant App Subscription -> Deployment
Deployment -> Infra Resources
Deployment -> Deployment Events

SAMPLE PROMPTS WE USED

Prompt 1: Design a relational schema for a multi-tenant onboarding flow where tenants select application, service tier, and environment, followed by deployment and infrastructure tracking. Include relationship rationale.

Prompt 2: Given this schema, identify possible cardinality mistakes and suggest safer constraints for repeat deployments.

Prompt 3: Refine the model so deployment records are immutable history, while subscription stores current desired state.

Prompt 4: Suggest naming and normalization improvements so data engineers can build reliable analytics models from these tables.

Prompt 5: List operational risks in this model and which table-level attributes improve troubleshooting and auditability.

Prompt 6: Tenant license generation is now required before a tenant can access a selected application. Propose a minimal schema change that captures license issuance, validity period, and status without breaking existing subscription and deployment records.

THE THOUGHT PROCESS SHIFT

Before Copilot, most effort went into translating architecture ideas into SQL artifacts. With Copilot, that translation became fast enough that evaluation became the primary activity:
- Is tenant isolation explicit in schema or hidden in app logic?
- Is deployment history append-only and safe?
- Are infra resources independently traceable?
- Can analytics teams consume this model cleanly?

That shift from authoring to evaluating is where design quality improved.

SCHEMA EVOLUTION LESSONS

Two changes arrived mid-design:
1. License generation became a required step before a tenant could access a selected application. The license needed to capture issuance date, validity period, and revocation status.
2. Service-tier entitlements became time-variant.

In both cases, we described the evolution requirement in Copilot Chat with the existing schema as context. Copilot proposed additive changes: a new tenant license table linked to the subscription for the first case, with license status and validity tracked independently, and a separate tier entitlement history table for the second.
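To make the relationship list concrete, here is a minimal DDL sketch consistent with the cardinalities above and the additive license table described in the evolution lessons. Table and column names are illustrative, not the exact schema Copilot produced:

-- Parent entities (columns trimmed for brevity)
CREATE TABLE tenant       (tenant_id       BIGINT PRIMARY KEY, name VARCHAR(200) NOT NULL);
CREATE TABLE application  (application_id  BIGINT PRIMARY KEY, name VARCHAR(200) NOT NULL);
CREATE TABLE service_tier (service_tier_id BIGINT PRIMARY KEY, name VARCHAR(200) NOT NULL);
CREATE TABLE environment  (environment_id  BIGINT PRIMARY KEY, name VARCHAR(200) NOT NULL);

-- Pivot entity: current desired state for a tenant's app/tier/environment choice
CREATE TABLE tenant_app_subscription (
    subscription_id BIGINT PRIMARY KEY,
    tenant_id       BIGINT NOT NULL REFERENCES tenant(tenant_id),
    application_id  BIGINT NOT NULL REFERENCES application(application_id),
    service_tier_id BIGINT NOT NULL REFERENCES service_tier(service_tier_id),
    environment_id  BIGINT NOT NULL REFERENCES environment(environment_id),
    desired_state   VARCHAR(50) NOT NULL
);

-- Immutable runtime history: one row per deployment attempt, appended, never updated
CREATE TABLE deployment (
    deployment_id   BIGINT PRIMARY KEY,
    subscription_id BIGINT NOT NULL REFERENCES tenant_app_subscription(subscription_id),
    status          VARCHAR(50) NOT NULL,
    created_at      TIMESTAMP NOT NULL
);

-- Additive change from the evolution lessons: license issuance tracked independently
CREATE TABLE tenant_license (
    license_id      BIGINT PRIMARY KEY,
    subscription_id BIGINT NOT NULL REFERENCES tenant_app_subscription(subscription_id),
    issued_at       TIMESTAMP NOT NULL,
    valid_until     TIMESTAMP NOT NULL,
    status          VARCHAR(20) NOT NULL  -- e.g. active | revoked | expired
);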
WHERE ENGINEERING JUDGMENT STILL MATTERS

Copilot gives possibilities, not guarantees. Engineers still need to own:
- Domain truth and lifecycle semantics
- Cardinality correctness
- Performance and scale trade-offs
- Governance, compliance, and retention
- Backward-compatible migration strategy

Copilot output should always go through architecture review before production.

CLOSING THOUGHT

Data modeling is still an architectural discipline. What changed is iteration speed. GitHub Copilot helps teams move from idea to structure faster, compare alternatives earlier, and focus on design quality rather than mechanical authoring. The bottleneck is no longer writing schemas. It is thinking clearly about the problem. That is where AI-assisted modeling delivers real value.

TRY IT YOURSELF

1. Open GitHub Copilot Chat in VS Code
2. Describe your core entities and lifecycle questions
3. Ask for at least two schema alternatives
4. Compare constraints and operational trade-offs
5. Share your lessons with the community

AKS on Azure Local: KMSv1 -> KMSv2
Hey, quick question on AKS Arc — we're running moc-kms-plugin:0.2.172-official on an Arc-enabled AKS cluster on Azure Local and currently have KMSv1=true as a feature gate to keep encryption at rest working. KMSv1 is deprecated in 1.28+ and we want to migrate to KMSv2 before it gets removed. Since moc-kms-plugin is a Microsoft-managed component, we can't just swap it out ourselves.

A few questions:
1. Does version 0.2.172 already support the KMSv2 gRPC API, or is that coming in a later release?
2. Is there a supported migration path for AKS Arc specifically, or does this come automatically through a platform update?
3. Any docs or internal guidance you can point us to?

Thanks!

Azure Monitor Pipeline: A Modern Approach to Telemetry Ingestion at Scale
Large-scale observability often fails not because of analytics, but because of ingestion. As environments grow across regions, clouds, and on-premises sites, traditional log forwarding architectures struggle with scale, reliability, cost, and security. Azure Monitor Pipeline, now generally available, addresses these challenges by rethinking how telemetry enters Azure Monitor.

The Ingestion Problem Architects Know Too Well

At enterprise scale, telemetry pipelines hit familiar limits:
- Throughput constraints: Traditional forwarders drop events during spikes, especially beyond tens of thousands of events per second.
- Network fragility: Connectivity interruptions lead to permanent data loss, creating blind spots in security and operations.
- Rising costs: Shipping all logs, signal and noise alike, drives ingestion and storage costs without improving insight.
- Schema inconsistency: Heterogeneous log formats require constant downstream parsing and maintenance.
- Operational sprawl: Managing agents, certificates, and configs on thousands of hosts becomes unmanageable.

These issues compound in hybrid and multi-site environments, where centralized visibility is most critical.

What Azure Monitor Pipeline Changes

Azure Monitor Pipeline introduces a centralized telemetry ingestion layer that sits between sources and Azure Monitor. Rather than deploying agents everywhere, architects deploy pipelines strategically, per region, data center, or network segment, to aggregate and process telemetry before it reaches the cloud.

Key capabilities include:

Area          Traditional Forwarding      Azure Monitor Pipeline
Scale         Limited vertical scaling    Horizontal scaling to hundreds of thousands or millions of events/sec
Resilience    In-memory buffering         Persistent disk buffering with automatic backfill
Data quality  Manual parsing              Automatic schematization into Azure-native tables
Cost control  Post-ingestion filtering    Pre-cloud filtering, aggregation, enrichment
Security      Per-host cert management    Centralized TLS/mTLS with automated rotation

Architectural Implications

Design as infrastructure, not an agent. Treat the pipeline like regional ingestion infrastructure, similar to an API gateway. Common patterns include hub-spoke deployments in Azure, edge aggregation at branch sites, or hybrid topologies with on-premises buffering.

Plan for failure. Persistent buffering ensures telemetry is retained during outages and replayed automatically, preserving audit trails and compliance continuity.

Optimize at the edge. Filtering and sampling before ingestion can reduce volume by 40–70% while retaining high-value signals. Drop low-value logs, sample routine success paths, and keep all errors and security events.

Standardize schemas early. Automatic mapping to Azure-native tables eliminates downstream parsing and reduces detection breakage, especially in Microsoft Sentinel scenarios.

Scale horizontally. Kubernetes-based scaling allows pipelines to absorb traffic spikes predictably, supported by sizing guidance for capacity planning.

When to Use It

Azure Monitor Pipeline is a strong fit for high-volume security telemetry, hybrid or multi-site environments, and cost-sensitive observability platforms. For Azure VMs or AKS application monitoring, native Azure Monitor Agent or AKS OTLP ingestion may be simpler. The key is choosing the ingestion path that matches your compute and operational model.
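As a flavor of what "optimize at the edge" looks like in practice, ingestion-time transformations in Azure Monitor are written in KQL against the incoming stream. The sketch below keeps errors and security-relevant events and drops routine informational noise; the column names are placeholders for whatever schema your source actually emits:

// Hedged example transform; column names are illustrative
source
| where SeverityLevel in ("Warning", "Error", "Critical")
    or EventType == "SecurityAlert"

A filter of this shape runs before data is stored, so the dropped rows never contribute to ingestion cost.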
The Bigger Picture

Built on OpenTelemetry components, Azure Monitor Pipeline aligns Azure's observability strategy with open standards, improving portability and ecosystem compatibility. For architects managing telemetry at enterprise scale, it provides a robust, secure, and cost-aware foundation, solving the hardest part of observability before data ever reaches the cloud.

Ingest at Scale, Securely — Azure Monitor pipeline Is Now Generally Available
Today, we're thrilled to announce the general availability of Azure Monitor pipeline — a telemetry pipeline built for secure, high-scale ingestion across any environment. But the best way to understand what makes it powerful isn't to start with features. It's to start with the problems that kept showing up, over and over, in our conversations with customers. So, let's dig in...

Chances are, this sounds a lot like your environment

Imagine a large enterprise rolling out Microsoft Sentinel as their SIEM. They have sites across regions, a mix of on-premises and cloud environments, and security telemetry streaming in from firewalls, network devices, and Linux servers: 100,000 to 1 million events per second in some locations. Traditional forwarders buckle under the load, drop events during network blips, and ship everything, signal and noise alike, straight into Sentinel. The result: skyrocketing ingestion costs, degraded detections, and a brittle forwarding infrastructure that demands constant babysitting.

If you're managing environments like these, these questions are probably top of mind:
- How do I securely ingest telemetry without opening hundreds of risky endpoints?
- How do I reduce ingestion costs when telemetry spikes across thousands of sources simultaneously?
- How do I centrally standardize logs across sites and device types before they ever reach Azure?
- What happens to telemetry from an entire location when connectivity drops?
- And how do I do all of this consistently, at massive scale, and centrally across environments instead of configuring each host individually?

These aren't edge cases. For many teams, getting data into the system is itself the hardest part of observability, and by the time telemetry reaches Azure Monitor or Sentinel, it's already too late to fix these problems. Customers need control before the data hits the cloud.

What is Azure Monitor pipeline (and why it's different)?

Azure Monitor pipeline provides a centralized control point for telemetry ingestion and transformation, designed specifically for secure, high-throughput, enterprise-scale scenarios. It's built on open-source technologies from the OpenTelemetry ecosystem and includes the components needed to receive telemetry from local clients, process that telemetry, and forward it to Azure Monitor.

It's not another agent. And no, you do not need to install it on all your resources. Agents such as Azure Monitor agent are great for collecting telemetry from individual machines and services.
Azure Monitor pipeline solves a different problem: "How do I ingest telemetry from across my environment through a centralized pipeline – instead of configuring each host – while maintaining control over reliability, security, and ingestion cost?"

With Azure Monitor pipeline control, you can:
- Ensure logs land directly in Azure-native schemas – automatic schematization into tables such as Syslog and CommonSecurityLog
- Prevent data loss during intermittent connectivity across sites – local buffering in persistent storage with automated backfill
- Reduce ingestion costs before data reaches the cloud – centralized filtering, aggregation, and transformation
- Ingest telemetry at sustained high volumes in the range of hundreds of thousands of events per second – horizontally scalable pipeline architecture
- Secure telemetry ingestion without managing certificates on each host individually – centralized TLS/mTLS with automated certificate provisioning and zero-downtime rotation
- Maintain visibility into ingestion infrastructure health – pipeline performance and health monitoring
- Plan deployments confidently at large scale – infrastructure sizing guidance for expected telemetry volume

And all of this is fully supported and production-ready in GA. Learn more.

So, let's talk a little bit about these in detail!

Tired of broken detections because logs don't match your table schema?
Automatic schematization (a customer favorite!)

A consistent theme from preview customers was how painful it is to deal with log formats. Azure Monitor pipeline is the only solution that automatically shapes and schematizes data, so it lands directly in standard Azure tables such as Syslog and CommonSecurityLog. Learn more. That means:
- No custom parsing pipelines downstream
- No broken detections due to schema drift
- Faster time to value for security teams

This happens before data reaches the cloud, right where it matters most.

What happens to my telemetry when the network goes down?
Local buffering in persistent storage and automated backfill

Networks fail. Maintenance happens. Sites go offline. Azure Monitor pipeline is built for this reality. It buffers telemetry locally in your configured persistent storage during network interruptions and automatically backfills data when connectivity is restored. Learn more. The result:
- No gaps in security visibility
- No manual replays
- Confidence that critical telemetry isn't lost

How do I reduce ingestion costs without sacrificing signal quality?
Filter and aggregate at the edge

Nobody likes to pay for data they do not need. With Azure Monitor pipeline, customers can filter, aggregate, and shape the telemetry at the edge, sending only high-value data to Azure. Learn more. This helps teams:
- Reduce ingestion costs
- Improve detection quality
- Keep cloud analytics focused on signal, not volume

Cost optimization and signal quality are no longer trade-offs; you get both.

How do I keep up when telemetry volumes spike to hundreds of thousands of events per second?
Scaling

One of the biggest pain points we hear is scale. Azure Monitor pipeline is designed for sustained high-throughput ingestion, scaling horizontally and vertically to handle hundreds of thousands to millions of events per second. Learn more. This isn't about theoretical limits; it's about handling the real-world extremes that break traditional forwarders.
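Before moving on to security, it helps to picture what pointing sources at the pipeline looks like in practice. On a Linux host, for example, forwarding syslog to a pipeline ingestion endpoint can be a one-line rsyslog change; the hostname and port below are placeholders, and a production setup would also use the TLS options covered in the next section:

# /etc/rsyslog.d/90-forward-to-pipeline.conf
# Forward all facilities over TCP to the regional Azure Monitor pipeline endpoint (placeholder host/port)
*.* @@pipeline.contoso.internal:514

The host only needs network reachability to the pipeline endpoint; buffering, transformation, and delivery to Azure Monitor are handled by the pipeline.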
How do I send telemetry in a secure manner?
Secure ingestion with TLS and mTLS

Security teams consistently tell us that plain TCP ingestion just isn't acceptable, especially in regulated environments. Azure Monitor pipeline addresses this head-on by providing TLS-secured ingestion endpoints with mutual authentication, ensuring telemetry is encrypted in transit and accepted only from trusted sources. Learn more. The result:
- Secure ingestion at the boundary by encrypting data in transit using TLS, with automated certificate provisioning and zero-downtime rotation.
- Clients and Azure Monitor pipeline endpoints both validate each other before ingestion by enabling mutual authentication with mTLS, and it's easy to set up with our default experience.
- Do you have your own PKI and certificate management systems? Feel free to bring your own certificates to enable secure ingestion.

If the pipeline is this critical, how do I know it's healthy?

One thing we heard loud and clear during preview: "If this pipeline is critical, I need to see how it's doing." Azure Monitor pipeline now exposes health and performance signals, so it's no longer a black box. Learn more. Customers can answer questions like:
- Is my pipeline receiving, processing, and sending telemetry?
- What's the CPU and memory usage of each pipeline instance?
- Why is a pipeline unhealthy, or down?

Observability for observability felt like the right bar to meet.

How do I plan infrastructure without over- or under-provisioning?

Planning pipeline infrastructure shouldn't be a guessing game, and we heard this loud and clear during preview. GA includes clear sizing guidance to help you plan the right infrastructure based on your expected telemetry volume and workload characteristics. Not rigid formulas, but practical starting points that give you a confident baseline so you can design intentionally, deploy faster, and avoid costly over- or under-provisioning. Learn more.

Alright, these are a bunch of exciting features. How much do I need to pay for them?

Azure Monitor pipeline is included at no additional cost for ingesting telemetry into Azure Monitor and Microsoft Sentinel.

With general availability, Azure Monitor pipeline is production-ready, so you can run the most demanding ingestion scenarios with confidence. If you're already using it in preview, welcome to GA. If you're just getting started, there's never been a better time to dive in. As always, your feedback is what drives this forward. Drop a comment below, reach out directly, or share what you're building. We'd love to hear from you.

🆕 Microsoft Q&A Platform Update
We're excited to introduce a new capability on Microsoft Q&A that provides contributors with greater visibility into their activity and impact.

📊 Contributor Analytics Dashboard

The Contributor Analytics Dashboard is now available on Microsoft Q&A within Microsoft Learn, offering a self-serve experience to track contributions and engagement. This new experience was built to improve transparency and make it easier for contributors to understand the value of their participation across the platform.

🔎 How to access
- Click your Learn profile avatar
- Select Analytics

📈 What you can track

The dashboard surfaces key metrics to help you measure your impact:
- Number of answers provided
- Acceptance rate
- Views and votes
- Activity trends over time
- Content engagement signals

These insights allow contributors to better understand performance and track progress over time.

✅ Who can use it

The dashboard is currently available to contributors who have answered at least one question on Microsoft Q&A.

💡 Why this matters

This update brings:
- Increased visibility into your contributions
- A centralized view of your activity (no more manual tracking)
- Better understanding of your impact across the community
- Stronger recognition and motivation for continued contributions

🔄 What's next

The experience will continue to evolve, with additional enhancements planned based on contributor feedback.

🙌 For Product Champions

This dashboard is especially valuable as you:
- Track your growth within the program
- Measure impact for recognition and rewards
- Monitor your contribution consistency across product tracks

As our program grows, transparency like this helps ensure that quantity transforms into quality and that your contributions are clearly recognized.